Shared Flashcard Set

Details

Statistical Program R
Commands for R
67
Psychology
Undergraduate 3
09/24/2014


 


 

Cards

Term
install.packages("NAME")
Definition

 

How to install packages when you know the name

Term

 

 

require(PACKAGE.NAME)

Definition
command to use a package after it is installed
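
As a rough sketch of the install-then-load pattern on these two cards, the snippet below installs a package only when it is missing and then loads it. The package name "foreign" is just an example, and the requireNamespace check is an addition that does not appear on the cards.

```r
# Example package name (an assumption); substitute the package you need
pkg <- "foreign"

# Install only when the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# require() loads the package and returns TRUE on success
loaded <- require(pkg, character.only = TRUE)
```

Typing require(foreign) directly at the prompt does the same thing as the last line; character.only=TRUE is only needed because the name is stored in a variable here.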
Term

 

 

 epicalc

Definition

 

 

Package used to determine reliability of groups of items

Term

 

 

foreign

Definition

 

 

 

package to read data entered in other statistical packages, including SPSS

Term

 

 

lmtest

 

Definition

 

 

package of diagnostic tests for linear regression analyses

Term

 

 

nortest

Definition

 

 

package used to test data for normal distributions.

Term

 

 

outliers

Definition

 

 

package used to test for the presence of outliers in a dataset

Term

 

 

PASWR

Definition

 

 

package used to conduct additional statistical analyses not in the base package (e.g., z-tests).

Term

citation()

Definition

 

 

command used to get citation information for R, including its developers

Term

citation("NAME")

Definition

 

used to get information about the developer of a package

Term

GUI preferences

Definition

 

found at the bottom of the Edit menu; used to configure the GUI (graphical user interface)

Term

 

 

What not to do when creating a variable

Definition

 

Do not start your variable name with a number

Do not name your variable something that is a command in R

Don't forget variable names are case sensitive

Term

 

 

what to do when creating a variable

Definition
Make the names something you can remember and that has meaning
Term

 

 

 An option for entering data in R

Definition

The syntax to create a variable 

name=c(x1,x2,x3,...)

 

After you enter your syntax, press Enter

If you have entered the syntax properly, a > prompt will appear

To be sure you have entered your data properly, simply type the name of the variable

(the output will look like this)  [1] x1 x2 x3

 

 

Term

 

 

Another option for entering data is scan()

Definition

 name=scan() 

if you use space the data will look like this

1: 4 1 7 7 9

if you use enter the data will look like this

1: 4

2: 1

3: 7

4: 7

 

Term

 

 

name=c(name, x1,x2,x3)

Definition

 

 

command you use to enter additional data after you've created your original variable.
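
A minimal sketch of creating a variable with c() and then adding values to it. The variable name scores and the numbers are made up for illustration.

```r
# Create a variable from individual values
scores <- c(4, 1, 7)

# Add more values by combining the existing variable with the new data
scores <- c(scores, 7, 9)

# Typing the name prints the contents so you can check the entries
scores
```

The cards use = for assignment; <- is the more common R convention and behaves the same way here.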

Term

 

 

 

NA

Definition

 

 

what you enter if someone is missing a value for a particular measure in your data set

Term

 

 

.txt or .csv

Definition

 

 

file extension used for files you want to use in R

 

Save as Tab Delimited Text (.txt)

or

Comma Separated Values (.csv)

Term

 

 

to import data to R

 

Definition

 

First 

 

use the File menu option  > Change Dir 

Term

 

 

Before importing data

Definition

Create a working file where your data is located. It is similar to creating a variable. Name it something simple like x or one


Term

 

 

to import a file

Definition

 

x=read.table(file="FILENAME.txt",header=T)

x is the name of your working file

read.table tells R you want to import a data file from another program

file="FILENAME.txt" change the FILENAME.txt to your file name.

header=T indicates that the first row of the table contains the variable names rather than data

Term

 

 

x=read.table(file="FILENAME.csv",header=T,sep=",")

Definition

To import a spreadsheet saved as comma-separated values, the syntax is similar, with two exceptions:

1 the file name now ends in .csv rather than .txt 

and

2 the sep="," syntax at the end. This tells R that the file you stored your data in uses commas to indicate separate values of data

Again, x is the name of your working file; replace FILENAME with your file's name
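
The import syntax can be tried without a real data file. This sketch writes a small temporary file first, so the file paths, column names, and values are all made up for illustration.

```r
# Write a small example file so the sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,82", "2,75", "3,91"), csv_path)

# header=TRUE: the first row holds the variable names
# sep=",": values are separated by commas
x <- read.table(file = csv_path, header = TRUE, sep = ",")

# A tab-delimited .txt file is read the same way, here with sep="\t"
txt_path <- tempfile(fileext = ".txt")
write.table(x, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
y <- read.table(file = txt_path, header = TRUE, sep = "\t")

x
```

With a real file in the working directory, the file name in quotes takes the place of csv_path or txt_path.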

Term

 

 

>attach(NAME)

Definition

 

 

attach tells R to make the variables in a dataset (for example, one imported from another program) available by name, as if the data had been entered directly into R.

Term

 

 

t.test

Definition

 

To run a t-test if we attach our data, the syntax is

t.test(variable1~variable2,var.equal=T)

If we do not attach the data, the syntax is

t.test(x[['variable1']]~x[['variable2']],var.equal=T)

By using the attach command we can use the first t.test syntax, because R knows where the data is located
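
A small sketch comparing the two forms of the t-test call. The data frame x and its columns score and group are made-up examples, not part of the original cards.

```r
# Made-up data: a score for each person and a two-level grouping variable
x <- data.frame(
  score = c(12, 15, 11, 14, 18, 20, 17, 21),
  group = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Without attach: point R at the data frame for each variable
res_indexed <- t.test(x[["score"]] ~ x[["group"]], var.equal = TRUE)

# With attach: the column names can be used directly
attach(x)
res_attached <- t.test(score ~ group, var.equal = TRUE)
detach(x)  # remove the data frame from the search path when done

res_attached$p.value
```

Both calls run the same test on the same numbers, so they report the same t statistic and p-value.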

Term

 

data.entry(NAME)

 

 

 

 

Definition

How to edit data that has been directly entered into R


data.entry- opens a spreadsheet containing the data stored

(NAME) - specifies which variable you want to edit

 

Term

 

 

fix(NAME)

Definition

 

opens up a text file with your data if you want to change an entry, but only one variable can be changed using the fix command

 

If you realize you have made a mistake entering your data in a spreadsheet from another program, it is strongly recommended that you edit the spreadsheet and reimport the data into R

Term

 

 

car

Definition

 

 

package required to recode data

Term

 

 

Recoding data

Definition

 

 

Is used when you need to use reverse scoring.

Term

 

 

require(car)

Definition

 

 

command to load car package for use

Term

 

 

NEWVARNAME=recode(OLDVARNAME,"recodes")

Definition

NEWVARNAME - creates new variable that contains the reversed scores

recode - tells R that you want to create different values using information contained in parentheses

OLDVARNAME - the actual name of your variable you are reversing

"recodes" - contains the information on how to recode your original data in OLDVARNAME

Term

 

 

NEWVARNAME=recode(OLDVARNAME,"0='3';1='2';2='1';3='0'")

Definition

 

 

The recode syntax is contained inside one set of double quotation marks (""). Each pair of values is separated by a semicolon (;). Within a pair, the original number is followed by an equal sign (=), and the new value is enclosed in single quote marks (').
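
A hedged sketch of reverse scoring on a 0 to 3 scale. The responses are made up, and the base-R line (3 minus the old score) is an added cross-check, not something from the cards. The recode call runs only if the car package happens to be installed.

```r
# Made-up responses on a 0-3 scale
item <- c(0, 1, 2, 3, 3, 0)

# Base-R equivalent for a 0-3 scale: new score = 3 - old score
reversed <- 3 - item

# The same recode with the car package, run only if car is installed
if (requireNamespace("car", quietly = TRUE)) {
  reversed_car <- car::recode(item, "0='3';1='2';2='1';3='0'")
  print(reversed_car)
}

reversed
```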

Term

 

 

Categorical Variables

Definition

 

 

for instance depressed and non depressed

 

 

Term

 

 

total=c(x1+x2+x3+x4 . . . x20)

Definition

 

to create a variable that totals scores

Term

 

 

 

lo

hi

else

Definition

for categorical variables

lo tells R to start the recoding range at the lowest observed score; hi ends a range at the highest observed score; else covers any value not otherwise listed

for instance

categoryname=recode(totalname,"lo:16='0';17:hi='1'")

this will transform all scores in totalname into 0's and 1's and store them in categoryname

 

Term

 

lo

hi

else

Definition

categoryname=recode(totalname,"lo:16='Not Depressed';17:hi='Depressed'")

 

follows the same format as reverse scoring. 
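
For comparison, here is a base-R sketch of the same cutoff that uses ifelse in place of recode (a swapped-in technique, not the card's method). The item names x1 to x3, the scores, and the use of only three items are assumptions for illustration, and the scores are assumed to be whole numbers.

```r
# Made-up item scores for four people (rows) on three items (columns)
items <- data.frame(
  x1 = c(3, 8, 5, 9),
  x2 = c(4, 6, 5, 7),
  x3 = c(2, 9, 6, 8)
)

# Total score per person
total <- items$x1 + items$x2 + items$x3

# Totals of 16 or lower become "Not Depressed", 17 or higher "Depressed"
category <- ifelse(total <= 16, "Not Depressed", "Depressed")

# Count how many people fall in each category
table(category)
```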

Term

 

 

name=scan()

Definition

 

name - is what you want to call your variable.

 

Use this when you want to enter data without having to enter commas to separate values.

Term

 

 

 

hist(NAME)

Definition

 

 

command to make a histogram

Term

 

 

brk=c(50,55,60,65,70,75,80,85,90,95,100)

Definition

1. To create a variable that contains each of the values you want on the x-axis.

2. To save time you could use the command

brk=c(seq(50,100,5))

The seq function tells R to create a sequence of numbers using the information in the ( ). The 1st and 2nd numbers are the min and max values, and the final number tells R how far apart you want the values in the sequence to be.

 

 

Term

 

 

 

hist(test.pct,breaks=brk)

Definition

 

Command you can use once you create your break variable.

Term

 

 

hist(test.pct,breaks=brk)

Definition

 

Once you create your break variable, this is the syntax you enter to tell R that you want a different set of values on the x-axis; it also tells R where to find these values.
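
A short sketch of a histogram with custom break points. The values stored in test.pct are made up; the name itself comes from the cards.

```r
# Made-up test percentages between 50 and 100
test.pct <- c(52, 58, 63, 67, 71, 74, 78, 81, 85, 88, 92, 97)

# Break points from 50 to 100 in steps of 5
brk <- seq(50, 100, 5)

# Draw the histogram with the custom break points and keep the result
h <- hist(test.pct, breaks = brk)

# Number of scores that fell in each interval
h$counts
```

Every data value has to fall inside the range covered by the breaks, otherwise hist stops with an error.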

Term

 

 

freq

Definition

 

 

argument used to change what is reported on the y-axis

Term

 

 

freq=T

Definition

 

 

Tells R that you want the absolute frequencies listed on the y-axis

 

 

Term

 

 

freq=F

Definition

 

 

This tells R you want the density (not the frequency)

Term

 

 

hist(test.pct,breaks=brk,freq=T)

Definition

 

 

syntax to put absolute frequencies on the y-axis

Term

 

 

hist(test.pct,breaks=brk,freq=F)

Definition

 

 

syntax used to tell R that you want the density on the

y-axis
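
To see what the freq argument changes, this sketch draws the same made-up data both ways and inspects the stored results. The data values are illustrative assumptions.

```r
# Made-up test percentages and break points
test.pct <- c(52, 58, 63, 67, 71, 74, 78, 81, 85, 88, 92, 97)
brk <- seq(50, 100, 5)

# freq=TRUE: bar heights are counts of scores in each interval
h_freq <- hist(test.pct, breaks = brk, freq = TRUE)

# freq=FALSE: bar heights are densities, so the bar areas sum to 1
h_dens <- hist(test.pct, breaks = brk, freq = FALSE)

# Total area under the density histogram
sum(h_dens$density * diff(h_dens$breaks))
```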

Term

 

 

boxplot(test.pct)

Definition

 

 

to create a boxplot simply use the command boxplot followed by the name of the variable containing the data we wish to look at. 

Term

 

 

qqnorm(test.pct)

 

Definition

 

 

 

Another option to check the normality of your data is the Q-Q plot. This plot involves plotting theoretical quantiles on the x-axis against observed (sample) quantiles on the y-axis (hence Q-Q)

Term

 

 

plot(Y~X)

Definition

 

plot -  tells R to create a scatterplot

 

Y - is the outcome variable (replace Y with name of the variable where your outcome data is stored)

X - is the predictor variable (replace X with the name of the variable where your predictor is stored)

Term

 

 

 

summary(NAME)

Definition

This option will provide a wide range of information 

It will provide you with the

 

Min and Max scores 

Mean and Median

1stQu and 3rdQu (1st and 3rd Quartiles) 

Term

 

 

 

mean(NAME)

Definition

 

 

tells R to report the mean (average) score for a given variable

 

(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable)

Term

 

 

median(NAME)

Definition

 

 

tells R to report the middle score for a given variable

 

(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable)

Term

 

 

var(NAME)

Definition

 

 

 

this is the syntax used to produce information about the variance

(This is information that is not included in the output produced by the summary option)

Term

 

 

sd(NAME)

Definition

 

 

this is the syntax used to produce the information about the standard deviation

 

(This is information that is not included in the output produced by the summary option)

 

Term

 

 

na.rm=T

Definition

argument used to handle missing data points; used in conjunction with descriptive commands, for example

 

mean(NAME,na.rm=T)

or

sd(NAME,na.rm=T)
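
A brief sketch of the descriptive commands on a variable with a missing value. The variable name scores and its values are made up for illustration.

```r
# Made-up scores with one missing value entered as NA
scores <- c(4, 1, 7, 7, 9, NA)

mean(scores)                 # NA, because one value is missing
mean(scores, na.rm = TRUE)   # mean of the five observed values
median(scores, na.rm = TRUE) # middle score of the observed values
var(scores, na.rm = TRUE)    # variance of the observed values
sd(scores, na.rm = TRUE)     # standard deviation of the observed values

summary(scores)              # summary counts the NA's separately
```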

Term

 

 

tapply(NAME,group.var,desc.stat)

Definition

tapply - tells R that you want information from a variable broken into groups

NAME - the name of the variable you want descriptive information on (replace with actual name of the variable where the data is stored)

group.var - the name of the variable containing the group information(replace with actual name...)

desc.stat - the specific descriptive statistic you want reported(replace with actual name...)
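
As an illustration of the three arguments, this sketch reports the mean and standard deviation of a score for each group. The variables score and sex and their values are made up.

```r
# Made-up scores and a grouping variable of the same length
score <- c(10, 12, 14, 20, 22, 24)
sex   <- c("f", "f", "f", "m", "m", "m")

# Mean score within each group
group_means <- tapply(score, sex, mean)

# Standard deviation within each group
group_sds <- tapply(score, sex, sd)

group_means
group_sds
```

The third argument is the name of the function itself (mean, sd, median, and so on), written without parentheses.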

Term

 

 

F or f

Definition

 

When you create a factor for a variable, it is recommended that you start its name with F, as in FNAME, or f, as in fsex.

Term

 

 

 

FNAME=factor(NAME,labels=c("label1","label2"))

Definition

FNAME - is the factor name for a given numeric variable (replace with actual factor name)

factor - tells R you want to create a factor for a variable

NAME - specifies where the numeric data is(are) from (replace with actual numerical variable name)

labels=c - tells R that you are defining the labels for each numerical value.

"label1","label2" - labels for each numerical value (replace with your desired labels.)

 

Term

 

 

 

cor(x,y)

Definition

 

 

The syntax to calculate Pearson's r; the cor command is found in the stats package

cor - tells R to report the correlation coefficient between 2 numeric variables

x - is the predictor variable (use actual name of your predictor variable)

y - is the outcome variable (use actual name of your outcome variable)

Term

 

 

cor.test(x,y)

Definition

 

 

to find t-, p-value and additional information
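
A minimal sketch of both correlation commands on made-up data. The variable names hours and grade and their values are assumptions for illustration.

```r
# Made-up predictor (hours studied) and outcome (grade)
hours <- c(1, 2, 3, 4, 5, 6, 7, 8)
grade <- c(55, 60, 58, 70, 72, 80, 79, 90)

# Pearson's r only
r <- cor(hours, grade)

# Pearson's r plus the t statistic, degrees of freedom, and p-value
result <- cor.test(hours, grade)

r
result$statistic  # t value
result$parameter  # degrees of freedom
result$p.value
```

Printing result on its own shows all of these values, along with a confidence interval, in one report.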

Term

 

 

lm(y~x)

Definition

This is the syntax used to fit a linear regression

The main difference from the correlation syntax is that cor.test is replaced with lm

lm - tells R that we are interested in constructing a linear model for our data.

Although x and y have the same meaning in this syntax, they are entered in reverse order and separated by a ~ (tilde)

 

lm command is located in the stats package. 

Term

 

 

 

summary(lm(y~x))

Definition

Used to get a more complete report of your analysis

 

At the bottom of output will be

 

F-statistic: x.xxx on X and X DF, p-value:  z.zzzz


x, X, and z stand for the actual numbers in your output
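
A hedged sketch of fitting a regression and reading the F-statistic line of the report. The data are made up, and recomputing the p-value with pf is an added illustration of where the printed number comes from.

```r
# Made-up predictor and outcome values
x <- 1:8
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

# Fit the linear model and build the fuller report
model <- lm(y ~ x)
report <- summary(model)

# Intercept and slope
coef(model)

# The F statistic with its two degrees of freedom
f <- report$fstatistic

# p-value for the overall model, as printed on the last line of the report
p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
p_value
```

Typing report at the prompt prints the full summary, including the F-statistic line described on the card.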

Term

 

 

p-value

Definition

 

Remember: if your p-value is lower than .05, the model is statistically significant, meaning it predicts the outcome better than would be expected by chance.

 

If it is higher than .05, you fail to reject the null hypothesis
