Shared Flashcard Set

Details

Title

Statistical Program R

Description

Commands for R

Total Cards

Subject

Psychology

Level

Undergraduate 3

Created

09/24/2014

Term

install.packages("NAME")

Definition

How to install packages when you know the name

Term

require(PACKAGE.NAME)

Definition

command to use a package after it is installed

Term

epicalc

Definition

Package used to determine reliability of groups of items

Term

foreign

Definition

package to read data entered in other statistical packages, including SPSS

Term

lmtest

Definition

package of diagnostic tests for linear regression models

Term

nortest

Definition

package used to test data for normal distributions.

Term

outliers

Definition

package used to test for the presence of outliers in a dataset

Term

PASWR

Definition

package used to conduct additional statistical analyses not in the base package (e.g., z-tests).

Term

citation()

Definition

command used to get information about the developers of R and how to cite it

Term

citation("NAME")

Definition

used to get information about the developer of a package

Term

GUI preferences

Definition

at the bottom of the Edit menu, used to configure the GUI (graphical user interface)

Term

What not to do when creating a variable

Definition

Do not start your variable name with a number

Do not name your variable something that is a command in R

Don't forget variable names are case sensitive

Term

what to do when creating a variable

Definition

Make the names something you can remember and that has meaning

Term

An option for entering data in R

Definition

The syntax to create a variable

name=c(x1,x2,x3,...)

After you enter your syntax, press Enter

If you have entered the syntax properly a > sign will appear

to be sure you have entered your data properly simply type the name of the variable

(it will look like this) [1] x1 x2 x3
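A minimal sketch of the steps above, assuming a hypothetical variable called scores and made-up values:

```r
# Create a variable holding five scores (values are made up for illustration)
scores = c(4, 1, 7, 7, 9)

# Typing the name of the variable prints the stored values
scores

# length() reports how many values were stored
length(scores)
```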

Term

Another option for entering data is scan()

Definition

name=scan()

if you use space the data will look like this

1: 4 1 7 7 9

if you use enter the data will look like this

1: 4

2: 1

3: 7

4: 7
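scan() normally waits for values typed at the console. As a rough sketch that can be run as a script, the example below passes the same space-separated values through scan's text argument instead; the variable name and values are only illustrative.

```r
# The text argument supplies the input non-interactively;
# typed at the console, the same values would follow the 1: prompt
name = scan(text = "4 1 7 7 9")

# Check what was stored
name
```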

Term

name=c(name, x1,x2,x3)

Definition

command you use to enter additional data after you've created your original variable.
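For example, assuming a hypothetical variable called scores that already holds three made-up values, additional data could be appended like this:

```r
# Original variable (made-up values)
scores = c(4, 1, 7)

# Append two more values by including the variable's own name first
scores = c(scores, 7, 9)

# The variable now holds all five values
scores
```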

Term

NA

Definition

what you enter if someone is missing a value for a particular measure in your data set

Term

.txt or .csv

Definition

file extension used for files you want to use in R

Save as Tab Delimited Text (.txt)

or

Comma Separated Values (.csv)

Term

to import data to R

Definition

First, point R at the folder that holds your data file:

use the File menu option > Change Dir

Term

Before importing data

Definition

Create a working file to hold your imported data. It is similar to creating a variable. Name it something simple like x or one

Term

to import a file

Definition

x=read.table(file="FILENAME.txt",header=T)

x is the name of your working file

read.table tells R you want to import a data file from another program

file="FILENAME.txt" change the FILENAME.txt to your file name.

header=T indicates that the first row of the table holds the variable names (a header row), not data

Term

x=read.table(file="FILENAME.csv",header=T,sep=",")

Definition

To import a spreadsheet saved as Comma Separated Values, the syntax is similar with two differences:

1 the file name now ends in .csv rather than .txt

and

2 the sep="," syntax at the end. This tells R that the file you stored your data in uses commas to indicate separate values of data

Again x is the name of your working file, and FILENAME is replaced with your file's name
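As an illustrative sketch, the example below first writes a tiny comma-separated file to a temporary location so that the import can be run without any existing data. The column names (id, score) and values are made up.

```r
# Write a small comma-separated file to a temporary path (made-up data)
path = tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10", "2,12", "3,9"), path)

# header=T: the first row holds the variable names
# sep=",": values within a row are separated by commas
x = read.table(file = path, header = T, sep = ",")

# Inspect the imported working file
x
```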

Term

>attach(NAME)

Definition

attach tells R you want to use a dataset entered in another program as if the data were entered directly into R.

Term

t.test

Definition

to run a t-test, if we attach our data the syntax is

t.test(variable1~variable2,var.equal=T)

if we do not attach the data the syntax is

t.test(x[['variable1']]~x[['variable2']],var.equal=T)

by using the attach command we can use the 1st t.test syntax because R knows where the data is located
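A small sketch of the unattached form, assuming a hypothetical working file x with made-up columns score and group. It also shows t.test's data argument, which is another way to tell R where the variables live without attaching.

```r
# Made-up data: four scores in each of two groups
x = data.frame(
  score = c(10, 12, 9, 11, 15, 17, 14, 16),
  group = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Without attach: name the working file for both variables
result = t.test(x[["score"]] ~ x[["group"]], var.equal = T)
result

# Alternative: keep the short formula and pass the working file as data
result2 = t.test(score ~ group, data = x, var.equal = T)
```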

Term

data.entry(NAME)

Definition

How to edit data that has been directly entered into R

data.entry- opens a spreadsheet containing the data stored

(NAME) - specifies which variable you want to edit

Term

fix(NAME)

Definition

opens up a text file with your data if you want to change an entry, but only one variable can be changed using the fix command

if you realize you have made a mistake entering your data into another spreadsheet, it is strongly recommended you edit the spreadsheet and reimport the data into R

Term

car

Definition

package required to recode data

Term

Recoding data

Definition

Is used when you need to use reverse scoring.

Term

require(car)

Definition

command to load car package for use

Term

NEWVARNAME=recode(OLDVARNAME,"recodes")

Definition

NEWVARNAME - creates new variable that contains the reversed scores

recode - tells R that you want to create different values using information contained in parentheses

OLDVARNAME - the actual name of your variable you are reversing

"recodes" - contains the information on how to recode your original data in OLDVARNAME

Term

NEWVARNAME=recode(OLDVARNAME,"0='3';1='2';2='1';3='0'")

Definition

the recode syntax is contained inside one set of double quotation marks (""); each pair of values is separated by a semicolon (;); within a pair, the original number is followed by an equal sign (=) and then the new value, enclosed in single quote marks (')
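As a hedged illustration of reverse scoring on a 0 to 3 scale, the sketch below uses made-up responses in a hypothetical variable called item1. It shows a base R shortcut (subtracting from the scale maximum) and runs the recode call from this card only if the car package happens to be installed.

```r
# Made-up responses on a 0-3 scale (hypothetical variable name)
item1 = c(0, 1, 2, 3, 3, 0)

# Base R shortcut: subtracting from the scale maximum reverses the scores
item1.rev = 3 - item1
item1.rev

# The recode() version from the card, run only when car is installed
if (requireNamespace("car", quietly = TRUE)) {
  item1.rev2 = car::recode(item1, "0='3';1='2';2='1';3='0'")
  print(item1.rev2)
}
```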

Term

Categorical Variables

Definition

for instance depressed and non depressed

Term

total=c(x1+x2+x3+x4 . . . x20)

Definition

to create a variable that totals scores

Term

lo

hi

else

Definition

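for a categorical variable

lo - tells R to start the recoding range at the lowest observed score

hi - tells R to end the recoding range at the highest observed score

else - covers any value not matched by the other recodes

for instance

categoryname=recode(totalname,"lo:16='0';17:hi='1'")

this will transform all scores in totalname into 0's and 1's and store them in categoryname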

Term

lo

hi

else

Definition

categoryname=recode(totalname,"lo:16='Not Depressed';17:hi='Depressed'")

follows the same format as reverse scoring.
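For comparison, the same cut-off can be sketched in base R with ifelse, which is a different technique from car's recode and is shown here only because it needs no extra package. The totals are made up, and the cut-off of 16 is taken from the card.

```r
# Made-up total scores
totalname = c(5, 16, 17, 30, 12)

# Scores of 16 or lower become "Not Depressed"; 17 or higher become "Depressed"
categoryname = ifelse(totalname <= 16, "Not Depressed", "Depressed")
categoryname

# Count how many cases fall into each category
table(categoryname)
```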

Term

name=scan()

Definition

name - is what you want to call your variable.

Use it when you want to enter data without having to type commas to separate the values.

Term

hist(NAME)

Definition

command to make a histogram

Term

brk=c(50,55,60,65,70,75,80,85,90,95,100)

Definition

1. to create a variable that contains each of the values you want on the x-axis.

2. to save time you could use the command

brk=c(seq(50,100,5))

seq function tells R to create a sequence of numbers using the information in the ( ). 1st and 2nd numbers are min and max values and final number tells how far apart you want the sequence to be.
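A quick sketch confirming that the seq shortcut produces the same break values as typing them out:

```r
# Build the breaks from 50 to 100 in steps of 5
brk = seq(50, 100, 5)
brk

# The typed-out version gives the same values
brk.long = c(50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100)
identical(brk, brk.long)
```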

Term

hist(test.pct,breaks=brk)

Definition

Command you can use once you create your break variable.

Term

hist(test.pct,breaks=brk)

Definition

Once you create your break variable, this is the syntax you enter to tell R that you want a different set of values on the x-axis, and to tell R where to find these values.

Term

freq

Definition

option used to change what is reported on the y-axis

Term

freq=T

Definition

Tells R that you want the absolute frequencies listed on the y-axis

Term

freq=F

Definition

This tells R you want the density (not the frequency)

Term

hist(test.pct,breaks=brk,freq=T)

Definition

syntax to put absolute frequencies on the y-axis

Term

hist(test.pct,breaks=brk,freq=F)

Definition

syntax used to tell R that you want the density on the y-axis
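To see what the two y-axis options report, the sketch below uses made-up percentages in test.pct and asks hist for its numbers without drawing (plot = FALSE is used here only so the values can be inspected). The counts are what freq=T would plot and the densities are what freq=F would plot.

```r
# Made-up test percentages
test.pct = c(52, 58, 63, 67, 71, 74, 76, 79, 83, 88, 94)
brk = seq(50, 100, 5)

# plot = FALSE returns the histogram's numbers instead of drawing it
h = hist(test.pct, breaks = brk, plot = FALSE)

# Absolute frequencies per bin (the y-axis under freq=T)
h$counts

# Densities per bin (the y-axis under freq=F); bar areas sum to 1
h$density
```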

Term

boxplot(test.pct)

Definition

to create a boxplot simply use the command boxplot followed by the name of the variable containing the data we wish to look at.

Term

qqnorm(test.pct)

Definition

another option to check the normality of your data is the Q-Q plot. This plot involves plotting observed vs. theoretical quantiles on the x- and y-axes (hence Q-Q)

Term

plot(Y~X)

Definition

plot - tells R to create a scatterplot

Y - is the outcome variable (replace Y with name of the variable where your outcome data is stored)

X - is the predictor variable (replace X with the name of the variable where your predictor is stored)

Term

summary(NAME)

Definition

This option will provide a wide range of information

It will provide you with the

Min and Max scores

Mean and Median

1st Qu. and 3rd Qu. (1st and 3rd Quartiles)

Term

mean(NAME)

Definition

tells R to report the average (mean) score for a given variable

(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable)

Term

median(NAME)

Definition

tells R to report the middle score for a given variable

(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable)

Term

var(NAME)

Definition

this is the syntax used to produce information about the variance

(This is information that is not included in the output produced by the summary option)

Term

sd(NAME)

Definition

this is the syntax used to produce the information about the standard deviation

(This is information that is not included in the output produced by the summary option)
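The descriptive commands summary, mean, median, var, and sd can be tried together on a small made-up variable; the name scores and its values are assumptions for illustration.

```r
# Made-up scores
scores = c(4, 1, 7, 7, 9)

# Min, quartiles, median, mean, and max in one report
summary(scores)

# Individual statistics
mean(scores)
median(scores)

# Variance and standard deviation are not part of the summary output
var(scores)
sd(scores)
```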

Term

na.rm=T

Definition

option used to handle missing data points (it tells R to remove them before calculating); used in conjunction with commands such as

mean(NAME,na.rm=T)

or

sd(NAME,na.rm=T)
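A short sketch of the difference na.rm makes, assuming a made-up variable with one missing value entered as NA:

```r
# Made-up scores with one missing value
scores = c(4, 1, 7, NA, 9)

# Without na.rm the result is itself missing
mean(scores)

# With na.rm=T the missing value is dropped before calculating
mean(scores, na.rm = T)
sd(scores, na.rm = T)
```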

Term

tapply(NAME,group.var,desc.stat)

Definition

tapply - tells R that you want information from a variable broken into groups

NAME - the name of the variable you want descriptive information on (replace with actual name of the variable where the data is stored)

group.var - the name of the variable containing the group information (replace with actual name...)

desc.stat - the specific descriptive statistic you want reported (replace with actual name...)
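For instance, with made-up scores and a hypothetical grouping variable, the mean for each group could be requested like this:

```r
# Made-up scores and the group each score belongs to
score = c(10, 12, 9, 11, 15, 17, 14, 16)
group = c(1, 1, 1, 1, 2, 2, 2, 2)

# Report the mean of score separately for each value of group
means = tapply(score, group, mean)
means
```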

Term

F or f

Definition

it is recommended that whenever you create a factor for a variable, you start its name with F, as in FNAME, or f, as in fsex.

Term

FNAME=factor(NAME,labels=c("label1","label2"))

Definition

FNAME - is the factor name for a given numeric variable (replace with actual factor name)

factor - tells R you want to create a factor for a variable

NAME - specifies where the numeric data are from (replace with actual numerical variable name)

labels=c - tells R that you are defining the labels for each numerical value.

"label1","label2" - labels for each numerical value (replace with your desired labels.)

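A minimal sketch of creating a factor, assuming a hypothetical numeric variable sex coded 1 and 2 and made-up labels. The labels are assigned to the numeric values in ascending order.

```r
# Made-up numeric codes
sex = c(1, 2, 2, 1, 2)

# 1 receives the first label, 2 receives the second
fsex = factor(sex, labels = c("male", "female"))
fsex

# Count the cases under each label
table(fsex)
```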
Term

cor(x,y)

Definition

syntax to calculate Pearson's r; cor is found in the stats package

cor - tells R to report the correlation coefficient between 2 numeric variables

x - is the predictor variable (use actual name of your predictor variable)

y - is the outcome variable (use actual name of your outcome variable)

Term

cor.test(x,y)

Definition

used to find the t-value, p-value, and additional information about the correlation
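As an illustration with made-up predictor and outcome values, cor reports only the coefficient while cor.test adds the test statistics:

```r
# Made-up predictor (x) and outcome (y)
x = c(1, 2, 3, 4, 5, 6)
y = c(2, 4, 5, 4, 6, 7)

# Pearson's r only
r = cor(x, y)
r

# Full test: t-value, degrees of freedom, p-value, confidence interval
result = cor.test(x, y)
result
```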

Term

lm(y~x)

Definition

This is the syntax used to run a linear regression

The main difference from the correlation syntax is that cor.test is replaced with lm.

lm - tells R that we are interested in constructing a linear model for our data.

although x and y have the same meaning in this syntax, they are entered in reverse order and separated by a ~ (tilde)

lm command is located in the stats package.

Term

summary(lm(y~x))

Definition

Used to get a more complete report of your analysis

At the bottom of output will be

F-statistic: x.xxx on X and X DF, p-value: z.zzzz

x, X, and z will be actual numbers in your output
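A small sketch with made-up data showing where the pieces of the report can be found. The names fit and s are hypothetical and chosen only for this example.

```r
# Made-up predictor (x) and outcome (y)
x = c(1, 2, 3, 4, 5, 6)
y = c(2, 4, 5, 4, 6, 7)

# Fit the linear model, then ask for the complete report
fit = lm(y ~ x)
s = summary(fit)
s

# Intercept and slope
coef(fit)

# F value with its two degrees of freedom (the last line of the report)
s$fstatistic
```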

Term

p-value

Definition

Remember, if your p-value is lower than .05, the model is statistically significant (it predicts the outcome better than a model with no predictors).

If it is higher than .05, one fails to reject the null hypothesis

Flashcard Machine - create, study and share online flash cards
