Term
|
Definition
How to install packages when you know the name |
|
|
Term
|
Definition
command to use a package after it is installed |
|
|
Term
|
Definition
Package used to determine reliability of groups of items |
|
|
Term
|
Definition
package to read data entered in other statistical packages, including SPSS |
|
|
Term
|
Definition
package to conduct linear regression analyses |
|
|
Term
|
Definition
package used to test data for normal distributions. |
|
|
Term
|
Definition
package used to test for the presence of outliers in a dataset |
|
|
Term
|
Definition
package used to conduct additional statistical analyses not in the base package (e.g., z-tests). |
|
|
Term
|
Definition
package used to get information about the developer of R |
|
|
Term
|
Definition
used to get information about the developer of a package |
|
|
Term
|
Definition
at the bottom of the Edit menu, used to configure the GUI (graphical user interface) |
|
|
Term
What not to do when creating a variable |
|
Definition
Do not start your variable name with a number
Do not name your variable something that is a command in R
Don't forget variable names are case sensitive |
|
|
Term
what to do when creating a variable |
|
Definition
Make the names something you can remember and that has meaning |
|
|
Term
An option for entering data in R |
|
Definition
The syntax to create a variable is
name=c(x1,x2,x3,...)
After you enter your syntax, press Enter.
If you have entered the syntax properly, a > prompt will appear on a new line.
To check that you entered your data properly, type the name of the variable and press Enter.
The output will look like this: [1] x1 x2 x3
|
|
|
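A minimal sketch of the steps on this card. The variable name `scores` and its values are made up for illustration.

```r
# Create a variable with c(); `scores` and its values are hypothetical.
scores = c(4, 1, 7, 7, 9)

# Typing the name of the variable prints the stored values.
scores
# [1] 4 1 7 7 9

# Names are case sensitive: `Scores` would be a different, undefined variable.
length(scores)
```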
Term
Another option for entering data is scan() |
|
Definition
name=scan()
if you use space the data will look like this
1: 4 1 7 7 9
if you use enter the data will look like this
1: 4
2: 1
3: 7
4: 7
|
|
|
Term
|
Definition
command you use to enter additional data after you've created your original variable. |
|
|
Term
|
Definition
what you enter if someone is missing a value for a particular measure in your data set |
|
|
Term
|
Definition
file extensions used for files you want to use in R
Save as Tab Delimited Text (.txt)
or
Comma Separated Values (.csv) |
|
|
Term
|
Definition
First
use the File menu option > Change Dir |
|
|
Term
|
Definition
Create a working file where your data is located. It is similar to creating a variable. Name it something simple like x or one
|
|
|
Term
|
Definition
x=read.table(file="FILENAME.txt",header=T)
x is the name of your working file
read.table tells R you want to import a data file from another program
file="FILENAME.txt" change FILENAME.txt to your file name.
header=T indicates that the first row of the table contains the variable names rather than data |
|
|
Term
x=read.table(file="FILENAME.csv",header=T,sep=",") |
|
Definition
To import a spreadsheet saved as a .csv file, the syntax is similar to the .txt version, with two exceptions:
1 the file name now ends in .csv rather than .txt
and
2 the sep="," syntax at the end. This tells R that the file you stored your data in uses commas to separate values
Again, x is the name of your working file and FILENAME is replaced with your file's name |
|
|
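A minimal sketch of both import commands. The data set `demo` and the temporary file names are made up so the example can run without an existing file; in practice you would use your own file name.

```r
# Write a small made-up data set to temporary files so the example runs anywhere.
demo = data.frame(id = 1:3, score = c(10, 12, 9))
txt_file = tempfile(fileext = ".txt")
csv_file = tempfile(fileext = ".csv")
write.table(demo, txt_file, sep = "\t", row.names = FALSE, quote = FALSE)
write.csv(demo, csv_file, row.names = FALSE)

# header=TRUE: the first row holds the variable names, not data.
x = read.table(file = txt_file, header = TRUE)

# For a .csv file, sep="," tells R that commas separate the values.
y = read.table(file = csv_file, header = TRUE, sep = ",")

names(x)
names(y)
```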
Term
|
Definition
attach tells R you want to use a dataset entered in another program as if the data were entered directly into R. |
|
|
Term
|
Definition
to run a t-test, if we attach our data the syntax is
t.test(variable1~variable2,var.equal=T)
if we do not attach the data the syntax is
t.test(x[['variable1']]~x[['variable2']],var.equal=T)
by using the attach command we can use the 1st t.test syntax because R knows where the data is located |
|
|
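A sketch comparing the two forms of the t-test syntax. The data frame `x` and its columns `score` (outcome) and `group` (two groups) are hypothetical.

```r
# Hypothetical data: `score` is the outcome, `group` the two-level grouping variable.
x = data.frame(
  score = c(12, 15, 11, 14, 20, 22, 19, 23),
  group = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Without attach: tell R where each variable lives.
res1 = t.test(x[['score']] ~ x[['group']], var.equal = TRUE)

# With attach: the variable names can be used directly.
attach(x)
res2 = t.test(score ~ group, var.equal = TRUE)
detach(x)  # undo attach so the names do not linger on the search path

# Both calls analyze the same data, so they report the same test.
res1$statistic
res2$statistic
```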
Term
|
Definition
How to edit data that has been directly entered into R
data.entry - opens a spreadsheet containing the data stored
(NAME) - specifies which variable you want to edit
|
|
|
Term
|
Definition
opens up a text file with your data if you want to change an entry, but only one variable can be changed using the fix command
if you realize you have made a mistake entering your data into another spreadsheet, it is strongly recommended that you edit the spreadsheet and reimport the data into R |
|
|
Term
|
Definition
package required to recode data |
|
|
Term
|
Definition
Is used when you need to use reverse scoring. |
|
|
Term
|
Definition
command to load car package for use |
|
|
Term
NEWVARNAME=recode(OLDVARNAME,"recodes") |
|
Definition
NEWVARNAME - creates new variable that contains the reversed scores
recode - tells R that you want to create different values using information contained in parentheses
OLDVARNAME - the actual name of your variable you are reversing
"recodes" - contains the information on how to recode your original data in OLDVARNAME |
|
|
Term
NEWVARNAME=recode(OLDVARNAME,"0='3';1='2';2='1';3='0'") |
|
Definition
The recode syntax is contained inside one set of double quotation marks (""). Each pair of values is separated by a semicolon (;). Within a pair, the original number is followed by an equal sign (=), and the new value is enclosed in single quote marks ('').
|
|
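A sketch of reverse scoring on a hypothetical item scored 0 to 3. The base-R line (3 minus the original score) is not the card's method; it is included only as a check that runs without extra packages. The recode() call from the card is run only if the car package is installed.

```r
# Hypothetical item scored 0-3; reverse scoring maps 0->3, 1->2, 2->1, 3->0.
OLDVARNAME = c(0, 1, 2, 3, 3, 1)

# Base-R check (an alternative, not the card's method):
# on a 0-3 scale the reversed score is 3 minus the original.
expected = 3 - OLDVARNAME

# The recode() call from the card, run only if the car package is available.
if (requireNamespace("car", quietly = TRUE)) {
  NEWVARNAME = car::recode(OLDVARNAME, "0='3';1='2';2='1';3='0'")
  print(NEWVARNAME)
}

expected
```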
Term
|
Definition
for instance depressed and non depressed
|
|
|
Term
total=c(x1+x2+x3+x4 . . . x20) |
|
Definition
to create a variable that totals scores |
|
|
Term
|
Definition
for a categorical variable
tells R to start the recoding range at the lowest observed score
for instance
categoryname=recode(totalname,"lo:16='0';17:hi='1'")
this will transform all scores in totalname into 0's and 1's and store them in categoryname
|
|
|
Term
|
Definition
categoryname=recode(totalname,"lo:16='Not Depressed';17:hi='Depressed'")
follows the same format as reverse scoring. |
|
|
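A sketch of the same cutoff using base R's ifelse() instead of recode(), so it runs without the car package. This is a swapped-in technique, not the card's syntax, and the scores in `totalname` are made up.

```r
# Hypothetical total scores.
totalname = c(5, 16, 17, 30)

# Scores of 16 or lower are labelled "Not Depressed", 17 or higher "Depressed",
# mirroring the lo:16 and 17:hi ranges used with recode().
categoryname = ifelse(totalname <= 16, "Not Depressed", "Depressed")

categoryname
```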
Term
|
Definition
name - is what you want to call your variable.
Use it when you want to enter data without having to enter commas to separate values. |
|
|
Term
|
Definition
command to make a histogram |
|
|
Term
brk=c(50,55,60,65,70,75,80,85,90,95,100) |
|
Definition
1. to create a variable that contains each of the values you want on the x-axis.
2. to save time you could use the command
brk=c(seq(50,100,5))
The seq function tells R to create a sequence of numbers using the information in the ( ). The 1st and 2nd numbers are the min and max values, and the final number tells R how far apart the values in the sequence should be.
|
|
|
Term
hist(test.pct,breaks=brk) |
|
Definition
Command you can use once you create your break variable. |
|
|
Term
hist(test.pct,breaks=brk) |
|
Definition
Once you create your break variable, this is the syntax you enter to tell R that you want a different set of values on the x-axis and where to find these values. |
|
|
Term
|
Definition
to change the data reported on the y-axis |
|
|
Term
|
Definition
Tells R that you want the absolute frequencies listed on the y-axis
|
|
|
Term
|
Definition
This tells R you want the density (not the frequency) |
|
|
Term
hist(test.pct,breaks=brk,freq=T) |
|
Definition
syntax to put absolute values on the y-axis |
|
|
Term
hist(test.pct,breaks=brk,freq=F) |
|
Definition
syntax used to tell R that you want the density on the
y-axis |
|
|
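A sketch tying together the break variable and the freq option. The values in `test.pct` are made up, and plot=FALSE is used so the numbers behind the histogram can be inspected without opening a plot window.

```r
# Hypothetical test percentages; all values must fall inside the break range.
test.pct = c(52, 58, 63, 67, 71, 74, 78, 82, 88, 95)

# seq(50, 100, 5) gives 50, 55, ..., 100: eleven break points, ten bins.
brk = seq(50, 100, 5)

# plot=FALSE returns the histogram object without drawing it.
h = hist(test.pct, breaks = brk, plot = FALSE)

h$counts    # absolute frequencies (what freq=T puts on the y-axis)
h$density   # densities (what freq=F puts on the y-axis)
```

Dropping plot=FALSE and adding freq=T or freq=F draws the histogram with the corresponding y-axis.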
Term
|
Definition
to create a boxplot simply use the command boxplot followed by the name of the variable containing the data we wish to look at. |
|
|
Term
|
Definition
another option to check the normality of your data is the Q-Q plot. This plot involves plotting observed vs. theoretical quantiles on the x- and y-axes (hence Q-Q) |
|
|
Term
|
Definition
plot - tells R to create a scatterplot
Y - is the outcome variable (replace Y with name of the variable where your outcome data is stored)
X - is the predictor variable (replace X with the name of the variable where your predictor is stored) |
|
|
Term
|
Definition
This option will provide a wide range of information
It will provide you with the
Min and Max scores
Mean and Median
1stQu and 3rdQu (1st and 3rd Quartiles) |
|
|
Term
|
Definition
tells R to create a histogram with the data of a given variable
(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable) |
|
|
Term
|
Definition
tells R to report the middle score for a given variable
(NAME) specifies the variable to be analyzed (you will need to change NAME to the name of the variable) |
|
|
Term
|
Definition
this is the syntax used to produce information about the variance
(This is information that is not included in the output produced by the summary option) |
|
|
Term
|
Definition
this is the syntax used to produce the information about the standard deviation
(This is information that is not included in the output produced by the summary option)
|
|
|
Term
|
Definition
command used to handle missing data points, used in conjunction with the mean or sd command:
mean(NAME,na.rm=T)
or
sd(NAME,na.rm=T) |
|
|
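A short sketch of the effect of na.rm on a made-up variable with one missing value.

```r
# Hypothetical variable with one missing value (NA).
NAME = c(4, 8, NA, 6)

mean(NAME)                # NA, because a missing value is present
mean(NAME, na.rm = TRUE)  # 6, the mean of 4, 8, and 6
sd(NAME, na.rm = TRUE)    # 2
```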
Term
tapply(NAME,group.var,desc.stat) |
|
Definition
tapply - tells R that you want information from a variable broken into groups
NAME - the name of the variable you want descriptive information on (replace with the actual name of the variable where the data is stored)
group.var - the name of the variable containing the group information (replace with the actual name...)
desc.stat - the specific descriptive statistic you want reported (replace with the actual name...) |
|
|
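A sketch of tapply with made-up data; `score` and `group` are hypothetical names, and mean stands in for desc.stat.

```r
# Hypothetical scores and the group each score belongs to.
score = c(10, 12, 14, 20, 22, 24)
group = c("a", "a", "a", "b", "b", "b")

# Mean score reported separately for each group.
group.means = tapply(score, group, mean)
group.means
```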
Term
|
Definition
When you create a factor for a variable, it is recommended that you prefix its name with F or f, as in FNAME or fsex. |
|
|
Term
FNAME=factor(NAME,labels=c("label1","label2")) |
|
Definition
FNAME - is the factor name for a given numeric variable (replace with the actual factor name)
factor - tells R you want to create a factor for a variable
NAME - specifies where the numeric data is(are) from (replace with actual numerical variable name)
labels=c - tells R that you are defining the labels for each numerical value.
"label1","label2" - labels for each numerical value (replace with your desired labels.)
|
|
|
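A sketch of creating a factor. The variable `sex`, its 1/2 coding, and the labels are assumptions made for illustration.

```r
# Hypothetical numeric coding: 1 and 2 stand for two groups.
sex = c(1, 2, 2, 1)

# Labels are assigned in the order of the sorted numeric values (1 first, then 2).
fsex = factor(sex, labels = c("male", "female"))

levels(fsex)
table(fsex)
```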
Term
|
Definition
The syntax to calculate Pearson's r is found in the stats package
cor - tells R to report the correlation coefficient between 2 numeric variables
x - is the predictor variable (use actual name of your predictor variable)
y - is the outcome variable (use actual name of your outcome variable) |
|
|
Term
|
Definition
to find t-, p-value and additional information |
|
|
Term
|
Definition
This is the syntax used to run a linear regression
the main difference is that cor.test is replaced with lm
lm - tells R that we are interested in constructing a linear model for our data.
although x and y have the same meaning in this syntax, they are entered in reverse order and separated by a ~ (tilde)
the lm command is located in the stats package. |
|
|
Term
|
Definition
Used to get a more complete report of your analysis
At the bottom of output will be
F-statistic: x.xxx on X and X DF, p-value: z.zzzz
x,X, and z are really numbers
|
|
|
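A sketch of the correlation and regression commands side by side, using made-up predictor (`x`) and outcome (`y`) values. Note the reversed order in lm: the outcome comes first, then the tilde, then the predictor.

```r
# Hypothetical predictor (x) and outcome (y).
x = c(1, 2, 3, 4, 5, 6)
y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

cor(x, y)        # Pearson's r
cor.test(x, y)   # adds the t statistic, p-value, and confidence interval

# Linear model: outcome ~ predictor.
model = lm(y ~ x)

# The last line of the summary output reports the F-statistic and its p-value.
summary(model)
```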
Term
|
Definition
Remember: if your p-value is lower than .05, the model is statistically significant, meaning it fits the data better than a model with no predictors.
If it is higher than .05, you fail to reject the null hypothesis |
|
|