Term
|
Definition
- Dependent Variable
- outcome variable on which comparisons are made
|
|
|
Term
|
Definition
- Independent variable
- defines the groups to be compared with respect to values of the response variable
- ex. response / explanatory
- Blood alcohol level / beers consumed
- Grade on test / Amount of study time
- Yield of corn per bushel / amount of rainfall
|
|
|
Term
|
Definition
- exists between 2 variables if particular value for one variable is more likely to occur with certain values of the other variable
|
|
|
Term
|
Definition
- Display 2 categorical variables
- Row lists categories of one variable
- Columns list categories of other variable
- Entries are frequencies
|
|
|
Term
|
Definition
- Proportions computed within each category of the other variable (they depend on that variable)
- EX. proportion of organic food w/ pesticide and proportion of conventional food w/ pesticide
|
|
|
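The two cards above (a table of frequencies, then proportions within each category) can be sketched in a few lines of Python. The counts are made up for illustration and are not from any study; `contingency_table` and `conditional_proportions` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical observations: (food type, pesticide status). Made-up data.
observations = (
    [("organic", "present")] * 3 + [("organic", "absent")] * 7
    + [("conventional", "present")] * 8 + [("conventional", "absent")] * 2
)

def contingency_table(pairs):
    """Frequency of each (row category, column category) combination."""
    return Counter(pairs)

def conditional_proportions(pairs):
    """Proportion of each column category within each row category."""
    counts = contingency_table(pairs)
    row_totals = Counter(row for row, _ in pairs)
    return {(row, col): n / row_totals[row] for (row, col), n in counts.items()}

props = conditional_proportions(observations)
print("organic w/ pesticide:", props[("organic", "present")])
print("conventional w/ pesticide:", props[("conventional", "present")])
```

If the two proportions differ noticeably, as they do in this made-up data, that suggests an association between food type and pesticide status.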
Term
3 types of cases while investigating association between 2 variables |
|
Definition
- Both categorical (ex. food type / pesticide status)
- One Quantitative / One categorical
- Both quantitative
|
|
|
Term
Constructing scatterplots |
|
Definition
- Horizontal axis: explanatory variable X
- Vertical Axis: response variable Y
|
|
|
Term
|
Definition
- High values of X tend to occur w/ high values of Y
- Low values of X tend to occur w/ low values of Y
|
|
|
Term
|
Definition
- High values of one variable tend to pair w/ low values of the other variable
|
|
|
Term
|
Definition
- Measures strength and direction of linear association btwn X and Y
- Positive r value = positive association
- Negative r value = negative association
- r value close to +/- 1 = strong linear association
- r value close to 0 = weak linear association (nonlinear relationship may exist)
|
|
|
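A minimal sketch of computing r, assuming the usual textbook formula (the sum of products of z-scores divided by n - 1), which the card itself does not state. The function name `correlation` and the data are illustrative only.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation r: sum of z_x * z_y over all points, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return sum(z_products) / (n - 1)

# Hypothetical data: hours of study (x) and test grade (y).
hours = [1, 2, 3, 4, 5]
grade = [55, 60, 70, 72, 85]
print(correlation(hours, grade))   # positive: high x tends to go w/ high y
print(correlation(grade, hours))   # same value w/ the roles swapped
```

Because r is built from z-scores, rescaling either variable (for example, hours to minutes) leaves it unchanged.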
Term
Properties of Correlation |
|
Definition
- Btwn +/- 1
- +1 = perfect positive linear association
- -1 = perfect negative linear association
- Unit-less measure
- 2 variables have same correlation no matter which is treated as response variable
- not resistant to outliers
- does not depend on variable units
- only measures strength of linear relationship
|
|
|
Term
|
Definition
- graph divided into 4 regions
- Horizontal line at mean of y
- Vertical line at mean of x
|
|
|
Term
|
Definition
- Y denotes response variable
- X denotes explanatory variable
- straight line that describes how response variable changes as explanatory variable changes
- predicts value of response variable for a given value of explanatory variable
|
|
|
Term
|
Definition
- measures size of prediction errors
- vertical distance btwn point and regression line
- Residual = y - ŷ (observed y minus predicted y, "y hat")
- Large residual = unusual observation
|
|
|
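As a small illustration of the residual definition above, the sketch below computes y - ŷ for each point against a given line. Both the data and the line ŷ = 1 + 2x are hypothetical, chosen so one point stands out.

```python
def residuals(xs, ys, intercept, slope):
    """Observed y minus predicted y (y hat) for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and a hypothetical line y_hat = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 12, 9]
res = residuals(xs, ys, intercept=1, slope=2)

# The point w/ the largest absolute residual is the most unusual observation.
largest = max(range(len(res)), key=lambda i: abs(res[i]))
print(res, "most unusual point:", (xs[largest], ys[largest]))
```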
Term
Least Squares Method
Yields the regression line |
|
Definition
- Minimizes Σ(residuals)^2, the sum of squared residuals
- Least squares regression line = line that minimizes sum of squared vertical distances btwn points and their predictions
- Slope b = r(Sy / Sx)
- Y-Intercept a = (Average of Y) - b * (Average of X)
|
|
|
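The slope and intercept formulas on this card can be applied directly. The sketch below assumes sample standard deviations (dividing by n - 1) for Sx and Sy; the function name `least_squares_line` and the data are made up for illustration.

```python
from math import sqrt

def least_squares_line(xs, ys):
    """Return (intercept, slope) using slope = r * (Sy / Sx)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)            # correlation
    s_x = sqrt(sxx / (n - 1))            # sample std. dev. of x
    s_y = sqrt(syy / (n - 1))            # sample std. dev. of y
    slope = r * (s_y / s_x)
    intercept = mean_y - slope * mean_x  # line passes through (mean x, mean y)
    return intercept, slope

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y_hat = {a:.2f} + {b:.2f} * x")
```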
Term
|
Definition
- Regression equation prediction better when there is a strong linear association
- Better than using only sample mean
- proportional reduction in error = r^2
- Measures proportion of variation in y-values that is accounted for by linear relationship of y w/ x
- ex. correlation of 0.9 means 81% of variation in y-values can be explained by linear relationship w/ explanatory variable (0.9^2 = 0.81)
|
|
|
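To make "proportional reduction in error" concrete, the sketch below compares the squared errors from predicting every y with the sample mean against the squared errors from the regression line. The data are hypothetical, and the slope is computed as Sxy / Sxx, which is algebraically the same as r(Sy / Sx).

```python
def r_squared(xs, ys):
    """(error using mean only - error using regression line) / error using mean only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equivalent to r * (Sy / Sx)
    intercept = mean_y - slope * mean_x
    # Sum of squared errors when predicting w/ the sample mean of y.
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    # Sum of squared residuals when predicting w/ the regression line.
    ss_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return (ss_mean - ss_line) / ss_mean

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(r_squared(xs, ys))
```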
Term
|
Definition
- Using regression line to predict y-values for x-values outside observed range of data
|
|
|
Term
|
Definition
- Observation that lies far away from trend that rest of data follow
|
|
|
Term
Observation is influential if... |
|
Definition
- x value is relatively low/high compared to remainder of data
- observation is a regression outlier
- influential observations tend to pull regression line toward that data point and away from rest.
|
|
|
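One way to see the pull of an influential observation is to fit the line with and without it. The sketch below uses made-up data and a hypothetical extra point (20, 0) whose x value is far above the rest and which falls well off the trend.

```python
def slope_of(xs, ys):
    """Least squares slope, computed as Sxy / Sxx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data following an upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope_without = slope_of(xs, ys)

# Add one hypothetical point w/ an unusually high x that is far from the trend.
slope_with = slope_of(xs + [20], ys + [0])

print("slope without the point:", slope_without)
print("slope with the point:   ", slope_with)
```

In this made-up example the single added point is enough to flip the sign of the slope.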
Term
|
Definition
- Unobserved variable that influences association btwn variables of primary interest
- Not measured in study
|
|
|
Term
|
Definition
- 2 explanatory variables both associated w/ a response variable but also associated with each other
|
|
|
Term
|
Definition
- When direction of association between 2 variables changes after including 3rd variable and analyzing data at separate levels of that variable
|
|
|
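The reversal described on the last card can be reproduced with a small table of counts. The numbers below are entirely hypothetical (two treatments, A and B, at two levels of a third variable) and are chosen only to show the effect.

```python
# Hypothetical counts: (successes, trials) for each treatment at each level
# of a third variable. Made-up numbers for illustration.
data = {
    "mild":   {"A": (9, 10),   "B": (80, 100)},
    "severe": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Proportion of successes."""
    return successes / trials

def overall_rate(treatment):
    """Success proportion after combining all levels of the third variable."""
    successes = sum(level[treatment][0] for level in data.values())
    trials = sum(level[treatment][1] for level in data.values())
    return successes / trials

# Within each level, A has the higher success proportion...
for name, level in data.items():
    print(name, "A:", rate(*level["A"]), "B:", rate(*level["B"]))

# ...but when the levels are combined, B has the higher proportion.
print("overall A:", overall_rate("A"), "B:", overall_rate("B"))
```

The direction flips because the two treatments were applied to very different mixes of mild and severe cases in this made-up data.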