Shared Flashcard Set

Details

Title

Basic Practice of Statistics

Description

key terms, concepts

Total Cards

Subject

Mathematics

Level

Undergraduate 1

Created

10/23/2008

Term

Resistant Measure

Definition

a measure that can resist the influence of extreme observations

e.g. median

Term

Median

Definition

midpoint of a distribution, i.e. the number such that half the observations are smaller and the other half are larger; in the ordered list of n observations it is at position (n+1)/2
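
A minimal Python sketch of the (n+1)/2 rule. The function name `median_by_position` is illustrative and not from the card. When n is even, position (n+1)/2 falls halfway between two observations, and the sketch assumes the usual convention of averaging them.

```python
def median_by_position(values):
    """Median as the value at 1-based position (n+1)/2 of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: (n+1)/2 is a whole number, so one observation sits in the middle.
        return ordered[mid]
    # Even n: (n+1)/2 ends in .5, so average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_position([5, 1, 9, 3, 7]))  # 5
print(median_by_position([5, 1, 9, 3]))     # 4.0
```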

Term

Quartiles (Traditional)

Definition

1st Quartile (Q1) is larger than 25% of the observations
2nd Quartile (Q2) = median
3rd Quartile (Q3) is larger than 75% of the observations

Term

Quartiles (Freund/Perles)

Definition

the lower quartile (Q1) is the ¼(n+3)th observation

the second quartile (median) is the ½(n+1)th observation

the upper quartile (Q3) is the ¼(3n+1)th observation
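
A sketch of these three position formulas in Python. The helper names are illustrative, and the handling of fractional positions by linear interpolation between neighbouring observations is an assumption of this sketch rather than something stated on the card.

```python
def value_at_position(ordered, pos):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    lower = int(pos)      # whole part of the position
    frac = pos - lower    # fractional part
    if frac == 0:
        return ordered[lower - 1]
    # Assumed convention: interpolate linearly between the two neighbours.
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


def freund_perles_quartiles(values):
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 3) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, (3 * n + 1) / 4)
    return q1, q2, q3


print(freund_perles_quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # (3, 5, 7)
print(freund_perles_quartiles([10, 20, 30, 40]))             # (17.5, 25.0, 32.5)
```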

Term

Choosing a Summary (center/spread)

Definition

The five-number summary is usually better than the mean and standard deviation for describing a skewed distribution or one with strong outliers

Term

Density Curve

Definition

A curve that has area exactly 1 underneath it. The area under the curve and above any range of values is the proportion of values that fall in that range

Term

Mean of skewed distribution

Definition

The mean of a skewed distribution is pulled toward the long tail

Term

Normal Curve/Distribution

Definition

Symmetric, single-peaked, and bell-shaped

Term

68-95-99.7

Definition

In a normal distribution, about 68% of values fall within 1 std dev of the mean
about 95% fall within 2 std dev of the mean
about 99.7% fall within 3 std dev of the mean
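
The three percentages can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+); the function name `share_within` is illustrative.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution within k std dev of its mean."""
    z = NormalDist(mu=0, sigma=1)  # standard normal; every normal gives the same shares
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))  # 0.6827, 0.9545, 0.9973
```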

Term

Standardization/Z-score

Definition

subtract the mean of the distribution from the value and divide by the standard deviation: z = (x - mean) / std dev

Term

Z-score

Definition

tells us how many standard deviations the original value falls away from the mean and in what direction
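
A short sketch of standardization in Python. The mean of 64 and standard deviation of 3 are made-up numbers for illustration only.

```python
def z_score(x, mean, std_dev):
    """Standardize x: how many std devs it lies from the mean, with sign."""
    return (x - mean) / std_dev


# Hypothetical distribution: mean 64, standard deviation 3.
print(z_score(70, 64, 3))  # 2.0  (two std devs above the mean)
print(z_score(61, 64, 3))  # -1.0 (one std dev below the mean)
```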

Term

Standard Normal Distribution

Definition

The normal distribution with mean 0 and standard deviation 1

Term

Behavior of Mean of Skewed Distribution

Definition

In a skewed curve the mean moves farther toward the long tail than the median does

Term

Five Number Summary

Definition

minimum, Q1, Q2 (median), Q3, maximum

Term

Behavior of Std Dev

Definition

s is zero when there is no spread and gets larger as spread increases

Term

Standard Deviation

Definition

sq root of the variance

Term

Variance

Definition

sum of the squared deviations of the observations from their mean, divided by the degrees of freedom (i.e. n-1)
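
A minimal Python sketch of the variance and standard deviation as defined on these cards, dividing by n-1. The function names are illustrative.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))


print(sample_variance([1, 2, 3, 4, 5]))  # 2.5
print(sample_std_dev([3, 3, 3]))         # 0.0, no spread at all
```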

Term

Interquartile Range

Definition

Q3 - Q1 (a suspected outlier lies more than 1.5 × IQR above Q3 or below Q1)
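
A sketch of the 1.5 × IQR rule in Python. It assumes one common quartile convention (Q1 and Q3 are the medians of the observations to the left and right of the overall median's position); other conventions give slightly different quartiles. The data values are made up.

```python
from statistics import median


def quartiles(values):
    """Q1, median, Q3 using the median-of-halves convention (needs n >= 2)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # left of the median's position
    upper = ordered[half + (n % 2):]    # right of the median's position
    return median(lower), median(ordered), median(upper)


def suspected_outliers(values):
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


data = [2, 4, 5, 6, 7, 8, 9, 30]   # hypothetical observations
print(quartiles(data))             # (4.5, 6.5, 8.5)
print(suspected_outliers(data))    # [30]
```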

Term

Response Variable

Definition

Measures outcome of a study

Term

Explanatory Variable

Definition

explains or influences changes in a response variable

Term

Scatterplot

Definition

Plot explanatory variable on x-axis and response variable on the y-axis

Term

Positively Associated

Definition

when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together

Term

Negatively Associated

Definition

when above-average values of one variable tend to accompany below-average values of the other, and vice versa

Term

Linear Relationship

Definition

when points in a scatter plot lie in a straight line pattern

Term

Correlation

Definition

r = (1/(n-1)) * sum of [(x - mean of x)/sx] * [(y - mean of y)/sy], i.e. the sum of the products of the standardized x and y values, divided by n-1
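
A minimal Python sketch of r computed from standardized values. The function name and the data are illustrative, not from the card.

```python
from math import sqrt


def correlation(xs, ys):
    """r = sum of products of standardized x and y values, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(((x - mean_x) / sx) * ((y - mean_y) / sy)
               for x, y in zip(xs, ys)) / (n - 1)


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 1, 4, 3, 5]
print(round(correlation(xs, ys), 4))                      # 0.8
print(round(correlation(ys, xs), 4))                      # 0.8, x and y can be swapped
print(round(correlation([100 * x for x in xs], ys), 4))   # 0.8, units do not matter
```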

Term

Correlation - fact 1

Definition

Correlation makes no distinction between x and y

Term

Correlation - fact 2

Definition

Because r uses standardized values, r doesn't change when we change the units of measurement of x, y, or both

Term

Correlation - fact 3

Definition

Positive r indicates positive association and negative r indicates negative association

Term

Correlation - fact 4

Definition

r is always between -1 and 1, and the strength of the relationship increases as r moves away from 0 in either direction (when r = +1 or -1 the points lie exactly on a straight line)

Term

Correlation - fact 5

Definition

correlation measures the strength of linear relationships only, not curved ones

Term

Correlation - fact 6

Definition

correlation is not resistant, i.e. it is strongly affected by outliers

Term

Regression Line

Definition

a straight line that describes how a response variable changes as an explanatory variable changes

Term

Least-squares regression line

Definition

the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
slope b = r*(sy/sx)
intercept a = (mean of y) - b*(mean of x)
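
A sketch of the slope and intercept formulas in Python, using `mean` and `stdev` from the standard library. The function name and the data are illustrative.

```python
from statistics import mean, stdev


def least_squares_line(xs, ys):
    """Return (slope, intercept) using slope = r*(sy/sx)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)   # sample std devs (divide by n - 1)
    r = sum((x - x_bar) * (y - y_bar)
            for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    slope = r * sy / sx
    intercept = y_bar - slope * x_bar
    return slope, intercept


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 1, 4, 3, 5]
slope, intercept = least_squares_line(xs, ys)
print(round(slope, 4), round(intercept, 4))  # 0.8 0.6
```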

Term

Slope and Correlation

Definition

along the regression line, a change of one std dev in x corresponds to a change of r std devs in y; in other words, as the correlation grows less strong, the prediction moves less in response to changes in x

Term

r^2

Definition

is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
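
One way to see this in Python: compute r^2 as 1 minus the unexplained share of the variation in y. The helper names and the data are illustrative; the slope is written as Sxy/Sxx, which is algebraically the same as r*(sy/sx).

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys):
    """Fraction of the variation in y explained by the regression of y on x."""
    slope, intercept = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    total = sum((y - y_bar) ** 2 for y in ys)
    unexplained = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    return 1 - unexplained / total


xs = [1, 2, 3, 4, 5]   # hypothetical data with r = 0.8
ys = [2, 1, 4, 3, 5]
print(round(r_squared(xs, ys), 4))  # 0.64
```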

Term

Residuals

Definition

The difference between an observed value of the response variable and the value predicted by the regression line
residual = obs y - predicted y

Term

Mean of least-squares residuals

Definition

is always zero
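
A small Python check of both residual cards: residual = observed y - predicted y, and the least-squares residuals average to zero (up to floating-point rounding). Names and data are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(xs, ys):
    """Observed y minus the y predicted by the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 1, 4, 3, 5]
res = residuals(xs, ys)
print([round(e, 2) for e in res])      # [0.6, -1.2, 1.0, -0.8, 0.4]
print(abs(sum(res) / len(res)) < 1e-9)  # True
```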

Term

Residual Plot

Definition

a scatterplot of the regression residuals against the explanatory variable

Term

Influential Points

Definition

a point, often one that is extreme in the x direction, that has a strong influence on the position of the regression line

Term

Outlier

Definition

observation that lies outside the overall pattern of the other observations

Term

Extrapolation

Definition

the use of a regression line for prediction far outside the range of values of the explanatory variable that were used to obtain the line

Term

Averaged Data

Definition

correlations based on averages are usually too high when applied to individuals

Term

Lurking Variable

Definition

a variable that has an important effect on the relationship among the variables in a study but is not included among the variables studied

Term

Nonsense Correlations

Definition

a correlation that is real but for which it makes no sense to conclude that changing one of the variables causes changes in the other - usually caused by a lurking variable

Term

Association <> Causation

Definition

an association between an explanatory variable and a response variable is not by itself good evidence that changes in x cause changes in y even if that association is strong

Term

Establishing Causation

Definition

Association is strong
Association is consistent
Higher doses are associated with stronger responses
Cause precedes effect in time
Cause is plausible

Term

Two-way Table

Definition

table of counts that describes two categorical variables

Term

Marginal Distributions

Definition

the distributions of the row variable alone and of the column variable alone, given by the row and column totals that appear at the right and bottom margins of a two-way table
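
A sketch of computing the margins of a two-way table in Python. The table, its labels, and its counts are hypothetical.

```python
# Hypothetical counts: rows are one categorical variable, columns the other.
table = {
    "freshman":  {"yes": 30, "no": 20},
    "sophomore": {"yes": 25, "no": 25},
}

# Row totals go in the right margin, column totals in the bottom margin.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {}
for cells in table.values():
    for col, count in cells.items():
        column_totals[col] = column_totals.get(col, 0) + count
grand_total = sum(row_totals.values())

# Marginal distributions: each total as a proportion of the grand total.
row_marginal = {r: t / grand_total for r, t in row_totals.items()}
column_marginal = {c: t / grand_total for c, t in column_totals.items()}

print(row_totals)       # {'freshman': 50, 'sophomore': 50}
print(column_totals)    # {'yes': 55, 'no': 45}
print(column_marginal)  # {'yes': 0.55, 'no': 0.45}
```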

Term

Simpson's Paradox

Definition

an association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group
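
A made-up numeric illustration in Python. The counts are hypothetical and chosen only to show the reversal: option A has the higher success rate within each group, yet option B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, total) counts for options A and B in two groups.
counts = {
    "group 1": {"A": (9, 10),   "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def group_rate(group, option):
    successes, total = counts[group][option]
    return successes / total


def combined_rate(option):
    successes = sum(counts[g][option][0] for g in counts)
    total = sum(counts[g][option][1] for g in counts)
    return successes / total


for g in counts:
    print(g, group_rate(g, "A"), group_rate(g, "B"))   # A is higher in each group
print(round(combined_rate("A"), 3), round(combined_rate("B"), 3))  # 0.355 0.745
```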

Term

Observational Study

Definition

observes individuals and measures variables of interest but does not attempt to influence the responses, e.g. a sample survey

Term

Experiment

Definition

study that deliberately imposes some treatment on individuals in order to observe their responses

Term

Confounding

Definition

when the effects of two variables (explanatory or lurking) on a response variable cannot be distinguished from each other

Term

Population

Definition

entire group of individuals we want info about

Term

Sample

Definition

subset of population that we actually examine in order to gather information

Term

Sample Design

Definition

method used to choose sample from population

Term

Voluntary Response Sample

Definition

sample where people choose themselves by responding to a general appeal. Biased b/c people with strong opinions, especially negative ones, are most likely to respond

Term

Convenience Sampling

Definition

sample design that chooses the individuals easiest to reach

Term

Bias

Definition

systematic error; i.e. sample design that favors certain outcomes

Term

Simple Random Sampling

Definition

consists of n individuals from a population chosen such that every set of n individuals has an equal chance to be selected
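
A minimal Python sketch of drawing an SRS with the standard library. `random.Random.sample` draws without replacement so that every set of n items is equally likely. The function name and the population labels are illustrative.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every set of size n is equally likely."""
    rng = random.Random(seed)   # a seed only makes the draw repeatable
    return rng.sample(list(population), n)


population = [f"person_{i}" for i in range(1, 21)]   # hypothetical population
sample = simple_random_sample(population, 5, seed=1)
print(sample)
```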

Term

Probability Sampling

Definition

sample technique that gives each member of the population a known chance of being selected

Term

Stratified Random Sample

Definition

divide the population into groups of similar individuals called strata, then choose a separate SRS from each stratum and combine these SRSs to form the full sample
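
A sketch of stratified random sampling in Python. The strata, their members, and the per-stratum sample sizes are hypothetical.

```python
import random


def stratified_sample(strata, per_stratum, seed=None):
    """Take an SRS from each stratum and combine them into one sample."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, per_stratum[name]))
    return sample


# Hypothetical strata of similar individuals.
strata = {
    "urban": [f"u{i}" for i in range(30)],
    "rural": [f"r{i}" for i in range(10)],
}
chosen = stratified_sample(strata, {"urban": 6, "rural": 2}, seed=1)
print(chosen)
```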

Term

Strata

Definition

groups of similar individuals within a population used in stratified random sampling

Term

Multi-stage Sampling

Definition

Stage 1: Divide population into groups and select a sample of the groups
Stage 2: Divide the groups selected in stage 1 into smaller areas called blocks and take a stratified sample of the blocks
Stage 3: Sort individuals from blocks into clusters and take random sample of clusters

Term

Undercoverage

Definition

when some groups in the population are left out of the process of choosing the sample, e.g. a phone survey leaves out the 6% w/o phones

Term

Nonresponse

Definition

when an individual chosen for the sample can't be contacted or refuses to cooperate

Term

Response Bias

Definition

bias caused by behavior of respondent or interviewer e.g. respondent lying, race or sex of interviewer

Term

Telescoping

Definition

bringing events in the past forward in memory to more recent time periods e.g. saw dentist 8 months ago and say yes to seeing dentist in the last 6 mos.

Term

Wording of Questions

Definition

wording of questions in sample surveys can introduce bias

Term

Sampling Frame

Definition

list of individuals from which a sample is selected

Term

Experimental Units

Definition

The individuals on which an experiment is done

Term

Subjects

Definition

the experimental units when dealing with human beings

Term

Treatment

Definition

experimental condition applied to the units

Term

Factors

Definition

the explanatory variable(s) in an experiment

Term

Level

Definition

a specific value of a factor; each experimental treatment combines one level of each factor

Term

Randomization

Definition

use of chance to divide experimental units into groups in an experiment

Term

Randomized Comparative Experiment

Definition

An experiment that uses both comparison and randomization

Term

Completely Randomized

Definition

experimental design where all experimental units are allocated at random among all treatments
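
A sketch of a completely randomized allocation in Python: shuffle all units by chance, then deal them out to the treatments in turn. The function name, unit labels, and treatment names are hypothetical, and equal-as-possible group sizes are an assumption of this sketch.

```python
import random


def completely_randomized(units, treatments, seed=None):
    """Allocate every experimental unit at random among the treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                 # chance decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):   # deal units out in turn
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


units = [f"unit_{i}" for i in range(1, 13)]   # hypothetical experimental units
groups = completely_randomized(units, ["treatment", "control"], seed=1)
print(groups)
```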

Term

Statistically Significant

Definition

An observed effect so large that it would rarely occur by chance

Flashcard Machine - create, study and share online flash cards