Term
Bivariate descriptive statistics |
|
Definition
are used to describe relationships between two variables. Examples: height and weight; smoking status and lung cancer incidence |
|
|
Term
Appropriate statistic depends on the |
|
Definition
variables’ level of measurement |
|
|
Term
Crosstabulation |
|
Definition
Researchers crosstabulate the frequencies of all categories of two variables in a two-dimensional frequency distribution |
|
|
Term
Crosstabulated variables should be |
|
Definition
nominal level (or ordinal level with a small number of categories) |
|
|
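The crosstabulation described in the cards above can be sketched in a few lines. This is a minimal illustration only: the records are made-up data, and `crosstab` is a hypothetical helper name, not a function from any particular statistics package.

```python
from collections import Counter

# Hypothetical records: (smoking status, lung cancer outcome) for 8 people.
records = [
    ("smoker", "cancer"), ("smoker", "no cancer"), ("smoker", "cancer"),
    ("nonsmoker", "no cancer"), ("nonsmoker", "no cancer"), ("nonsmoker", "cancer"),
    ("smoker", "no cancer"), ("nonsmoker", "no cancer"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}

table = crosstab(records)
for row, cells in table.items():
    print(row, cells)
```

Both variables here are nominal with two categories each, which is the situation the cards recommend for crosstabulation.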
Term
Risk indexes |
|
Definition
have been developed to describe risk outcomes and facilitate clinical decision making |
|
|
Term
Risk indexes capture two aspects of the effects of risk exposure |
|
Definition
absolute and relative risk |
|
|
Term
Absolute risk |
|
Definition
Indexes quantify the actual amount of risk related to different exposures |
|
|
Term
Relative risk |
|
Definition
Indexes compare risks in the two risk exposure groups |
|
|
Term
|
Definition
is the proportion of people with a negative outcome |
|
|
Term
|
Definition
is the absolute difference between the two risk groups |
|
|
Term
Relative risk (RR) |
|
Definition
is the ratio of absolute risks (adverse outcomes) in the two groups |
|
|
Term
|
Definition
is the proportion of people in each risk group who have the adverse outcome, relative to the proportion who do not |
|
|
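The risk indexes defined in the cards above can be worked through on a 2×2 table. The sketch below uses made-up counts, and the names `absolute_risk`, `arr`, `rr`, and `odds_ratio` are labels chosen for illustration (they follow common usage, not wording taken from the cards).

```python
# Hypothetical 2x2 table of counts (not real data):
#                 adverse outcome   no adverse outcome
# exposed               20                 80
# not exposed           10                 90
exposed_adverse, exposed_ok = 20, 80
unexposed_adverse, unexposed_ok = 10, 90

def absolute_risk(adverse, ok):
    """Proportion of people in a group who have the adverse outcome."""
    return adverse / (adverse + ok)

def odds(adverse, ok):
    """People with the adverse outcome relative to people without it."""
    return adverse / ok

ar_exposed = absolute_risk(exposed_adverse, exposed_ok)        # 0.20
ar_unexposed = absolute_risk(unexposed_adverse, unexposed_ok)  # 0.10

arr = ar_exposed - ar_unexposed   # absolute difference between the groups
rr = ar_exposed / ar_unexposed    # ratio of the two absolute risks
odds_ratio = odds(exposed_adverse, exposed_ok) / odds(unexposed_adverse, unexposed_ok)

print(f"AR exposed={ar_exposed:.2f}, AR unexposed={ar_unexposed:.2f}")
print(f"ARR={arr:.2f}, RR={rr:.2f}, OR={odds_ratio:.2f}")
```

With these counts the ratio of risks and the ratio of odds come out close to each other, which tends to happen when the adverse outcome is uncommon in both groups.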
Term
In many cases, the value of RR and OR |
|
Definition
|
|
Term
A correlation |
|
Definition
is a bond or connection between variables |
|
|
Term
Correlations between two quantitative variables can be graphed in |
|
Definition
scatterplots |
|
|
Term
A scatterplot |
|
Definition
graphs the values of one variable on the X axis and the values of the second one on the Y axis of a graph |
|
|
Term
A scatterplot indicates whether the variables have a |
|
Definition
linear relationship with each other |
|
|
Term
|
Definition
direction and magnitude of the relationship |
|
|
Term
Curvilinear relationship |
|
Definition
Sometimes data points are not linearly related—they are positively or negatively correlated, but only up to a point, then the relationship changes |
|
|
Term
A correlation coefficient |
|
Definition
is a statistic that summarizes the magnitude and direction of relationships between two variables |
|
|
Term
Most widely used correlation coefficient |
|
Definition
Pearson's r (the product-moment correlation coefficient) |
|
|
Term
Pearson's r |
|
Definition
is computed with variables that are interval- or ratio-level measures |
|
|
Term
1.00 = perfect positive relationship (e.g., a flat $1 tax for every $5 earned); .35 = weak/moderate positive relationship (e.g., nurses' degree of autonomy and job satisfaction: those with more autonomy are somewhat more satisfied); .00 = no relationship (e.g., nurses' degree of autonomy and height: tall and short nurses are equally autonomous); -.20 = weak negative relationship (e.g., diabetic knowledge and a person's age: older people are somewhat less knowledgeable); -.70 = strong negative relationship (e.g., levels of depression and life satisfaction: those with high levels of depression have lower life satisfaction) |
|
Definition
|
|
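A correlation coefficient for two interval- or ratio-level variables can be computed directly from its defining sums. The sketch below assumes the product-moment (Pearson) formula and uses made-up values modeled on the flat-tax example; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a flat $1 tax for every $5 earned.
earned = [5, 10, 15, 20]
tax = [1, 2, 3, 4]

r_perfect = pearson_r(earned, tax)
r_reversed = pearson_r(earned, list(reversed(tax)))
print(r_perfect, r_reversed)
```

The sign of the result gives the direction of the relationship and its absolute size gives the magnitude; neither says anything about which variable, if either, causes the other.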
Term
A correlation between two variables never implies |
|
Definition
that one variable caused the other |
|
|
Term
Inferential statistics |
|
Definition
for coming to conclusions about what is probably true in a population, based on sample values |
|
|
Term
Descriptive |
|
Definition
statistics—for describing samples |
|
|
Term
Inferential statistics uses the __________ to provide guidance on what is probably true |
|
Definition
laws of probability |
|
|
Term
|
Definition
is that the deck is fair—not “rigged” |
|
|
Term
|
Definition
|
|
Term
Probability distributions |
|
Definition
are similar to frequency polygons (or histograms) |
|
|
Term
They graph the probabilities of |
|
Definition
all events that could occur |
|
|
Term
Probability density function = |
|
Definition
Probability distribution for continuous variables. Example: a distribution of IQ scores for a population of 10,000 10-year-old children, with population mean μ = 100.0 and population SD σ = 15.0 |
|
|
Term
A ________________ is the distribution of an infinite number of sample means from the population, for samples of a given size |
|
Definition
sampling distribution of the mean |
|
|
Term
________, a mathematical formulation, shows that the mean of a sampling distribution of the mean always equals the population mean |
|
Definition
The central limit theorem |
|
|
Term
If population values are normally distributed |
|
Definition
so is the sampling distribution of the mean |
|
|
Term
The ________ is the standard deviation of a theoretical sampling distribution |
|
Definition
standard error of the mean (SEM) |
|
|
Term
|
Definition
the less likely it is that a sample mean is a good estimate of the population mean |
|
|
Term
|
Definition
known, but can be estimated |
|
|
Term
|
Definition
the samples’ standard deviation |
|
|
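The ideas in the cards above (sampling distribution of the mean, standard error of the mean, and estimating the SEM from a sample's standard deviation) can be checked with a small simulation. This sketch assumes a normally distributed population with the IQ values used earlier (mean 100, SD 15); the sample size, number of samples, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics

MU, SIGMA = 100.0, 15.0   # assumed population mean and SD (IQ example)
SAMPLE_SIZE = 25          # hypothetical sample size
N_SAMPLES = 5000          # finite stand-in for an "infinite" number of samples

rng = random.Random(42)   # fixed seed so the sketch is repeatable

def draw_sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))

sample_means = [draw_sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)     # should be close to MU
empirical_sem = statistics.pstdev(sample_means)    # SD of the sample means
theoretical_sem = SIGMA / math.sqrt(SAMPLE_SIZE)   # sigma / sqrt(n) = 3.0

# In practice only one sample is available, so the SEM is estimated
# from that sample's standard deviation.
one_sample = [rng.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]
estimated_sem = statistics.stdev(one_sample) / math.sqrt(SAMPLE_SIZE)

print(mean_of_means, empirical_sem, theoretical_sem, estimated_sem)
```

Raising `SAMPLE_SIZE` shrinks the SEM, so sample means cluster more tightly around the population mean.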
Term
Statistical inference: two basic approaches |
|
Definition
Parameter estimation; hypothesis testing |
|
|
Term
Parameter estimation |
|
Definition
is used to estimate a population value—e.g., a mean, percentage, or odds ratio |
|
|
Term
|
Definition
A point estimate; an interval estimate |
|
|
Term
A ________ involves the calculation of a single value as the estimate of the parameter |
|
Definition
point estimate |
|
|
Term
A point estimate is thus simply the |
|
Definition
value of the descriptive statistic, like a mean |
|
|
Term
An _______________ provides a range of values within which the population value has a specified probability of lying |
|
Definition
interval estimate |
|
|
Term
Interval estimation involves constructing ___________ around the point estimate |
|
Definition
a confidence interval (CI) |
|
|
Term
A 95% _________ designates the range of values within which the parameter has a 95% probability of lying |
|
Definition
confidence interval (95% CI) |
|
|
Term
Constructing a CI involves calculating |
|
Definition
confidence limits (the upper and lower limit of what is probable, at the specified probability level) |
|
|
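Constructing confidence limits around a mean can be sketched as point estimate plus or minus a multiplier times the estimated standard error. The example below is an approximation under stated assumptions: the scores are made-up data, `mean_ci` is a hypothetical helper, and the multiplier comes from the normal distribution because the Python standard library has no t distribution. For a small sample like this one, limits based on the t distribution would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample of 10 scores (not real data).
scores = [98, 105, 110, 92, 101, 115, 99, 104, 108, 96]

def mean_ci(values, confidence=0.95):
    """Return (point estimate, lower limit, upper limit) for the mean.

    Uses a normal-distribution multiplier as an approximation.
    """
    n = len(values)
    point_estimate = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)   # estimated standard error
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return point_estimate, point_estimate - z * sem, point_estimate + z * sem

mean95, lower95, upper95 = mean_ci(scores, 0.95)
mean99, lower99, upper99 = mean_ci(scores, 0.99)
print(f"mean={mean95:.1f}, 95% CI=({lower95:.1f}, {upper95:.1f})")
print(f"mean={mean99:.1f}, 99% CI=({lower99:.1f}, {upper99:.1f})")
```

Asking for a higher confidence level widens the interval, while a larger sample narrows it through a smaller standard error.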
Term
The most commonly reported CIs are |
|
Definition
|
|
Term
The ___________ is similar to a normal distribution—bell shaped and symmetric |
|
Definition
t distribution |
|
|
Term
|
Definition
|
|
Term
|
Definition
computed around proportions/percentages and risk indexes like Relative Risk and the Odds Ratio |
|
|
Term
The theoretical distribution for constructing CIs in these scenarios is the |
|
Definition
binomial distribution |
|
|
Term
CIs around proportions and risk indexes are rarely |
|
Definition
|
|
Term
Like the CI around a mean, the larger the sample size |
|
Definition
|
|
Term
|
Definition
|
|
Term
(second broad approach to statistical inference) uses laws of probability to help researchers make objective decisions about accepting or rejecting a null hypothesis |
|
Definition
Hypothesis testing |
|
|
Term
In most cases, ________________ states a prediction that variables in the study are NOT related, e.g.: cigarette smoking is unrelated to lung cancer; turning patients is unrelated to the incidence of pressure ulcers |
|
Definition
the null hypothesis |
|
|
Term
The null hypothesis contrasts with researchers' _____________, which typically states a prediction that variables in the study ARE related, e.g.: cigarette smoking is related to lung cancer; turning patients is related to the incidence of pressure ulcers |
|
Definition
actual research hypothesis |
|
|
Term
is similar to the English-based criminal justice system: the accused is assumed to be innocent |
|
Definition
|
|
Term
Error Risk in Hypothesis Tests |
|
Definition
Without data from the population, researchers make decisions about accepting or rejecting the null hypothesis based on incomplete information |
|
|
Term
|
Definition
|
|
Term
The null hypothesis is really true in the population, and the researcher accepts it as true |
|
Definition
|
|
Term
The null hypothesis is really false in the population, and the researcher rejects it |
|
Definition
|
|
Term
The null hypothesis is really true in the population, but the researcher rejects it (a false positive). E.g., an ineffective intervention is erroneously considered effective |
|
Definition
Type I error |
|
|
Term
The null hypothesis is really false in the population, but the researcher accepts it (a false negative). E.g., an effective intervention is erroneously considered ineffective |
|
Definition
Type II error |
|
|
Term
Type I errors are controlled through |
|
Definition
the level of significance, the probability accepted as the risk of a false positive |
|
|
Term
The ______________ is the area in the theoretical probability distribution corresponding to a rejection of the null hypothesis |
|
Definition
level of significance or alpha (α) |
|
|
Term
The probability of committing a Type II error is called |
|
Definition
beta (β) |
|
|
Term
Researchers cannot control β like they can control α, but they |
|
Definition
can take steps to reduce the risk of β (to increase power) |
|
|
Term
The most straightforward way to increase power is to |
|
Definition
increase the sample size |
|
|
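The link between β, power, and sample size can be made concrete with a simplified calculation. The sketch below assumes a one-tailed z test with a known population SD, which is a convenience for illustration rather than a test named in the cards; `power_one_tailed_z` and its parameter names are hypothetical, and the effect size and SD are made-up values.

```python
import math
from statistics import NormalDist

def power_one_tailed_z(effect, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-tailed z test for a true mean shift `effect`.

    Assumes a normal population with known SD `sigma` and sample size `n`.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # boundary of the critical region
    shift = effect * math.sqrt(n) / sigma      # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)  # chance the statistic lands beyond z_crit

# Hypothetical scenario: true effect of 5 points, population SD of 15.
for n in (25, 50, 100):
    power = power_one_tailed_z(effect=5, sigma=15, n=n)
    print(f"n={n}: power={power:.2f}, beta={1 - power:.2f}")
```

Holding α and the effect fixed, a larger sample shrinks the standard error, which lowers β and raises power.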
Term
Researchers calculate a _______ using their sample data |
|
Definition
test statistic |
|
|
Term
They reject the null hypothesis if the test statistic falls |
|
Definition
at or beyond a critical region on the theoretical distribution for their test statistic; they accept the null hypothesis otherwise |
|
|
Term
When the null hypothesis is rejected, the results are |
|
Definition
statistically significant |
|
|
Term
If the null hypothesis is retained (whenever p > .05), |
|
Definition
the results are statistically nonsignificant |
|
|
Term
A statistically significant result is one that has a high probability of being |
|
Definition
“real” in the population, and probably does not merely reflect a chance fluctuation |
|
|
Term
Statistical significance does not mean the result is |
|
Definition
important, relevant, or clinically meaningful |
|
|
Term
A _______ is one that uses both tails of a sampling distribution to determine the critical region (the region for rejecting the null hypothesis) |
|
Definition
two-tailed test |
|
|
Term
A ________________ is one that uses only one tail of a sampling distribution in determining the critical region |
|
Definition
one-tailed test |
|
|
Term
A one-tailed test may be appropriate if |
|
Definition
the alternative hypothesis is directional |
|
|
Term
Two-tailed tests are more conservative (have less statistical power) than |
|
Definition
one-tailed tests, but researchers should have a strong justification for looking in only one tail |
|
|
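The difference between one-tailed and two-tailed tests shows up directly in the p value. The sketch below assumes a one-sample z test with a known population SD (chosen so that only the Python standard library is needed); the sample mean, sample size, and the helper name `z_test` are made up for illustration.

```python
import math
from statistics import NormalDist

ALPHA = 0.05  # level of significance chosen before looking at the data

def z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p, upper one-tailed p) for a one-sample z test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))  # test statistic
    std_normal = NormalDist()
    p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))   # both tails count as extreme
    p_one_tailed = 1 - std_normal.cdf(z)              # only the upper tail counts
    return z, p_two_tailed, p_one_tailed

# Hypothetical study: 25 children with a mean IQ of 105.4,
# tested against a null hypothesis mean of 100 with SD 15.
z, p_two, p_one = z_test(sample_mean=105.4, mu0=100, sigma=15, n=25)

reject_two_tailed = p_two <= ALPHA
reject_one_tailed = p_one <= ALPHA
print(f"z={z:.2f}, two-tailed p={p_two:.3f}, one-tailed p={p_one:.3f}")
print("two-tailed:", "reject" if reject_two_tailed else "retain", "the null hypothesis")
print("one-tailed:", "reject" if reject_one_tailed else "retain", "the null hypothesis")
```

With the same data, the one-tailed test rejects the null hypothesis and the two-tailed test retains it, which is why looking in only one tail needs a strong justification stated in advance.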
Term
An ____________________ is a condition relating to the population that is accepted as being true without proof |
|
Definition
assumption for statistical tests |
|
|
Term
Most tests assume random sampling _______________. This assumption is widely ignored. Ideally, though, samples are reasonably representative of the populations from which they are drawn |
|
Definition
|
|
Term
Parametric statistics |
|
Definition
Involves estimating a population parameter; typically assumes the dependent variable is normally distributed in the population; has a dependent variable that is measured on an interval (or approximately interval) or ratio scale |
|
|
Term
Nonparametric statistics |
|
Definition
Does not involve estimating a population parameter; makes no assumptions about how the dependent variable is distributed in the population (so these are sometimes called distribution-free statistics); often involves a dependent variable that is measured on an ordinal or nominal scale |
|
|
Term
Advantages of nonparametric statistics |
|
Definition
Easier to compute, no need to worry about distributional assumptions |
|
|
Term
Advantage of parametric statistics |
|
Definition
More powerful (all else equal, they have lower probability of a Type II error) |
|
|
Term
|
Definition
Used when groups being compared are different, unrelated people |
|
|
Term
|
Definition
Used when groups being compared are the same people |
|
|
Term
A concept widely used in statistical testing; refers to the number of components that are free to vary around a parameter |
|
Definition
degrees of freedom (df) |
|
|
Term
|
Definition
1. Select the test statistic (which depends on a number of factors, like the number of groups being compared); 2. Specify the level of significance (α); 3. Decide on a one-tailed versus two-tailed test; 4. Calculate the test statistic using appropriate formulas; 5. Calculate degrees of freedom; 6. Compare the test statistic to the tabled value for the appropriate df and α; 7. Decide whether to accept or reject the null hypothesis |
|
|
Term
|
Definition
1. Select the test statistic; 2. Specify the level of significance (α); 3. Decide on a one-tailed versus two-tailed test; 4. Instruct the computer accordingly; 5. The computer will calculate the test statistic, df, and the actual probability level |
|
|