Term
statistical inference (inferential statistics) |
|
Definition
the statistician's process of generalizing results from a sample to a population |
|
|
Term
What does estimate accuracy depend on? |
|
Definition
1) How representative the sample is of the general population 2) The degree of sampling error |
|
|
Term
population parameter |
|
Definition
The feature or characteristic of a population whose value you want to determine (e.g. the percentage of the population with chlamydia) |
|
|
Term
sample statistic |
|
Definition
a value (such as a percentage) calculated from the sample |
|
|
Term
hypothesis testing |
|
Definition
to hypothesize that a population parameter has a particular value, and then see if the value of the corresponding sample statistic is compatible with your hypothesis |
|
|
Term
probability |
|
Definition
a measure of the chance of getting some outcome of interest from some event |
|
|
Term
On what scale do we measure probability? |
|
Definition
Between 0 and 1: 0 means the event is impossible, 1 means it is inevitable |
|
|
Term
If the probability of an event happening is p, what is the probability of the event not happening? |
|
Definition
1 – p |
|
|
Term
How can you calculate the probability of an event happening? |
|
Definition
it is the number of outcomes that favor (aka "fulfill") that event, divided by the total number of possible outcomes. |
|
|
Term
proportional frequency approach |
|
Definition
a method of calculating probability which uses existing frequency data as the basis for the calculation (needed because outcomes are not always equally likely, as they are when flipping a coin) |
|
|
Term
If data is Normally distributed, what percentage of values will lie no further than two standard deviations from the mean? |
|
Definition
about 95% |
|
|
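A quick numerical check of the two-standard-deviation rule, as a minimal sketch using Python's standard library. The helper name `fraction_within` is an illustrative assumption, not something from the source.

```python
from statistics import NormalDist

# Standard Normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def fraction_within(k: float) -> float:
    """Fraction of Normally distributed values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_two_sd = fraction_within(2)       # a little over 95%
within_1_96_sd = fraction_within(1.96)   # almost exactly 95%
```

The "two standard deviations" figure is a convenient rounding of 1.96, which is why the rule of thumb is quoted as 95%.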
Term
risk |
|
Definition
the same thing as probability; "risk" is the term favored in the clinical arena |
|
|
Term
absolute risk |
|
Definition
the risk for a single group (terminology distinguishes it from relative risk) |
|
|
Term
relative risk (aka risk ratio) |
|
Definition
the risk for one group compared to the risk for some other group |
|
|
Term
How can you calculate the "odds" of an event happening? |
|
Definition
Odds are equal to the number of outcomes favorable to the event divided by the number of outcomes not favorable to the event. Example: with 55 blue and 40 green, 55/40 = 1.375, so the odds are 1.375 to 1 for blue. |
|
|
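The blue/green example can be reproduced in a few lines. This is a minimal sketch; the function names `odds` and `probability` are illustrative assumptions.

```python
def odds(favorable: int, unfavorable: int) -> float:
    """Odds = favorable outcomes / unfavorable outcomes."""
    if unfavorable == 0:
        raise ValueError("odds are undefined when no outcome is unfavorable")
    return favorable / unfavorable


def probability(favorable: int, total: int) -> float:
    """Probability = favorable outcomes / all possible outcomes."""
    if total <= 0:
        raise ValueError("total must be positive")
    return favorable / total


blue, green = 55, 40
odds_blue = odds(blue, green)                 # 55/40 = 1.375, i.e. 1.375 to 1
prob_blue = probability(blue, blue + green)   # 55/95, a different number from the odds
```

The same counts give different values for odds and probability because the denominators differ: unfavorable outcomes for odds, all outcomes for probability.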
Term
What are the main differences between risk/probability and odds? |
|
Definition
- The range of risk is between 0 and 1; the range of odds is between 0 and infinity.
- When the odds are < 1, the odds are unfavorable to the outcome.
- When the odds = 1, the event is as likely to happen as not to happen.
- When the odds are > 1, the odds are favorable to the outcome. |
|
|
Term
reference value |
|
Definition
Odds in health statistics are expressed as "something" to one. This value of one is called the reference value. |
|
|
Term
The connection between probability and odds gives us the ability to do what? |
|
Definition
Derive one from the other: risk (or probability) = odds/(1 + odds); odds = probability/(1 – probability) |
|
|
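The two conversion formulas translate directly into code. A minimal sketch, with function names chosen here for illustration:

```python
def odds_to_probability(odds: float) -> float:
    """probability = odds / (1 + odds)"""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


def probability_to_odds(p: float) -> float:
    """odds = probability / (1 - probability)"""
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1) for finite odds")
    return p / (1 - p)


# Odds of 1.375 to 1 (55 blue vs 40 green) correspond to a probability of 55/95.
p_blue = odds_to_probability(1.375)
# A probability of 0.5 corresponds to even odds (1 to 1).
even_odds = probability_to_odds(0.5)
```

Applying one function and then the other should return the starting value, which is a handy sanity check.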
Term
How can you calculate relative risk (aka risk ratio)? |
|
Definition
divide the risk for one group (usually the one exposed to the risk) by the risk for the second, non-exposed group |
|
|
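A minimal sketch of the relative risk calculation. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are also assumptions.

```python
def risk(events: int, group_size: int) -> float:
    """Absolute risk for one group = number with the outcome / group size."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return events / group_size


def relative_risk(risk_exposed: float, risk_unexposed: float) -> float:
    """Risk in the exposed group divided by risk in the non-exposed group."""
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is 0")
    return risk_exposed / risk_unexposed


# Hypothetical data: 30 of 100 exposed people and 10 of 100 unexposed people
# develop the condition.
risk_exposed = risk(30, 100)
risk_unexposed = risk(10, 100)
rr = relative_risk(risk_exposed, risk_unexposed)  # about 3: exposure triples the risk
```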
Term
How can you calculate odds ratio? In what type of study would you find this? |
|
Definition
Divide the odds that those with a disease were exposed to the risk factor by the odds that those without the disease were exposed. It is found in a case-control study (e.g. the odds that those with a stroke had exercised is 0.78; the odds that those without a stroke had exercised is 1.97; the odds ratio is 0.78/1.97 = 0.40). |
|
|
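The stroke and exercise example works out as follows. A minimal sketch; the function name `odds_ratio` is an illustrative assumption.

```python
def odds_ratio(odds_exposure_cases: float, odds_exposure_controls: float) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if odds_exposure_controls == 0:
        raise ValueError("odds ratio is undefined when the control odds are 0")
    return odds_exposure_cases / odds_exposure_controls


# Figures from the card: odds of having exercised are 0.78 among those with a
# stroke and 1.97 among those without.
stroke_exercise_or = odds_ratio(0.78, 1.97)
rounded_or = round(stroke_exercise_or, 2)  # 0.4
```

An odds ratio below 1 here means exercise was less common among those who had a stroke than among those who did not.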
Term
One cannot calculate risk in what type of study? Why? What can you use instead? |
|
Definition
A case-control study (however, the odds ratio is a reasonably good estimate).
In a case-control study you don't select people on the basis of whether they have been exposed to the risk, but on the basis of whether they have some condition (e.g. a stroke) or not. BOTH groups will contain individuals who were and were not exposed to the risk (see pg 103) |
|
|
Term
What does NNT stand for, and what is it? |
|
Definition
Number needed to treat. NNT is the number of patients who would need to be treated with the active procedure, rather than a placebo (or alternative procedure), in order to reduce by one the number of patients experiencing the condition |
|
|
Term
What is ARR and what does it stand for? |
|
Definition
Absolute risk reduction. The difference between two absolute risks (e.g. the reduction in risk gained by weighing more than 18 lbs at one year rather than weighing 18 lbs or less) |
|
|
Term
What is the relationship between NNT and ARR? |
|
Definition
NNT = 1/ARR |
|
|
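NNT is the reciprocal of ARR, which can be sketched as below. The risks used are hypothetical, and rounding up to a whole patient is a common convention assumed here rather than something stated on the card.

```python
import math


def absolute_risk_reduction(risk_control: float, risk_treated: float) -> float:
    """ARR = absolute risk without treatment minus absolute risk with treatment."""
    return risk_control - risk_treated


def number_needed_to_treat(arr: float) -> float:
    """NNT = 1 / ARR (only meaningful when the treatment reduces risk)."""
    if arr <= 0:
        raise ValueError("ARR must be positive for NNT to be meaningful")
    return 1 / arr


# Hypothetical risks: 25% with placebo, 12.5% with the active treatment.
arr = absolute_risk_reduction(0.25, 0.125)   # 0.125
nnt = number_needed_to_treat(arr)            # 8.0
nnt_whole_patients = math.ceil(nnt)          # round up to a whole patient: 8
```

The smaller the absolute risk reduction, the larger the number of patients who must be treated to prevent one case.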
Term
confidence interval estimator |
|
Definition
a numeric expression that quantifies the likely size of the sampling error |
|
|
Term
the mean of all possible sample means is the same as what? |
|
Definition
the population mean |
|
|
Term
standard deviation |
|
Definition
a measure of the spread of the data in a SINGLE sample |
|
|
Term
What is this equation used to do? s.e.(x̄) = s/√n |
|
Definition
estimate the standard error |
|
|
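A minimal sketch of the standard error estimate s/√n, using the sample standard deviation from Python's `statistics` module. The sample values are hypothetical and the function name is an assumption.

```python
import math
import statistics


def standard_error_of_mean(sample: list[float]) -> float:
    """Estimate s.e.(mean) as s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical sample of six measurements.
sample = [4, 8, 6, 5, 3, 7]
se = standard_error_of_mean(sample)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.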
Term
standard error |
|
Definition
a measure of the preciseness of the sample mean as an estimator of the population mean (smaller is better) |
|
|
Term
confidence interval (equation) |
|
Definition
from (sample mean – 2 × s.e.(x̄)) to (sample mean + 2 × s.e.(x̄)). If you pick one out of all the possible sample means at random, there is a probability of 0.95 that it will lie within two standard errors of the population mean; this gives the 95% confidence interval estimate. |
|
|
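The interval on this card can be sketched as follows, using the card's multiplier of 2 (a rounding of 1.96; small samples would strictly call for a t multiplier). The sample values and the function name are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample: list[float], multiplier: float = 2.0) -> tuple[float, float]:
    """Approximate 95% CI: sample mean +/- multiplier * s.e.(mean)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # s / sqrt(n)
    return mean - multiplier * se, mean + multiplier * se


# Hypothetical sample of six measurements (mean 5.5).
sample = [4, 8, 6, 5, 3, 7]
low, high = mean_confidence_interval(sample)
```

The interval is centered on the sample mean, and its width is four standard errors.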
Term
A confidence interval is said to represent what? |
|
Definition
a plausible range of values for the population parameter |
|
|
Term
Make sure you know how to calculate the confidence interval for a population proportion using the equation, pg 116-117 |
|
Definition
|
|
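The textbook equation is not reproduced on this card, so the sketch below assumes the standard large-sample (Normal approximation) interval, p ± 1.96 × √(p(1 – p)/n), which may differ in detail from the one on pg 116-117. The counts and the function name are hypothetical.

```python
import math


def proportion_confidence_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% CI for a population proportion (Normal approximation)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n                      # sample proportion
    se = math.sqrt(p * (1 - p) / n)        # standard error of the proportion
    return p - z * se, p + z * se


# Hypothetical sample: 40 of 200 people have the characteristic (p = 0.2).
low, high = proportion_confidence_interval(40, 200)
```

This approximation is usually considered reasonable only when the sample is fairly large and the proportion is not too close to 0 or 1.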
Term
What is the most common application of confidence intervals? |
|
Definition
the comparison of two population parameters, for example between the means of two populations, such as the mean age of a population of women and the mean age of a population of men |
|
|
Term
Name the prerequisites for the two-sample t test |
|
Definition
- Data for both groups must be metric.
- The distribution of the relevant variable in each population must be reasonably Normal.
- The population standard deviations of the two variables concerned should be approximately the same, but this requirement becomes less important as sample sizes get larger. |
|
|
Term
What can you do if you want to know if there is a statistically significant difference between two population means? |
|
Definition
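Calculate the 95% confidence interval for the difference and see if it contains zero. If it does NOT contain zero, you can be 95% confident that there is a statistically significant difference in the means. If it does contain zero, a difference of zero is plausible and the result is not statistically significant. |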
|
|
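A rough sketch of the check: build an interval for the difference in means and see whether it excludes zero. It uses mean difference ± 2 standard errors as a large-sample approximation; a proper two-sample t interval would use a t multiplier and, under the equal-standard-deviation assumption, a pooled standard deviation. The data and function name are hypothetical.

```python
import math
import statistics


def difference_in_means_ci(a: list[float], b: list[float], multiplier: float = 2.0) -> tuple[float, float]:
    """Approximate 95% CI for mean(a) - mean(b), using an unpooled standard error."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff - multiplier * se, diff + multiplier * se


# Hypothetical measurements for two independent groups.
group_a = [10, 12, 11, 13, 12, 11]   # mean 11.5
group_b = [15, 17, 16, 18, 16, 17]   # mean 16.5
low, high = difference_in_means_ci(group_a, group_b)

# Significant at roughly the 5% level only if the interval excludes zero.
significant = not (low <= 0 <= high)
```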
Term
For what do we use the two-sample t test? |
|
Definition
estimating the difference in the means of two independent populations |
|
|
Term
Mann-Whitney test |
|
Definition
- Used in place of the two-sample t test.
- Compares population medians rather than means.
- Only requires that the two population distributions have approximately the same shape; it does not require either to be Normal.
- It is the non-parametric equivalent of the two-sample t test. |
|
|
Term
parametric test |
|
Definition
can be applied only to data that are metric and that follow some particular distribution, most commonly the Normal distribution (a non-parametric test makes no distributional requirements) |
|
|
Term
Briefly describe the Mann-Whitney method |
|
Definition
- Starts by combining the data from both groups, which are then ranked.
- The rank values for each group are then separated and summed.
- If the medians of the two groups are the same, then the sums of the ranks of the two groups should be similar. However, if the rank sums are different, you need to know whether this difference could simply be due to chance, or whether there really is a statistically significant difference in the population medians (decide using a confidence interval). |
|
|
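The ranking and summing steps can be sketched as below. This covers only the rank sums, not the significance decision; giving tied values the average of their ranks is a common convention assumed here, and the data and function name are hypothetical.

```python
def rank_sums(group_a: list[float], group_b: list[float]) -> tuple[float, float]:
    """Combine both groups, rank the values (ties share the average rank),
    then return the sum of ranks for each group."""
    combined = [(value, "a") for value in group_a] + [(value, "b") for value in group_b]
    combined.sort(key=lambda pair: pair[0])

    sums = {"a": 0.0, "b": 0.0}
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their average is (i + 1 + j) / 2.
        average_rank = (i + 1 + j) / 2
        for _, label in combined[i:j]:
            sums[label] += average_rank
        i = j
    return sums["a"], sums["b"]


# Hypothetical data; the value 5 appears in both groups and shares rank 2.5.
sum_a, sum_b = rank_sums([3, 5, 8], [5, 9, 12, 14])
```

The two rank sums always add up to n(n + 1)/2 for n observations in total, which is a useful check on the ranking.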
Term
When should one use the Wilcoxon test? |
|
Definition
- When the two groups are matched
- With ordinal data or skewed metric data
- You can obtain confidence intervals for differences in population medians based on this test
- It is non-parametric |
|
|
Term
What does the ratio of two independent population means tell you? |
|
Definition
It tells you how many times bigger one population mean is than the other. If the sample ratio differs from 1, you need to find out whether that is due to chance or whether the difference is statistically significant. |
|
|
Term
If the confidence interval for the ratio of two population parameters does not contain the value 1, then... |
|
Definition
you can be 95% confident that any difference in the size of the two measures is statistically significant. |
|
|