Term
the number of observations from the sample that are in each category of the variable we are studying. |
|
Definition
|
|
Term
the number of observations from the sample that we would expect to be in each category if the null hypothesis is true. |
|
Definition
|
|
Term
___ is useful in assessing specific forms of the relationship between variables (model equation), and prediction and estimation of one variable corresponding to a given value of another. |
|
Definition
|
|
Term
The ultimate objective of regression is to ___ or ___ the value of one variable (Y) based on the value of another variable (X). |
|
Definition
|
|
Term
___ is used to describe the strength of the linear relationship (or degree of the correlation) between variables. |
|
Definition
|
|
Term
The ___ ___ (r) measures the strength of the linear relationship between two quantitative variables. |
|
Definition
|
|
Term
It doesn’t matter which one you designate as x or y in ___ , but it matters in ___ |
|
Definition
|
|
Term
If we want to know the ___ of linear association between 2 variables, the correlation coefficient is useful |
|
Definition
|
|
Term
___ is a measure of how much two random variables change together. The sign of the ___ shows the tendency in the linear relationship between the variables. |
|
Definition
|
|
Term
What factors affect the size of r? |
|
Definition
-If x and y are nonlinearly related -If there is a restricted range -Extreme scores -Combining groups |
|
|
Term
-r≈0 indicates ___ linear correlation between x and y. -r<0 indicates a ___ or ___ relationship between x and y; as x increases y decreases, or vice versa. -r>0 indicates a ___ relationship between x and y; as x increases y increases. |
|
Definition
no; negative; inverse; positive |
|
|
Term
ρ = true correlation coefficient in ___ |
|
Definition
|
|
Term
Key word(s) for a regression problem are ___ or ___ |
|
Definition
|
|
Term
The ___ the difference between y and y ̂ , the closer the line comes to the actual data point. |
|
Definition
|
|
Term
the best fitting straight line is the one that minimizes the sum of the squared deviation between the observed values and the predicted values. |
|
Definition
|
|
Term
The proportion of the variance of Y explained by X. |
|
Definition
Coefficient of determination |
|
|
Term
This test is used to test the hypothesis that an observed frequency distribution fits some claimed distribution. |
|
Definition
|
|
Term
This tests the null hypothesis that there is no association between the row variable and the column variable in a contingency table. |
|
Definition
|
|
Term
True/False If we conclude that two variables are dependent we can determine a cause and effect link between the two variables. |
|
Definition
|
|
Term
In this kind of test we test the claim that different populations have the same proportions of some characteristics. |
|
Definition
|
|
Term
_____________ is the proportion of Y variance associated with X variance |
|
Definition
Coefficient of determination |
|
|
Term
A correlation coefficient of 0.8 indicates ___ linear relationship. (negative or positive). |
|
Definition
|
|
Term
If the test statistic falls in the ___ ___ you would reject the null hypothesis, and the p-value is smaller than the level of ___ |
|
Definition
rejection region; significance |
|
|
Term
What is the probability that the test statistic falls into the rejection region when the null hypothesis is correct? |
|
Definition
alpha (probability of committing type I error) |
|
|
Term
If the distribution is positively skewed is the median or mean larger? |
|
Definition
|
|
Term
For each variable listed below label what kind of variable it is: -Number of patients seen in a day -Presence of diabetes -Height (ft) -Blood glucose level |
|
Definition
-Discrete -Nominal -Continuous -Continuous |
|
|
Term
What test would you use for this: A researcher was interested in knowing if patients receiving instructions to practice mental imagery in addition to their regular rehabilitation would differ in their outcome scores at the end of therapy than those patient not instructed to practice mental imagery. A variance ratio test was conducted with a resulting p-value of .25. Given the sample mean scores and standard deviations for the two groups, is there sufficient evidence to conclude that the mean scores on the outcome measure will differ between those patients practicing mental imagery and those who do not? |
|
Definition
We are given samples and variance ratio values. We want to do a t test for 2 independent means. We will do the one with equal variances because p-value is greater than alpha. |
|
|
Term
What test would you use for this: Smith et al. investigated the daily consumption of fruits and vegetables by 4th grade children in Oklahoma City and Tulsa. Categories of consumption were 5 or more servings, 3-4 servings, and 0-2 servings. Can we conclude on the basis of these data that 4th graders in the two cities differ with respect to their levels of daily consumption of fruits and vegetables? |
|
Definition
Want to compare 2 or more samples. We are given categorical data. Frequency/proportion data. Leaves us with Chi-square. Which one would we use? Asking if 2 samples DIFFER in respect to a specific characteristic so we would use the test of Homogeneity. |
|
|
Term
What test would you use for this: According to the American Association of Orthodontists, the approximated distribution of malocclusions in the U.S. is 60% Class I, 30% Class II, and 10% Class III. A sample of patients at the OUHSC College of Dentistry Orthodontics Department is classified according to one of the three types of malocclusions. How well do the OUHSC data and U.S. population data fit? |
|
Definition
Comparing one sample to a US population. Given categorical data. Compare sample data to hypothesized distribution so we would use a goodness of fit test. |
|
|
Term
What test would you use for this: A school principal was interested in comparing three methods to reduce tardiness. A test was used to measure tardiness. A high score indicates frequent tardiness. 75 students were assigned into 3 groups of 25. Each group was disciplined using one of the methods above. After three months of the appropriate discipline, the test was performed to determine the amount of tardiness. Is there sufficient evidence to determine if there are differences among the mean scores for the three methods? Use a 0.05 significance level. |
|
Definition
Compare sample means of 3 groups. We would use ANOVA for independent samples. |
|
|
Term
What test would you use for this: Due to the increase in adult onset asthma, a study was conducted to examine the efficacy and safety of a new bronchodilator medication. Subjects were men over the age of 25 experiencing asthma-like coughing and wheezing. Data collected included pre- and post-medication spirometry lung capacity levels. Is there sufficient evidence at a .05 significance level to indicate the medication was successful at increasing functional lung capacity? |
|
Definition
Pre and post collections. Continuous data. Use a paired t-test between paired means. |
|
|
Term
What test would you use for this: An employer offered 2 different HMOs to its employees and scheduled a meeting during open enrollment for its employees to learn about the HMOs being offered. One of the HMOs conducted a survey of a random sample of 25 persons who attended the presentation. The individuals were surveyed immediately prior to the meeting and immediately following the meeting to see if there had been a change in their stated preferences for HMO coverage. The results were used to determine if the preference was altered by the presentation at an alpha of .05. |
|
Definition
Pre and post measures. Categorical data. So we would use the McNemar Chi-square test. |
|
|
Term
What test would you use for this: Just prior to the jumping into freezing water by the polar bear club, a sample of male and female participants were tested to find out their level of fear of the freezing water. The sample sizes and variances calculated from the data were as follows: Males: n = 24, s2 = 235 Females: n = 29, s2 = 276 Do these data provide enough evidence to show that the scores are more variable for females. |
|
Definition
Two samples. Compare samples to see if one is more variable or not. So we would use the variance ratio test. |
|
|
Term
What test would you use for this: A child psychiatrist wanted to examine the effect of music therapy on patients with ADHD. Four different music therapy treatments were to be evaluated. 15 patients with ADHD were randomly assigned to one of the four treatment groups. Is there sufficient evidence at the 0.05 level to determine if there are differences among the mean scores for the four methods? |
|
Definition
Comparing means of 4 different samples. We would use ANOVA for independent samples. |
|
|
Term
What test would you use for this? [image] |
|
Definition
Comparing means between 2 samples. T-test for independent samples with unequal variances because one of the SD's is twice as big when you square them. |
|
|
Term
What test would you use for this? Suppose we are studying the association if any, between the age of mother and the birth weight of her offspring (in grams). The age of the mother is grouped into two levels: those less than or equal to 20 years and those above 20 years. The birth weight is split into those less than or equal to 2500 gm and those greater than 2500 gm. Do the data provide sufficient evidence to suggest that age of mother and birth weight are related? |
|
Definition
Compare 2 groups of maternal age and offspring birth weight. Categorical data. Chi-square: when you read related you can associated dependent so you would use chi square test of independence. |
|
|
Term
What test would you use for this? In a survey of high school juniors and seniors, respondents were asked to identify their political preference as either ‘liberal’, ‘moderate’, or ‘conservative’. Can we conclude that the two populations differ with respect to their political preference? |
|
Definition
Comparing two samples. Categorical data. Use test of homogeneity because it asks if the populations differ based on a certain characteristic. |
|
|
Term
What test would you use for this? A study is analyzing the effect of a new drug on estrogen blood levels. Of fifty selected women, 25 are randomly allocated to receive the drug while the remaining 25 are to receive a placebo. Estrogen levels are measured following two weeks of treatment. The mean estrogen level of the study group was 125 ug/ml with a standard deviation of 12. The placebo group had a mean estrogen level of 98 ug/ml with a standard deviation of 7. Estimate the mean difference of the two populations. |
|
Definition
Comparing the mean of 2 samples. Asking us to "estimate" gives us a hint to use confidence interval. CI around a difference between 2 independent means with unequal variances. |
|
|
Term
In a study to measure nursing response time, 2 populations of nurses were timed in response to a cardiac event within a hospital setting. 25 intensive care nurses were studied along with 30 emergency department nurses. Is the variance in response time different between the 2 groups, at a .05 significance level? |
|
Definition
Asking for variance in response time so use variance ratio test. |
|
|
Term
What test: Researchers studied infants with normal deliveries. Birth weights and levels of the hormone AGX-5 in 26 cases were examined. The researcher is interested in predicting hormone levels based on birth weight. Is there sufficient evidence to indicate that there is a linear relationship between weight and AGX-5 levels? |
|
Definition
"Linear relationship" gives you a hint that you are either using correlation coefficient or regression, but the word "predicting" indicates regression. So use F test for linear regression. |
|
|
Term
What test: 108 juvenile offenders were surveyed. 48% reported a history of physical abuse. On the basis of these findings can we conclude that the general juvenile offender population has more than 40% having had a history of physical abuse. |
|
Definition
Given one sample and proportions. Want to know about the offender population so we would use a z-test for a single proportion. |
|
|
Term
The term "equally" cues you to use what kind of test? |
|
Definition
|
|
Term
What test: Women’s Fitness magazine tracked the number of self reported over-eating episodes per day of 100 women during a one-week period. At the 0.05 significance level, test the claim that women over-eat equally on all days of the week. |
|
Definition
Categorical frequency data. Test the claim that women over-eat equal on all days of the week. We want to use a goodness of fit test. |
|
|
Term
What test: One of the measures used in a study on the psychosocial impact of HIV/AIDS on a sample of residents living along the US/Mexico border regards barriers to obtaining HIV medical care. Researchers asked participants whether they were concerned that people at the agency providing HIV medications would not speak their language and whether they planned to receive HIV services. Is there an association between language barrier concern and receiving HIV services? |
|
Definition
Categorical data. Want to use a chi-square. Want to determine if there is an association or dependent between them so we use test of independence. |
|
|
Term
What test: A study was conducted on an Air Force Base to determine seatbelt use among military members by rank. Seatbelt use was recorded as members entered the gate for the following rank classes: Airmen, Noncommissioned Officers, Senior Non-commissioned Officers, Field Grade Officers, and Company Grade Officers. Can one conclude on the basis of the data that seatbelt use is equally distributed over the 5 rank classes? |
|
Definition
5 groups. Categorical data. Chi-square. Are they equally distributed? So we would use goodness of fit. |
|
|
Term
What test: A new drug on the market is supposed to reduce the cholesterol of patients within several weeks. Patients were tested for cholesterol levels and then re-tested again after 2 months. Estimate the mean difference in cholesterol level over the two month period. |
|
Definition
"Estimate"=confidence interval. Pre and post measures. Dependent data. CI are 2 dependent means. |
|
|
Term
What test: It is hypothesized that pregnant women who walk one mile a day will have a labor time of six hours or less. Two hundred thirty-two pregnant women in their first trimester entered the study from across the United States and were followed through their pregnancy. One hundred sixteen walked a mile or more a day while the others did not. Labor time was determined for each mother after birth. Is there a difference in the population mean labor time between those who walked one mile per day and those who did not? |
|
Definition
Compare 2 sample means. T test of independent means. Not given enough info to determine equal or unequal variances. |
|
|
Term
What test: PSA levels were measure in a sample of 30 men following 3 rounds of Lupron© injections. The sample mean was 2.6 and the sample standard deviation was 4.5. At a 0.1 significance level, can on conclude that the population mean is higher than 3? |
|
Definition
Asking for population mean. Test of single mean. T-test for single mean. |
|
|
Term
What test: The National Center for Drug Abuse conducted a study to determine if heroin usage among teenagers has changed. Approximately 1.2% of U.S. teenagers between the ages of 15 and 19 have used heroin one or more times. A recent survey of 2,438 teenagers indicated 48 had used heroin one or more times. At the .05 significance level is there evidence to conclude that more than 1.2% teenagers are using heroin? |
|
Definition
z test for one proportion. |
|
|
Term
130 children under the age of 10 born to HIV infected women are sampled. Their CD-4 count is taken and the mean CD-4 count is 180 cells per cubic millimeter of blood and the sample standard deviation is 8.5. 180 children under age 10 born to women without HIV/AIDS are sampled. Their mean CD4 count is 700 cells per cubic millimeter of blood and the sample standard deviation is 8.9. Is the population mean CD4 count between children born to HIV infected mothers and those born to mother not infected with HIV different? |
|
Definition
2 samples. Ratio data. Comparing means. T test between independent means. Are the variances equal? Equal. |
|
|
Term
A sample of 211 people were polled about their gender and whether they had a bachelors degree. A researcher would like to know if there is an association between gender and having a bachelors degree. Use a .05 significance level. |
|
Definition
Categorical data. Wants to know if there is an association so we use chi-square test of independence. |
|
|
Term
What test: Data was collected on a sample of 531 minority men ages 20-40. The variable of interest was SBP. 147 minority men were diagnosed with high SBP. High SBP estimates for white men ages 20-40 are 14.3%. Is there enough evidence to suggest that minority males ages 20-40 have a prevalence of high SBP greater than 14.3%? Let alpha = .05. |
|
Definition
Given one sample and a proportion. Compare the proportions. Z test for one proportion. |
|
|
Term
What test: Stress ECG tests were performed for a group of males with Heart Disease and without Heart Disease. Out of the 100 patients with heart disease 70 developed ventricular fibrillation and out of the 100 without the disease 10 developed ventricular fibrillation (VF). Can we conclude that the proportion who develop VF is greater among those with the disease than those without the disease? α=0.05. |
|
Definition
2 samples. Comparing proportions. Use a z-test for 2 independent proportions. |
|
|
Term
What test: In a recent survey, 900 randomly selected teenagers (ages 13-16) were asked “Do you have a good relationship with both of your parents?” 47 percent of the subjects said that they had a good relationship with both parents. Consider a hypothesis test that uses an alpha of .05 to test the claim that more than 40 percent of teenagers (ages 13-16) have a good relationship with both of their parents. |
|
Definition
Categorical. Given proportional data. Z test for 1 proportion. |
|
|
Term
Assume that we wish to test whether two skin lotions are equally effective at relieving a poison ivy rash. The two lotions are tested on 40 persons with poison ivy rash on arms, applying one lotion to one arm and the other lotion to the second arm. The relief scores (range: 1 to 100) are collected for each arm. Do these data provide sufficient evidence, at the 0.05 level of significance, to indicate that in general the maximum relief is lower for lotion 1? |
|
Definition
Dependent data because they are applying one lotion to one arm and the other lotion to the other arm of the same patient. Paired t test because it's not categorical data. |
|
|
Term
What test: The American Apple Growers Association (AAGA) wanted to examine the strength of the relationship between the number of days on which a single apple was eaten and the number of doctor visits during early childhood. Parents of 200 hundred pre-school children participated in a 3-year study by recording data on the apple consumption and doctor visits of their children. Is there sufficient evidence to indicate that there is a linear relationship between apple consumption and doctor visits? |
|
Definition
Asking for linear relationship. "Strength of the relationship" indicates correlation coefficient. |
|
|
Term
What test: The mean weight of 30 soccer players is 155 pounds with a standard deviation of 1.2. A sample of 25 football players has a mean weight of 202 pounds with a standard deviation of 4. Is there sufficient evidence to conclude at the 0.05 level of significance that the football players have a higher mean weight than the soccer players? |
|
Definition
2 sample means. Does one have a higher mean? Ratio data. T-test for independent means. Are variances equal? Unequal variance. |
|
|
Term
What test: The Zortian method is a treatment for brain tumors using traditional chemotherapy and radiation in combination with a drug called Zortican. Dr. Manzetti treated 9 patients suffering from brain tumors using the Zortian method and then used traditional MRI technology to assess the reduction in size of the tumors 48 hours after treatment. The pre and post treatment data was collected. Do the results indicate there is sufficient evidence to conclude that the Zortian method improves tumor size? Assume the difference in tumor size is normally distributed. |
|
Definition
Dependent data. Compare means so paired t test for dependent means. |
|
|
Term
What test: On a medical history form, patients were asked to indicate their gender and how often they consumed alcohol in days/month. A variance ratio test between male and females was conducted and resulted in a p=.0003. Can we conclude that males and females differ with respect to how often they consume alcohol? |
|
Definition
Ratio data-recorded in days per month. T test for independent means. Unequal variances because p-value is less than alpha. |
|
|
Term
What test: It is known that women have a mean and standard deviation of 10 minutes and 2.2 minutes respectively for a one mile run test. A sample of 30 women indicates a mean of 10.2 minutes. Test the claim that the sample comes from a population with a mean that is different from 10 using and alpha of 0.05. |
|
Definition
One sample and population mean. Z test for one mean |
|
|