Term
|
Definition
| participants divided into groups at random |
|
|
Term
|
Definition
| trait or characteristic with two or more categories |
|
|
Term
| Independent variable (IV) |
|
Definition
| stimulus or input ("CAUSE"). Researchers usually physically manipulate the independent variable, i.e., physically administer the treatment. (In non-experimental studies, researchers do not physically manipulate the IV; they observe it as it occurs naturally.) |
|
|
Term
| Dependent variable (DV) |
|
Definition
| outcome ("EFFECT"). What you measure |
|
|
Term
|
Definition
| Degree to which measures actually measure what they intend to measure. Looks at the test |
|
|
Term
|
Definition
| if it only assesses the affective dimension of depression but fails to take into account the behavioral dimension. |
|
|
Term
|
Definition
| ability to generalize beyond sample and conditions that yielded the findings, to the population. Directly tied to sampling. |
|
|
Term
|
Definition
| to what extent does the test predict the outcome it is supposed to predict? |
|
|
Term
| Correlation (validity) coefficient |
|
Definition
| 0.00 (no relationship) to 1.00 (perfect validity) |
|
|
Term
| Construct validity |
|
Definition
| Construct validity refers to whether a scale measures or correlates with the theorized psychological scientific construct (e.g., "fluid intelligence") that it purports to measure. |
|
|
Term
|
Definition
| A test can be reliable and not valid, but it cannot be valid and unreliable. Reliability is for scores and validity is for tests. |
|
|
Term
| Inter-observer reliability coefficients |
|
Definition
| agreement between observers; r between two different raters' scores |
|
|
Term
|
Definition
| measure at two points in time, using the exact same form. Want to show that what is scored is reliable: make sure people are answering the same way the whole time. Do not want a difference in scores. |
|
|
Term
| parallel-forms reliability |
|
Definition
| Same content, different form. 2 parallel forms, interchangeable, different items that cover the same content. |
|
|
Term
|
Definition
| use scores from a single administration of a test to examine the consistency of test scores. Items are consistent with each other. A person who answers a certain way on any item is likely to answer in the same way on other items in the same scale |
|
|
Term
|
Definition
| Score the test as though it consisted of two separate tests (odd-even split). Interval |
|
|
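The odd-even split described in the card above can be sketched as follows. This is a minimal illustration, not a standard library routine: the data, the helper names `pearson_r` and `split_half_r`, and the choice to correlate the two half-test totals with Pearson r are all assumptions made for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def split_half_r(item_scores):
    """Correlate odd-item totals with even-item totals (one row per examinee)."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical data: 5 examinees x 6 items, each scored 0 or 1.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half = split_half_r(scores)
print(round(r_half, 2))
```

A value near 1.00 suggests the two halves rank examinees consistently.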
Term
|
Definition
| Items are consistent with each other; tests for internal consistency. Single administration of a test; math is done to obtain the equivalent of the average of all possible split-half reliability coefficients. Not attributed over time. Most common. Shoot for a high score. |
|
|
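The card above leaves its term blank. Assuming it refers to Cronbach's alpha, the "math" can be sketched with the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the ratings below are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Internal-consistency estimate from one administration (one row per examinee)."""
    k = len(item_scores[0])                      # number of items
    item_columns = list(zip(*item_scores))       # regroup scores by item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: 5 examinees x 3 items rated 1-5.
ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [2, 2, 1],
    [5, 4, 5],
    [1, 2, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

When every item gives identical scores for each examinee, the formula returns 1.0, which matches the "shoot for a high score" advice.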
Term
|
Definition
| tests designed to facilitate a comparison of an individual's performance with that of a norm group (percentile rank). Meant to be of medium difficulty. |
|
|
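A percentile rank, as mentioned in the card above, can be illustrated with a short sketch. It assumes one common definition (the percentage of the norm group scoring below the examinee); other definitions also count half of the tied scores. The norm-group scores are made up.

```python
def percentile_rank(score, norm_group):
    """Percentage of norm-group scores that fall below the given score."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)

# Hypothetical norm group of 10 examinees.
norm_group = [55, 60, 62, 68, 70, 71, 75, 80, 88, 93]
rank = percentile_rank(75, norm_group)
print(rank)  # 6 of the 10 scores are below 75 -> 60.0
```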
Term
| Criterion-referenced tests |
|
Definition
| measures the extent to which individual examinees have met performance standards. Difficulty not a concern. |
|
|
Term
| Pretest-posttest randomized control group design |
|
Definition
| if the experimental group differs, then the difference is attributed to either the treatment or error. WANT A DIFFERENCE |
|
|
Term
|
Definition
| degree to which results of study can be attributed to treatments or other independent variables. |
|
|
Term
| Threats to internal validity |
|
Definition
1. History: environmental influences (e.g., events that occur during the research)
2. Maturation: subjects matured
3. Testing effects: things learned from the pretest influenced later behavior
4. Ceiling effects: many scores are near the maximum possible; may not detect true differences
5. Floor effects: many scores are near the minimum possible; may not detect true differences
6. Instrumentation: changes in measurement; groups take tests under different conditions
|
|
Term
|
Definition
| 2 groups, not randomly selected |
|
|
Term
|
Definition
| attention effect. when the subjects know they are being studied |
|
|
Term
|
Definition
| control group may try to outperform the experimental group |
|
|
Term
|
Definition
| person dispensing drug doesn't know either |
|
|
Term
|
Definition
| participants know what experimenters are looking for |
|
|
Term
|
Definition
| does not mean meaningful significance. |
|
|
Term
|
Definition
| examine relationships between groups |
|
|
Term
|
Definition
| interval data, distributed normally in population |
|
|
Term
|
Definition
| summarize data so that it can be easily comprehended |
|
|
Term
|
Definition
| displays how scores are distributed. |
|
|
Term
|
Definition
| help researchers draw inferences about the effects of sampling errors on the results that are described with descriptive statistics. Helps researchers make generalizations about the characteristics of populations on the basis of data obtained by studying samples. Used to infer from our sample to the population. Generalize. 3 Types: Chi-Square, t-test, ANOVA |
|
|
Term
|
Definition
| help readers interpret results in light of sampling error. |
|
|
Term
|
Definition
| likelihood that a finding in the sample exists in the population. Any difference is due to real influence, rather than chance or sampling error. |
|
|
Term
|
Definition
| helps researchers decide whether the differences in descriptive statistics they identify are reliable. Determine the probability that the null hypothesis is true. |
|
|
Term
|
Definition
| helps distinguish between values obtained from sample and values obtained from a census |
|
|
Term
|
Definition
| test of the null hypothesis based on differences between frequencies. Categorical data are analyzed. Can be used for treatments also (post-test looks just like pre-test) |
|
|
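Assuming the card above describes the chi-square test, the statistic compares observed frequencies with the frequencies expected under the null hypothesis: the sum of (O - E)^2 / E over categories. The counts below are hypothetical, and 3.841 is the standard critical value for df = 1 at the .05 level.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 50 participants choose between two options.
observed = [30, 20]
expected = [25, 25]          # null hypothesis: no preference
chi2 = chi_square(observed, expected)
df = len(observed) - 1       # one-way classification: categories minus 1

CRITICAL_05_DF1 = 3.841      # table value for df = 1, alpha = .05
reject_null = chi2 > CRITICAL_05_DF1
print(chi2, df, reject_null)
```

Here the statistic falls below the critical value, so the null hypothesis is not rejected.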
Term
|
Definition
| total minus 2. probability that the null hypothesis is correct. Subset step for obtaining value of p. |
|
|
Term
| When the null hypothesis is not rejected (p > .05) |
|
Definition
| statistically insignificant. |
|
|
Term
| When the probability that the null hypothesis is correct is equal to or less than .05 |
|
Definition
| reject the null hypothesis. Statistically significant. Less than 5% chance due to sampling error. |
|
|
Term
|
Definition
| each participant is classified in terms of two variables in order to examine the relationship between them. |
|
|
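Assuming the two-variable classification above is analyzed with a chi-square test on a contingency table, a minimal sketch looks like this. Expected counts come from (row total x column total) / N; the table values and the function name are hypothetical.

```python
def chi_square_two_way(table):
    """Chi-square statistic and df for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = involved / uninvolved, columns = passed / failed.
table = [
    [20, 10],
    [10, 20],
]
chi2_two_way, df_two_way = chi_square_two_way(table)
print(round(chi2_two_way, 2), df_two_way)
```

With df = 1, a statistic above 3.841 would lead to rejecting the null hypothesis at the .05 level.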
Term
| Type I error |
|
Definition
| when a null hypothesis is rejected and it is in fact a correct hypothesis. |
|
|
Term
| Type II error |
|
Definition
| when researchers fail to reject the null hypothesis when it is incorrect |
|
|
Term
|
Definition
| variability (shows how much variation); S or SD (population), sd (sample). "On average, the scores varied from the mean by ___." How much the scores vary from the mean. |
|
|
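The population and sample versions of the standard deviation mentioned above differ only in the divisor (N versus N - 1). A short sketch using Python's `statistics` module, with made-up scores:

```python
from statistics import mean, pstdev, stdev

# Hypothetical set of 8 scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

average = mean(scores)
population_sd = pstdev(scores)  # divides the squared deviations by N
sample_sd = stdev(scores)       # divides the squared deviations by N - 1

print(average, population_sd, round(sample_sd, 2))
```

The sample value is slightly larger because dividing by N - 1 corrects for the tendency of a sample to underestimate population variability.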
Term
| Pearson Correlation Coefficient (Pearson r) : |
|
Definition
| relationship between 2 quantitative sets of scores. |
|
|
Term
|
Definition
| (positive relationship): high in both areas |
|
|
Term
| Inverse relationship (negative relationship): |
|
Definition
| high in one variable and low in the other. |
|
|
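The Pearson r and the direct/inverse relationships described in the cards above can be sketched directly from the definition: the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The data and variable names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 5, 4, 5]      # tends to rise with hours: direct relationship
errors_made = [5, 4, 3, 2, 1]     # falls as hours rise: inverse relationship

r_direct = pearson_r(hours_studied, quiz_score)
r_inverse = pearson_r(hours_studied, errors_made)
print(round(r_direct, 2), round(r_inverse, 2))
```

The sign gives the direction of the relationship and the absolute value gives its strength.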
Term
|
Definition
| set of ratings; standard by which the test is being judged |
|
|
Term
| Concurrent Validity Coefficient: |
|
Definition
| obtained by administering the test and collecting the criterion data at about the same time. Happening right now. |
|
|
Term
|
Definition
| relies on subjective judgements and empirical data. Hypothesize a relationship between the test scores and scores on another variable. Examples: score on a depression scale and success in college. |
|
|
Term
|
Definition
| collection of related behaviors that are associated in a meaningful way (ex. depression) |
|
|
Term
|
Definition
how variables are related to one another.
1. Positive/direct relationship: both high or both low (same direction)
2. Negative/inverse relationship: opposite directions
3. Allows us to predict either variable from knowledge of the other. There are always exceptions because the relationship is never perfect
4. Non-experimental: no IV and DV
5. Cannot be used to determine causation
|
|
Term
|
Definition
| interval data (parametric). Difference between group mean scores. The t test: used to test the null hypothesis regarding the observed difference between two means. |
|
|
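A t test for the difference between two group means, as described above, can be sketched as follows. This assumes two independent groups with roughly equal variances (the pooled-variance form); the scores and the function name are hypothetical, and 2.306 is the standard two-tailed critical value for df = 8 at the .05 level.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2                         # total minus 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

# Hypothetical scores for a treatment group and a control group.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t_value, df = independent_t(treatment, control)

CRITICAL_05_DF8 = 2.306   # table value, two-tailed, df = 8, alpha = .05
reject_null = abs(t_value) > CRITICAL_05_DF8
print(round(t_value, 2), df, reject_null)
```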
Term
| One Way Analysis of Variance (ANOVA) |
|
Definition
- F value: similar to the t-test; indicates if the null hypothesis is correct
- Can compare many means (the t test can only compare 2), or can be used if sample sizes are large and unequal
- Indicates whether a set of differences is significant overall
- Separates variance due to:
  * within group (chance)
  * between groups (treatment)
|
|
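The separation of variance into between-group and within-group parts can be sketched as a small F computation: F = mean square between / mean square within. The three groups of scores are hypothetical, and the function is an illustration rather than a full ANOVA (it does not compute p).

```python
from statistics import mean

def one_way_anova_f(groups):
    """F ratio: between-groups mean square divided by within-groups mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n_total = len(groups), len(all_scores)
    # Between groups (treatment): how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups (chance): how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = one_way_anova_f(groups)
print(round(f_value, 2))
```

A large F means the group means differ more than chance variation within groups would suggest; it does not say which groups differ.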
Term
|
Definition
| one treatment, many groups. Participants classified in only one way. One factor being explored and 3 or more groups within this factor. Used rather than running multiple t-tests; determines whether at least one difference is statistically significant (but will not tell you which one) |
|
|
Term
|
Definition
| two-way classification. More than one treatment factor being explored. Explores main effects and interactions among variables. Example: 3 x 2 factorial design. |
|
|
Term
|
Definition
| effect of an independent variable on a dependent variable averaging across the levels of any other independent variables |
|
|
Term
|
Definition
| do they affect one another? Are they dependent? Interaction between variables. When graphed: do the lines cross? |
|
|
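Main effects and interactions, as described in the two cards above, can be read off a table of cell means. The sketch below uses a hypothetical 2 x 2 design with equal cell sizes; the factor names, level names, and means are invented for illustration.

```python
# Hypothetical cell means for a 2 x 2 design: (teaching method, time of day).
cell_means = {
    ("lecture", "morning"): 70,
    ("lecture", "evening"): 80,
    ("online", "morning"): 85,
    ("online", "evening"): 75,
}

def marginal_mean(cells, factor_index, level):
    """Average the cell means for one level, across the levels of the other factor."""
    values = [m for key, m in cells.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Main effect of method: difference between marginal means.
method_effect = marginal_mean(cell_means, 0, "online") - marginal_mean(cell_means, 0, "lecture")

# Interaction: does the effect of time differ depending on the method?
time_effect_lecture = cell_means[("lecture", "evening")] - cell_means[("lecture", "morning")]
time_effect_online = cell_means[("online", "evening")] - cell_means[("online", "morning")]
interaction_contrast = time_effect_lecture - time_effect_online

print(method_effect, time_effect_lecture, time_effect_online, interaction_contrast)
```

Because the effect of time reverses direction between the two methods, the lines would cross when graphed. A contrast of zero would mean parallel lines and no interaction.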
Term
| Statistical significance: |
|
Definition
| whether a difference is reliable in light of random errors. |
|
|
Term
|
Definition
| collection of related behaviors that are associated in a meaningful way |
|
|
Term
|
Definition
- ensures that subgroups within the population are represented proportionally in the sample
- a way to decrease sampling error, because the sample is more representative of the population
- can also be used to make sure subgroups are represented equally
|
|
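Proportional representation of subgroups, as described above, can be sketched as drawing a random sample within each subgroup in proportion to its share of the population. The population, the `stratum_of` helper, and the fixed seed are assumptions for the example.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Randomly sample each subgroup in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Rounding can make the final total differ slightly from sample_size.
        n = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical population: 60 freshmen and 40 seniors.
population = [("freshman", i) for i in range(60)] + [("senior", i) for i in range(40)]
sample = stratified_sample(population, lambda member: member[0], sample_size=10)

freshmen_in_sample = sum(1 for group, _ in sample if group == "freshman")
seniors_in_sample = sum(1 for group, _ in sample if group == "senior")
print(freshmen_in_sample, seniors_in_sample)
```

To represent subgroups equally instead, the same number could be drawn from each subgroup regardless of its size.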
Term
|
Definition
1. Cannot i.d. every member of the population
2. Convenience samples
3. Volunteerism
|
|
Term
| Significance and Sample size |
|
Definition
Significance becomes more likely as sample size increases. As a general rule, the larger the random sample, the smaller the sampling error, or, the more precise the results are.
- Precision = extent to which the same results would be obtained if another random sample were drawn from the same population
- Increasing sample size also produces diminishing returns
|
|
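The diminishing returns noted above can be illustrated with the standard error of the mean, SD / sqrt(n). Using this formula as the measure of sampling error is an assumption for the sketch, and the standard deviation of 10 is made up.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean for a random sample of size n."""
    return sd / sqrt(n)

sd = 10  # hypothetical standard deviation of the scores
errors = {n: standard_error(sd, n) for n in (25, 100, 400, 1600)}
for n, se in errors.items():
    print(n, se)
```

Each fourfold increase in sample size only halves the error, so going from 25 to 100 participants buys more precision than going from 400 to 475.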
Term
|
Definition
| Degree to which measures produce consistent results. to create measures that consistently show difference b/t individuals who really are different, and show same scores for individuals who are the same. Reliability of Scores, Not Tests. |
|
|
Term
|
Definition
theoretical construct referring to a person’s score containing no error - actual amount of whatever being measured (ability, self-esteem, knowledge, etc) |
|
|
Term
|
Definition
| difference between a person’s true score & the score actually obtained |
|
|
Term
| Factors that might cause measurement error: |
|
Definition
1. Test’s items are only a sample of the total possible items that might be used to measure the construct
2. Test administrators
3. Test scorers
4. Testing conditions
5. Variability in how individuals feel
|
|
Term
| Criterion-Referenced Tests |
|
Definition
- test items relate to instructional objective
- criteria for “success” determined ahead of time
- no distribution of scores is done
|
|
Term
|
Definition
- not meant to show attainment of specific learning objectives
- how students/schools/etc. compare with each other
- individual score translated to converted score to determine “relative standing”
- intended to disperse scores across normal curve
|
|
Term
|
Definition
| Study exploring relationship between teachers’ culturally held beliefs and student achievement. |
|
|
Term
|
Definition
| if you choose an extreme group on any one measure, it will tend to be less extreme on another measure, even if the two measures are highly correlated |
|
|
Term
|
Definition
| when occur at random and equally among the groups, not a problem |
|
|
Term
| This would be used to figure out the mean semester GPA differences between involved and uninvolved students |
|
Definition
|
|
Term
| Painters and dancers scored higher than accountants on a researcher’s creativity test. What type of validity does this demonstrate? |
|
Definition
|
|
Term
| A study investigated the impact of a leadership development program in students' first year on their subsequent leadership behaviors in their senior year. What are the IV and DV? |
|
Definition
| IV = program; DV = behaviors |
|
|
Term
| A study is conducted to measure the extent to which alcohol use, drug use, and violence affect grades for high school students. What test should be used? |
|
Definition
|
|
Term
| A researcher wants to study the relationship between ACT scores and GPA |
|
Definition
|
|
Term
| What sampling method would be used if a teacher uses students in their class |
|
Definition
|
|
Term
| Professor X conducts a study over a 2 year period. During this time 20 of the original 75 drop out |
|
Definition
|
|
Term
| A professor gives a test on US history; however, most of the questions are directed at German history. What type of measurement validity is being threatened? |
|
Definition
|
|
Term
| A researcher tested the relationship between emotional intelligence and empathy. The correlation between these 2 constructs was .72. What do these results mean? |
|
Definition
| positive correlation WHAT DOES THIS MEAN |
|
|
Term
| A researcher wanted to know the community's feelings about the library hours. They sat in front of the library and asked volunteers about the survey. What type of sampling is this? |
|
Definition
|
|
Term
| A researcher is comparing the average number of hours a student studies every year of school. The researcher gets a t-value of 4.30 w/ 2 df. The researcher decides to reject the null hypothesis @ the .05 level. Upon further research there was no significance. What type of error was this? |
|
Definition
|
|
Term
| Dr. Tammy administers a test regarding leadership behaviors and collects results using criterion data. Which type of validity is this? |
|
Definition
|
|
Term
| If a researcher makes judgements about the appropriateness of the contents in a measure, you are checking for |
|
Definition
|
|
Term
| Relies on subjective judgements and empirical data |
|
Definition
|
|
Term
| Threats to External Validity |
|
Definition
1. Nonrepresentativeness: is the sample representative of the larger population?
2. Artificiality: findings of a small, brief, or contrived study may not apply to a realistic setting. 3 types:
   - non-typical task
   - non-typical instruction
   - control group
|
|
Term
|
Definition
| describes things as they are or once were - uses descriptive stats |
|
|
Term
|
Definition
(experimental research design)
- compares 2+ groups
- done to determine existence & nature of difference
- uses inferential stats
|
|
Term
|
Definition
Involves interval data, assumed to be distributed normally in population
|
|
Term
|
Definition
- involves data not assumed to be normally distributed in population
- frequently used when data can be placed into categories
|
|
Term
| Compute the test statistic value: |
|
Definition
|
|
Term
|
Definition
| value you would expect the test statistic to yield if the null hypothesis is indeed true |
|
|
Term
|
Definition
- used to infer something about the population based on the sample’s characteristics
- how much confidence one can have when generalizing from a sample to a population
|
|
Term
|
Definition
| Purposive, smaller sample size (not random), should include demographic info for decisions of transferability |
|
|
Term
|
Definition
| interviews (focus groups), observations, text rich data (journals, open-ended questionnaires) |
|
|
Term
| Trustworthiness includes... |
|
Definition
| Truth value/ credibility; Transferability; Consistency/ dependability; Confirmability |
|
|
Term
| Methods to reach trustworthiness |
|
Definition
| 1. Prolonged engagement 2. persistent observation 3. triangulation 4. peer debriefing 5. negative case analysis 6. referential adequacy 7. member checking |
|
|
Term
|
Definition
| Researcher aware of multiple influences & contextual factors that influence phenomenon. Potential Danger: “going native” |
|
|
Term
|
Definition
| I.d. characteristics most relevant to issue being pursued & focus on them. Potential Danger: premature closure – come to focus too soon |
|
|
Term
|
Definition
Way to support findings by showing that independent measures of them agree (corroborate). 3 types:
- Data source (persons, times, places)
- Method (observation, interview, documents)
- Researchers (investigator A, B, C)
|
|
Term
|
Definition
| Process of exposing self to peer to explore aspects of inquiry that may otherwise remain only implicit w/in R’s mind |
|
|
Term
|
Definition
| Process of revising hypotheses. Often linked to persistent observation. Requires R to look for disconfirming data in both past & future data |
|
|
Term
|
Definition
| Recorded material provides “benchmark” against which later data analysis & interpretation can be tested for adequacy. |
|
|
Term
|
Definition
Process of sharing data/findings with participants. Purposes:
- Opportunity for participants to correct errors &/or challenge interpretations
- Opportunity to volunteer additional information
- Puts participants on record as agreeing w/ R’s interpretation
|
|