Term
|
Definition
Tx already validated and proven effective. CON: No Tx is really validated all the time; no complete success |
|
|
Term
|
Definition
Tx supported by effective empirical studies; emphasizes empirical RS. Requires positive outcomes. CON: Evidence in Q is empirical in nature; sometimes any Tx is better than no Tx |
|
|
Term
|
Definition
Txs empirically evaluated CON: Misleading because there have been therapies that were evaluated but found NOT to be supported
|
|
Term
|
Definition
- RS strategies designed to provide or allow a full and thorough description of a specific sample
- Not intended to make b/n grp. comparisons or to provide inferential data
|
|
|
Term
|
Definition
- RS strategies used to provide inferential statistics from a sample to be generalized to a target population which the original sample was supposed to represent
- Variables/RS that can be handled numerically
|
|
|
Term
|
Definition
"Working Definition"
- Defines a concept in such a way that it can be measured
- Every definition has limits
- Captures only a portion of the concept
- May include irrelevant information
- Makes intersubjectivity (objectivity) possible BC it can be replicated
- ALWAYS imperfect, usually being artificial or too narrow (eg. Defn of an overweight person as one with BMI >25)
|
|
|
Term
|
Definition
- RS which examines a Tx or phenomena under conditions which approximate the "real world" or clinical settings
-Eg. Bio-dome experiments - artificial settings to look at a natural bx (there is some control over the environment)
- Said of data or computers that use a system of representation that is physically analogous to the thing being represented (Eg. a thermometer can use the height of a column of mercury to indicate heat - the higher the column, the higher the temperature; Analog watches use the physical mov't of hour and minute hands to represent the passing of time - the more the hands have moved, the more time has passed)
|
|
|
Term
|
Definition
- Random assignment
- Manipulation of the IV
- Use of statistical analyses (ANOVA, MANOVA, ANCOVA)
- Control group
- Have the greatest control over independent variables and sources of error
- Provide the clearest case for drawing causal inferences
- Include b/n subjects and within-subjects (repeated measures) independent variables
1. b/n-subjects (aka ANOVA)
2. within-subjects
3. repeated measures design |
|
|
Term
B/n subjects designs (aka ANOVA) |
|
Definition
- Each person serves in only one group and the groups are compared
- Each score in the study comes from a different subject (compares different subjects)
|
|
|
Term
|
Definition
- Each person serves in every condition (a before-and-after study, or a study of the same subjects given different Txs)
- NO Control Group: Compare Pre/posttest within the same group of subjects
- Time = most common within-subjects variable
|
|
|
Term
|
Definition
- Subjects are measured 2+ times on the DV
- Subjects are given >1 Tx and are measured after each level of Tx
- Each subject serves as its own control
|
|
|
Term
How is a true experiment different from other RS designs? |
|
Definition
1. Subjects randomly assigned to Tx groups
2. Manipulate the IV |
|
|
Term
Quasi-Experimental Designs |
|
Definition
- NO random assignment
- Manipulation of at least one IV
- Statistical analyses (ANOVA, ANCOVA, MANOVA)
- Approximate the control offered by experimental designs
- Used in real life or field situations where RSer may be able to manipulate some IVs but cannot truly randomly assign subjects to control and experimental groups (e.g. study with volunteers)
|
|
|
Term
|
Definition
aka "Static Group/Case Control"
- IV NOT manipulated (Use measures of association to study relations)
- NO random assignment
- Statistical analyses used include Chi-Square, Regression, Correlations
- Observational or descriptive: Does NOT allow for inferences about causal relationships
- Variable of interest is studied by selecting participants who vary in their display of the variable or characteristic of interest
|
|
|
Term
|
Definition
aka Random error, random variance, and residual
- Variance of the error term
- Any uncontrolled or unexplained variability, such as within-group differences in an ANOVA
- All the variations within each condition of your IV (eg. experimental group & control group)
- Assumed to be randomly distributed
- The "noise," or error variance, can increase the probability of a Type II error (not finding significant differences when they exist)
|
|
|
Term
"Noise" can increase the probability of a ______. |
|
Definition
Type II error - not finding significant differences when they exist |
|
|
Term
|
Definition
All the ways something can differ
-eg. Variance within La Jolla in comparison to variance in San Diego |
|
|
Term
|
Definition
aka "secondary variance"
- Systematic differences b/n groups that are not accounted for by the treatment effect
- Threat to Internal Validity
- Measure of the joint variation (covariation) of 2+ variables
- Differences that are NOT randomly distributed across groups
- Variable that is a potential confound in the study and you have found that it does affect your DV
|
|
|
Term
|
Definition
- Group that you are interested in generalizing your results to
- Parameters must be specific
|
|
|
Term
|
Definition
The selected subset of your population
sample mean ≈ population mean (the sample mean is an estimate of the pop mean) |
|
|
Term
|
Definition
The extent to which the intervention or manipulation of the IV can be considered to account for the results, changes or group difference, rather than extraneous influences
|
|
|
Term
|
Definition
- Random assignment
- Manipulation of the IV
- Control group
- Use of statistical analyses (ANOVA, MANOVA, ANCOVA)
- Provide clearest case for drawing causal inferences
- Have greatest control over IVs and sources of error
|
|
|
Term
|
Definition
-True Experimental Design
-Each person serves in only one group and the groups are compared |
|
|
Term
Between-Subjects Designs (ANOVA) |
|
Definition
- True Experimental Design
- Each score in the study comes from a different subject
- Usually contrasted to a within-subjects design, which compares the same subjects at different times or under different treatments
|
|
|
Term
|
Definition
- Each person serves in every condition
- No Control Group: Pre/Posttest from within the same group
- A before-and-after study or a study of the same subjects given different treatments
- Time is the most common within subjects variable
|
|
|
Term
|
Definition
- Subjects are given more than one treatment and measured after each
- Each subject is its own control
- Subjects are measured on the DV
|
|
|
Term
List 5 subtypes of the Experimental Design. |
|
Definition
1. Pre-Post Control
2. Post Only Control
3. Solomon 4 Group
4. Time Series
5. Factorial |
|
|
Term
Experimental Design:
A. Descriptive or Causal?
B. Random Assignment?
C. Static or Manipulated IVs?
D. Control Group?
E. Statistics Used |
|
Definition
EXPERIMENTAL DESIGN:
A. Causal
B. Randomly Assigned
C. Both Static & Manipulated IVs
D. Control Group
E. a. t-test
b. ANOVA
c. ANCOVA
d. MANOVA |
|
|
Term
Quasi-Experimental Design:
A. Descriptive or Causal?
B. Random Assignment?
C. Static or Manipulated IVs?
D. Control Group?
E. Statistics Used |
|
Definition
QUASI-EXPERIMENTAL:
A. Causal
B. Yes & No Random Assignment
C. Both Static and Manipulated IVs
D. Yes & No Control Group
E. Statistics Used:
a. ANOVA
b. ANCOVA
c. MANOVA
d. Regression |
|
|
Term
Name 3 subtypes of Quasi-Experimental Design. |
|
Definition
1. Multiple Treatment
2. Counterbalanced
3. Crossover |
|
|
Term
Correlational Design:
A. Descriptive or Causal?
B. Random Assignment?
C. Static or Manipulated IVs?
D. Control Group?
E. Statistics Used |
|
Definition
CORRELATIONAL DESIGN:
A. Descriptive
B. NO Random Assignment
C. Static IVs
D. NO Control Group
E. Statistics Used:
a. Chi-square
b. Regression
c. Correlation |
|
|
Term
Name 5 subtypes of an OBSERVATIONAL Design. |
|
Definition
1. Case Study
2. Case Control
3. Cross-Sectional
4. Retrospective Cross-Sectional
5. Cohort:
a. Prospective
b. Longitudinal
c. Single Group
d. Multiple Group
e. Accelerated |
|
|
Term
Observational Design:
A. Descriptive or Causal?
B. Random Assignment?
C. Static or Manipulated IVs?
D. Control Group?
E. Statistics Used |
|
Definition
OBSERVATIONAL DESIGN:
A. Descriptive
B. NO Random Assignment
C. Static IVs
D. NO Control Group
E. Qualitative Statistics |
|
|
Term
Name 3 subtypes of the SINGLE SUBJECT Design. |
|
Definition
1. ABAB
2. Multiple Baseline
3. Changing Criterion |
|
|
Term
SINGLE SUBJECT Design:
A. Descriptive or Causal?
B. Random Assignment?
C. Static or Manipulated IVs?
D. Control Group?
E. Statistics Used |
|
Definition
SINGLE SUBJECT Design:
A. Descriptive
B. NO Random Assignment
C. Static IVs
D. NO Control Group
E. Statistics Used:
a. Correlation
b. Qualitative |
|
|
Term
|
Definition
aka "generalizability"
The extent to which the results can be generalized beyond the conditions of the research to other populations, settings, or conditions |
|
|
Term
|
Definition
aka "confounding effect" "contingency effect" "joint effect" "moderating effect"
- The joint effect of 2+ IVs on a DV
- Occurs when IVs not only have separate effects, but also have combined effects that are different from the simple sum of their separate effects
- Occurs when the relation b/n two variables differ, depending on the value of another variable
- Presence of a statistically significant interaction effect makes it difficult to interpret main effects
**May be ordinal or disordinal |
|
|
Term
First-order interaction
Second-order interaction |
|
Definition
1. 2 variables interact
2. 3 variables interact
**May be ordinal or disordinal |
|
|
Term
|
Definition
When Txs or IVs interact with attributes of the people being studied (such as age or sex) |
|
|
Term
|
Definition
- Overall or average effects of the variable, obtained by combining the entire set of component experiments involving that factor
- Simple effect of an IV on a DV
- The effect of an IV uninfluenced by (without controlling for the effects of) other variables
- Difficult to interpret Main Effects in the presence of interaction effects
- Used in contrast with the interaction effect of 2+ IVs on a DV
|
|
|
Term
|
Definition
- Attempts to explain, predict, and explore specific relations
- A tentative answer to a RS Q
- Represent "if-then" statements about a particular phenomenon
- "If" is the IV, which is manipulated in some way
- "Then" is the DV, or the resulting data |
|
|
Term
|
Definition
aka "Research hypothesis"
Hypoth that does NOT conform to the one being tested, usually the opposite of the Null hypoth
Rejecting the Null Hypoth shows that the Altern Hypoth may be true |
|
|
Term
|
Definition
- Does NOT necessarily refer to 0 or no difference --> Refers to the hypothesis to be nullified or rejected
- Null Hypoth = Core idea in hypothesis testing
- RSer usually hopes to reject Null
- Finding evidence to reject the Null Hypoth increases confidence in the probability that the Altern Hypoth is True
|
|
|
Term
|
Definition
- Problem to be investigated in a study stated in the form of a Q
- Essential for focusing the investigation at all stages, from the gathering through the analysis of evidence
- Usually more exploratory than a RS hypothesis or a Null Hypothesis
|
|
|
Term
|
Definition
- Used to assess the statistical significance of findings
- Involves comparing empirically observed sample findings with theoretically expected findings - expected if the Null hypothesis is true
-This comparison allows one to compute the probability that the observed outcomes could have been due to chance or random error |
|
|
Term
|
Definition
IF: IV which is manipulated in some way
THEN: The DV or resultant data |
|
|
Term
Scientific/empirical RS is known as: |
|
Definition
Iterative processes (RSers must continuously collect and re-run data)
Need to gain support for or reject a model through multiple studies BC models cannot be proven |
|
|
Term
Explain the Hypothesis Chart. |
|
Definition
                Null is True     Null is False
Retain Null     Correct          Type II Error
Reject Null     Type I Error     Correct |
|
|
Term
Are the variables related when the Alternative Hypothesis is true? |
|
Definition
Null: Not related
Alternative: Related |
|
|
Term
|
Definition
aka "Beta Error" or "False Negative"
Error made by wrongly retaining (or accepting or failing to reject) a False Null hypothesis (Saying that they aren't related when they are) |
|
|
Term
|
Definition
aka "Alpha Error" or "False Positive"
Error made by wrongly rejecting a True Null hypothesis (saying the variables are related when they are not)
|
|
|
Term
|
Definition
- Anything that can be measured or assigned a number, such as unemployment rate, religious affiliation, experimental Tx, etc.
- Opposite of a variable is a constant
|
|
|
Term
|
Definition
aka "predictor variable" or "explanatory variable"
- Manipulated by the experimenter who predicts that the manipulation will have an effect on another variable (the DV)
- Can be used to predict or explain the values of another variable
|
|
|
Term
What are 3 examples of IVs? |
|
Definition
1. Environmental/Situational: Varying what is done to, with, or by the subject (e.g. a task is provided to some but not to others - Mindfulness Meditation)
2. Instructional: Variations in what participants are told or led to believe through verbal or written statements, usually aimed at altering the perception of the situation (e.g. Different interpretations of tests, based on knowledge about subjects)
3. Organismic: "Static" Cannot be manipulated so subjects canNOT be randomly assigned to these conditions (e.g. Gender, age, year in school) |
|
|
Term
|
Definition
aka "outcome" "criterion" and "response" variable
- Variable whose values are predicted by the IV, whether or not they are caused by it
- Presumed effect in a study (so called BC it 'depends' on another variable); the IV = Presumed Cause
|
|
|
Term
|
Definition
aka "discrete" or "nominal"
Variable that distinguishes among subjects by sorting them into a limited number of categories, indicating type or kind (e.g. religion: Christian, Buddhist, Catholic; a continuous variable, such as age, may also be broken into categories) |
|
|
Term
|
Definition
- Variable that can be expressed by a large (often infinite) number of measures
- A variable that can be measured on an interval or ratio scale
- ALL continuous variables are interval or ratio, but NOT all interval or ratio scales are continuous
- Usually used when there are FEW ranks in the data (Ordinal used when there are MANY)
|
|
|
Term
|
Definition
aka "stochastic variable"
- A variable that varies in ways the researcher does NOT control
- Variable whose values are randomly determined
- "Random" refers to the way the events, values, or subjects are chosen or occur, NOT to the variable itself
- e.g. Men and Women are NOT random, but sex COULD be random
|
|
|
Term
IV is ____. DV is ____. Control Variables/Covariates are _____. |
|
Definition
-Manipulated
-Observed
-Held constant |
|
|
Term
List 7 Types of Control (Controlling Sources of Error) |
|
Definition
1. Statistical Methods of control (ANCOVA)
2. Holding Constant
3. Matching
4. Blocking (Block Design; Randomized-Blocks Design)
5. Counterbalancing
6. Double Blind
7. Control group
a. No Tx Control
b. Wait-List Control
c. Attention Placebo Control
d. Yoked-Control
e. Patched-Up Control |
|
|
Term
Describe Statistical Methods of Control: ANCOVA |
|
Definition
- Allows you to remove covariates from the list of possible explanations of variance in the DV by using stats (e.g. regression) to partial out the effects of covariates, rather than direct experimental methods to control extraneous variables (e.g. pretest scores are used as covariates in pre/posttest experimental designs)
|
|
|
Term
Describe Type of Control: Holding Constant |
|
Definition
Assign the variable an average value to subtract the effects of a variable from a complex relationship so as to study what the relationship would be if the variable were in fact a constant
-e.g. Education of managers in a study |
|
|
Term
Describe Type of Control: Matching |
|
Definition
aka "subject matching"
- RS Design in which subjects are matched on characteristics that might affect their rxn to a Tx
- After pairs are determined, one member of each pair is assigned at random to the group receiving Tx (experimental group); The other group (Control Group) does not receive Tx
- W/out Random Assignment, matching is not considered good RS practice
|
|
|
Term
Describe Blocking WRT Control |
|
Definition
- Block Design: Subjects are grouped into categories or "blocks," which are treated as the unit of analysis
- Goal: Control for a covariate
- Randomized-Blocks Design: Subjects are matched on a variable the RSer wishes to control
- Blks are used to ensure each group has subjects with a similar status
- Subjects are put into groups (blocks) the same size as the # of Txs
- Members of each block are assigned randomly to different Tx groups |
|
|
Term
Describe Counterbalancing Technique WRT Controlling Sources of Error |
|
Definition
In a within-subjects factorial experiment, presenting conditions (Txs) in all possible orders to avoid order effects (e.g. rotating conditions so that subjects experience them in all possible orders - effect of lighting on reading) |
|
|
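Complete counterbalancing amounts to enumerating every possible order of the conditions. A minimal Python sketch (the condition labels are hypothetical stand-ins for the lighting example above):

```python
from itertools import permutations

# Hypothetical Tx conditions for a lighting-and-reading study.
conditions = ["dim", "normal", "bright"]

# Complete counterbalancing: present the conditions in ALL possible orders,
# so no single order (and its carryover effects) dominates the results.
orders = list(permutations(conditions))
for order in orders:
    print(" -> ".join(order))

print(len(orders))  # 3! = 6 possible orders
```

With k conditions there are k! orders, which is why complete counterbalancing is only practical for a small number of Txs.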
Term
Describe Double Blind Technique WRT Controlling Sources of Error |
|
Definition
A means of reducing bias in an experiment by insuring that both those who administer a Tx and those who receive it do not know which subjects are in the experimental and control groups |
|
|
Term
Describe Control Groups WRT Controlling for Sources of Error |
|
Definition
- A group that does NOT receive the Tx so it can be compared to the Experimental Group
- Used to address threats to Internal Validity (e.g. Hx, maturation, selection, testing, etc.)
- Ethical Considerations: problems associated with withholding Tx
- No-Tx Control Group
- Wait-List Control
- Attention-Placebo Control
- Yoked-Control
- Patched-up Control
|
|
|
Term
Describe Wait-List Control Groups WRT Controlling for Sources of Error |
|
Definition
- Like a no-Tx control, but Tx is only withheld temporarily
- Period of time that Tx is withheld usually corresponds to the pre to post test ass't interval --> As soon as the second ass't battery is administered, the waitlist subjects receive their Tx
- BC subjects in wait-list controls receive Tx after the post-test period, long-term followup is not possible since the control is no longer available for comparisons
Group     Test  Tx  Test  Tx
Exper      X    X    X    X
Control    X         X    X (receives Tx later)
|
|
|
Term
Describe Yoked-Control Groups WRT Controlling for Sources of Error |
|
Definition
- Used when the procedure changes for each subject based on their performance
- Pairs are formed arbitrarily (unless matching was used to assign to grps) so that the subject in the experimental grp and the yoked-control subject receive the same # of sessions/trials/etc.
- Purpose is to ensure that groups are equal with respect to potentially important but conceptually and procedurally irrelevant factors
Exper Grp   # Sesh's   Control Grp   # Sesh's
Subject1        5       Subject2         5
Subject3        7       Subject4         7
Subject5        3       Subject6         3 |
|
|
Term
Describe Patched-Up Control Groups WRT Controlling for Sources of Error |
|
Definition
Groups that are added to an experiment that utilize subjects who were not part of the original subject pool and NOT randomly assigned to Tx |
|
|
Term
What is Sampling Distribution (of a Statistic)? |
|
Definition
A theoretical frequency distribution of the scores for, or values of, a statistic, such as a mean
- EVERY stat that can be computed for a sample has a sampling distribution: the distribution of statistics that WOULD BE produced by repeated random sampling (with replacement) from the same population
- Composed of all possible values of a stat and their probabilities of occurring for a sample of a particular size
- Inferential stats depend on sampling distributions
- Used to calc the probability that a sample stat could have occurred by chance, and thus to determine whether something that is true of a sample statistic is also likely to be true of a pop parameter
- A sampling distrib is a distribution used as a model of what would happen if:
a. The Null hypoth were true (there really were no effects) AND
b. The experiment was repeated an infinite # of times
- Created using Monte Carlo experiments: a lrg # of equal-sized random samples are drawn from a pop you wish to represent; the stat is computed for each sample and arranged in a freq distrib so you can see the curve for that pop; repeating multiple times --> pop sampling distribution; any generating of random values in order to study stat models
- Construction: Assume an infinite # of samples of a given size have been drawn from a pop; the stat (e.g. mean) is computed for the scores of each hypothetical sample; the stats are then arranged in a distribution to arrive at the sampling distribution; the sampling distrib is compared with the actual sample stat to determine whether that stat is or is not likely to be the size it is due to chance |
|
|
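The Monte Carlo construction described above can be sketched in a few lines. This is a toy illustration with a made-up population (the mean/SD values are assumptions, not from the card):

```python
import random
import statistics

random.seed(0)

# Hypothetical population: 10,000 scores from a process with mean ~100, SD ~15.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Monte Carlo approximation of the sampling distribution of the mean:
# draw many equal-sized random samples (with replacement) and record
# the statistic (here, the mean) of each.
sample_means = [statistics.mean(random.choices(population, k=25))
                for _ in range(2_000)]

# The distribution of means centers on the population mean, and its spread
# is near SD / sqrt(n) = 15 / 5 = 3 (the standard error).
print(round(statistics.mean(sample_means), 1))
print(round(statistics.stdev(sample_means), 1))
```

Comparing an actual sample statistic against this simulated distribution is exactly the "is this value likely due to chance?" comparison the card describes.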
Term
If statisticians who develop a new theory want to test it on data, what should they do? |
|
Definition
Use Monte Carlo Experiments: test the theory against a lrg # of equal-sized random samples drawn from the pop (most frequently by a computer) |
|
|
Term
|
Definition
Error that ALWAYS occurs when a sample is drawn from a pop, BC only part of the pop is measured, leaving out those who are not measured
Error decreases as sample size increases
If the pop from which the sample is drawn is lrg, pop values will NOT be affected much by 1 or 2 sampled cases that are extreme
|
|
Term
Sampling error of a ___ value estimated from a sample size is equal to the ____. Therefore, it is ___. |
|
Definition
- Estimated standard deviation of the variable divided by the square root of the sample size
- Sampling Error = SD / sqrt(n)
- Not dependent upon the population size, but only on the variability of the variable of concern and sample size
|
|
|
Term
How do you decrease sampling error? |
|
Definition
Increase the sample size (Sampling Error = SD / sqrt(n), so error decreases as the sample size increases)
|
|
Term
If one drew a sample of four observations from a large population, the sampling error would be equal to _____. How would you halve the sampling error? |
|
Definition
The SD divided by 2 (the square root of four)
Increase the sample size to 16; halve it again by increasing SS to 64 |
|
|
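The SD/sqrt(n) arithmetic from the two cards above, as a quick check (the SD value of 10 is an arbitrary example):

```python
import math

def sampling_error(sd, n):
    # Sampling Error = SD / sqrt(n): depends only on the variable's
    # variability and the sample size, not on the population size.
    return sd / math.sqrt(n)

sd = 10.0
print(sampling_error(sd, 4))   # 10 / 2 = 5.0
print(sampling_error(sd, 16))  # 10 / 4 = 2.5 -- quadrupling n halves the error
print(sampling_error(sd, 64))  # 10 / 8 = 1.25 -- halved again
```

Because the error shrinks with the square root of n, each successive halving of sampling error costs four times the sample size.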
Term
How do you calculate Sampling Error? |
|
Definition
Sampling Error = SD / sqrt(n) (the estimated standard deviation of the variable divided by the square root of the sample size)
|
|
Term
What is the difference b/n Random Selection & Random Sampling? What are they? |
|
Definition
aka "simple random sampling/equal probability sampling" when no qualifiers are used --> Reduces bias
|
|
|
Term
Define Sampling. List 9 types of sampling. |
|
Definition
- Selecting elements (subjects or other unit of analysis) from a pop in such a way that they are representative of the population
- Purpose: to increase the likelihood of being able to generalize accurately about the population
- Sampling is often a more accurate and efficient way to learn about a lrg pop than a census of a whole population
- Convenience Sample
- Snowball Sample
- Stratified Random Sample
- Proportional Stratified Random Sample
- Cluster Sample
- Probability Sample
- Quota Sample
- Accidental Sample
- Purposive Sample
|
|
|
Term
|
Definition
Sample of subjects selected for a study not BC they are representative, but BC it is convenient to use them |
|
|
Term
|
Definition
aka "networking sample" "word of mouth"
One subject gives the RSer the name of another subject, who in turn provides the name of a third, and so on -Useful when ppl w/similar experiences are needed |
|
|
Term
|
Definition
The population as a whole is separated into distinct parts ("strata") and a random or probability sample is drawn from particular categories (or "strata") of the population being studied
- Ensures that the full population is properly represented
- Fxns similar to blocks & randomized-blks designs
- Can be proportionate, so that the size of the strata corresponds to the size of the grps in the pop (can also be disproportionate)
- Used in political polling
|
|
|
Term
Proportional Stratified Random Sampling |
|
Definition
A stratified random sample in which the proportion of the subjects in each category (stratum) is the same as in the population |
|
|
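A minimal sketch of proportional stratified random sampling, using an invented two-stratum population (the "urban"/"rural" labels and counts are assumptions for illustration):

```python
import random

random.seed(1)

# Hypothetical population: 600 urban and 400 rural respondents.
population = [("urban", i) for i in range(600)] + [("rural", i) for i in range(400)]

def proportional_stratified_sample(pop, n):
    # Split the pop into strata, then draw a random sample from each
    # stratum sized in proportion to that stratum's share of the pop.
    strata = {}
    for label, unit in pop:
        strata.setdefault(label, []).append((label, unit))
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))
        sample.extend(random.sample(members, k))
    return sample

s = proportional_stratified_sample(population, 100)
print(sum(1 for label, _ in s if label == "urban"))  # 60 -- same share as the pop
print(sum(1 for label, _ in s if label == "rural"))  # 40
```

A disproportionate stratified sample would simply use some other `k` per stratum (e.g. oversampling a small stratum).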
Term
|
Definition
Selecting units (clusters) of individuals rather than individuals themselves, then randomly selecting subjects from those units
A method for drawing a sample from a pop:
1. In 2+ stages
2. When it is not possible to identify or obtain access to entire population
3. When random sample would produce a list of widely scattered subjects and NOT be cost-efficient
•Ex: randomly selecting psyc hospitals and then randomly selecting subjects from them
- Want clusters to be as diverse as possible (CONTRASTs stratified sampling - subjs as similar as possible)
- DISADVAN: Each stage of the process increases sampling error --> Margin of error larger in cluster sampling than in simple or stratified random sampling (but may be compensated by increasing sample size BC method is cost-effective)
|
|
|
Term
|
Definition
Each case that could be chosen has a known probability of being included in the sample
ALWAYS uses Random Selection
Often a random sample, which is an equal probability sample |
|
|
Term
|
Definition
A stratified NONrandom sample (a sample selected by dividing a pop into categories & selecting a certain # [a quota] of respondents from each category)
- Indiv. cases within each category are not selected randomly - usually chosen on basis of convenience
- NOT a reliable method for making inferences about a population
|
|
|
Term
|
Definition
Sample gathered haphazardly (e.g. by interviewing the first 100 ppl you run into who are willing to talk to you)
- DISADVAN: The RSer has no way of knowing what the pop might be
|
|
|
Term
|
Definition
Sample composed of subjects selected deliberately (on purpose) by RSers, usually BC they think certain characteristics are typical or representative of the population
ADVAN:
- Increases representativeness in RS
DISADVAN:
- Assumes the RSer knows in advance what the relevant characteristics are
- May introduce unknown bias
- NOT random
|
|
|
Term
|
Definition
aka "random allocation"
Each individual is assigned to a grp ENTIRELY by chance, with equal probability of being placed in each grp
Goals:
- Distribute characteristics of a sample among grps (eg age, sex, etc) that, if left uncontrolled, might interfere with interpretation of the grp differences
- Group Equivalence (doesn't necessarily happen): When sample sizes are small, grp equiv is less likely & the power of the test is attenuated (lowered)
|
|
|
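Random assignment is just shuffle-then-split. A toy sketch with a hypothetical pool of 20 volunteers:

```python
import random

random.seed(42)

# Hypothetical pool of 20 subject IDs.
subjects = list(range(1, 21))

# Random assignment (random allocation): shuffle the pool, then split it,
# so every subject has an equal chance of landing in either grp.
random.shuffle(subjects)
experimental = subjects[:10]
control = subjects[10:]

print(sorted(experimental))
print(sorted(control))
```

Note that this equalizes the *probability* of assignment; as the card says, it does not guarantee group equivalence, especially with small samples.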
Term
|
Definition
- Used when a subject variable is known to be related to scores on the DV
- Grp subjects together based on similarity on the variable in question; then randomly assign one person from each pair to the experimental grp
|
|
|
Term
|
Definition
Statistical proposition that the larger a sample size, the more closely a sampling distribution of a statistic will approach a normal distribution (true even if the pop from which the sample is drawn is not normally distributed)
- Explains why sampling error is smaller within a larger sample than with a small sample & why a normal distribution can be used to study a wide variety of statistical problems
|
|
|
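The Central Limit Theorem can be seen directly by simulation: start with a clearly non-normal population and watch the sampling distribution of the mean become more symmetric as n grows. The exponential population here is an assumed example:

```python
import random
import statistics

random.seed(3)

# A clearly non-normal (right-skewed) population: exponential values, mean ~1.
population = [random.expovariate(1.0) for _ in range(10_000)]

def sample_mean(n):
    return statistics.mean(random.choices(population, k=n))

small = [sample_mean(2) for _ in range(2_000)]   # means of tiny samples
large = [sample_mean(50) for _ in range(2_000)]  # means of larger samples

def skew(xs):
    # Standardized third moment: near 0 for a symmetric (e.g. normal) shape.
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

# Larger n --> the distribution of means is far more symmetric (closer to
# normal), even though the population itself is skewed.
print(round(skew(small), 2), round(skew(large), 2))
```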
Term
Criteria & criterion measures |
|
Definition
Another term for DV
- Criterion is used in correlational RS, when it is not possible to establish a causal relationship b/n a DV and IV (it is like the outcome of the study)
|
|
|
Term
|
Definition
- Measures what it is supposed to measure
- Requires RELIABILITY, but reverse not true
- Extent to which a measure is free of systematic error
- Refers to designs that help RSers gather data appropriate for answering their Qs
|
|
|
Term
What is the goal of random assignment? |
|
Definition
1. To distribute characteristics of a sample among groups (i.e. age, sex, etc) that, if left uncontrolled, might interfere with interpretation of the group differences
2. Group equivalence (Although doesn't necessarily happen - Small sample size --> Lower Power [Attenuation]) |
|
|
Term
Statistical Conclusion Validity |
|
Definition
**Validity of inferences about:
1. Whether the presumed cause and effect covary
a. Can incorrectly conclude that cause and effect covary when they do not (TI Error) or incorrectly conclude that they do not covary when they do (TII error)
b. Can overestimate or underestimate the magnitude of covariation, as well as the degree of confidence that magnitude estimate warrants
2. How strongly they covary |
|
|
Term
|
Definition
"TYPE I Error"
The probability of rejecting a hypothesis (the null) when that hypothesis is true |
|
|
Term
|
Definition
"TYPE II Error"
The probability of accepting the null hypothesis when it is false |
|
|
Term
|
Definition
(1 - Beta)
The probability of detecting real differences between conditions |
|
|
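Power as "the probability of detecting real differences" can be estimated by simulation: build in a true effect, run the experiment many times, and count how often the test detects it. The effect size (0.5 SD), group size (30), and the crude z-style test are all assumptions for illustration, not a prescribed analysis:

```python
import math
import random
import statistics

random.seed(7)

def significant(xs, ys):
    # Crude large-sample z-style test at alpha = .05 (illustrative only;
    # a real analysis would use a proper t-test).
    se = math.sqrt(statistics.variance(xs) / len(xs)
                   + statistics.variance(ys) / len(ys))
    z = (statistics.mean(xs) - statistics.mean(ys)) / se
    return abs(z) > 1.96

trials = 1_000
hits = 0
for _ in range(trials):
    # A real effect exists: Tx mean is 0.5 SD above the control mean.
    tx = [random.gauss(0.5, 1) for _ in range(30)]
    ctrl = [random.gauss(0.0, 1) for _ in range(30)]
    if significant(tx, ctrl):
        hits += 1

power = hits / trials  # proportion of runs where the real difference was detected
beta = 1 - power       # Type II error rate: real differences that were missed
print(power, beta)
```

The run makes the (1 - Beta) relationship concrete: every simulated experiment that misses the built-in effect is a Type II error.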
Term
|
Definition
ES = (m1 - m2) / s
- A way of expressing the differences b/n groups, treatments, or conditions
- The magnitude of the difference b/n 2+ conditions expressed in standard deviation units
- Difference b/n the means divided by the pooled standard deviation
- Useful in meta-analysis
|
|
|
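The ES = (m1 - m2)/s formula above, computed on made-up scores (the two groups and their values are hypothetical; the pooled-SD form shown is Cohen's d):

```python
import math
import statistics

def effect_size(group1, group2):
    # Cohen's d: difference between the group means divided by the
    # pooled standard deviation (so the result is in SD units).
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(pooled_var)

# Hypothetical outcome scores for a Tx group and a control group.
tx = [12, 14, 15, 13, 16]
control = [10, 11, 9, 12, 10]
print(round(effect_size(tx, control), 2))  # 2.61 -- means differ by ~2.6 SDs
```

Because the result is in SD units rather than raw-score units, effect sizes from different studies can be compared, which is what makes them useful in meta-analysis.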
Term
Threats to Statistical Conclusion Validity |
|
Definition
**Reasons why inferences about covariation b/n 2 variables may be incorrect
1) Low Statistical Power
2) Violated Assumptions of Statistical Tests
3) Fishing and the Error Rate Problem
4) Unreliability of Measures
5) Restriction of Range
6) Unreliability of Tx Implementation
7) Extraneous Variance in the Experimental Setting
8) Heterogeneity of Units
9) Inaccurate Effect Size Estimation |
|
|
Term
|
Definition
**Threat to Statistical Conclusion Validity
An insufficiently powered experiment may incorrectly conclude that the relationship b/n tx and outcome is not significant |
|
|
Term
Violated Assumptions of Statistical Tests |
|
Definition
**Threat to Statistical Conclusion Validity
Violations of statistical test assumptions can lead to either overestimating or underestimating the size and significance of an effect |
|
|
Term
Fishing and the Error Rate Problem |
|
Definition
**Threat to Statistical Conclusion Validity
Repeated tests for significant relationships, if uncorrected for the number of tests, can artificially inflate the statistical significance |
|
|
Term
Unreliability of Measures |
|
Definition
**Threat to Statistical Conclusion Validity
Measurement error weakens the relationship b/n two variables and strengthens or weakens the relationships among three or more variables |
|
|
Term
Unreliability of Tx Implementation |
|
Definition
**Threat to Statistical Conclusion Validity
If a Tx that is intended to be implemented in a standardized manner is implemented only partially for some respondents, effects may be underestimated compared with full implementation |
|
|
Term
Extraneous Variance in Experimental Setting |
|
Definition
**Threat to Statistical Conclusion Validity
Some features of an experimental setting may inflate error, making detection of an effect more difficult |
|
|
Term
|
Definition
**Threat to Statistical Conclusion Validity
Increased variability on the outcome variable within conditions increases error variance, making detection of a relationship more difficult |
|
|
Term
Inaccurate Effect Size Estimation |
|
Definition
**Threat to Statistical Conclusion Validity
Some statistics systematically overestimate or underestimate the size of an effect |
|
|
Term
|
Definition
- The extent to which the results of a study can be attributed to the txs, rather than to the flaws in the RS design
- Degree to which one can draw valid conclusions about the causal effects of one variable or another
- Depends on the extent to which extraneous variables have been controlled by the RSer
|
|
|
Term
Threats to Internal Validity |
|
Definition
1) Ambiguous Temporal Precedence
2) Selection
3) History
4) Maturation
5) Regression
6) Attrition
7) Testing
8) Instrumentation
9) Additive and Interactive Effects of Threats to Internal Validity |
|
|
Term
Ambiguous Temporal Precedence |
|
Definition
**Threat to Internal Validity
Lack of clarity about which variable occurred first may yield confusion about which variable is the cause and which is the effect |
|
|
Term
|
Definition
**Threat to Internal Validity
Systematic differences over conditions in respondent characteristics that could also cause the observed effect |
|
|
Term
|
Definition
**Threat to Internal Validity
Events occurring concurrently with tx could cause the observed effect |
|
|
Term
|
Definition
**Threat to Internal Validity
Naturally occurring changes over time could be confused with a Tx effect (eg growing older and growing wiser) |
|
|
Term
|
Definition
**Threat to Internal Validity
When units are selected for their extreme scores, they will often have less extreme scores on other variables, an occurrence that can be confused with a Tx effect
-eg. ppl who come to psychotherapy when they are extremely distressed are likely to be less distressed on subsequent occasions, even if the psychotherapy had no effect ---> Phenomenon called "Regression to the mean" |
|
|
Term
|
Definition
**Threat to Internal Validity
-aka "mortality"
Loss of respondents to Tx or to meas't can produce artifactual effects if that loss is systematically correlated with conditions |
|
|
Term
|
Definition
**Threat to Internal Validity
Exposure to a test can affect scores on subsequent exposures to that test, an occurrence that can be confused with a Tx effect
-Practice, familiarity, or other forms of reactivity are relevant mechanisms and could be mistaken for Tx effects |
|
|
Term
|
Definition
**Threat to Internal Validity
The nature of a measure may change over time or conditions in a way that could be confused with a Tx effect
-eg. a change in a measuring instrument can occur over time even in the absence of Tx, mimicking a Tx effect (the spring on a bar press might become weaker and easier to push over time artifactually increasing reaction times) |
|
|
Term
Additive and Interactive Effects of Threats to Internal Validity |
|
Definition
- The impact of a threat can be added to that of another threat or may depend on the level of another threat
- Validity threats do NOT operate singly -->several can operate simultaneously: Net bias depends on the direction and magnitude of each individual bias plus whether they combine additively or multiplicatively (interactively)
-eg. Selection-maturation additive effect may result when nonequivalent experimental groups formed at the start of the Tx are also maturing at different rates over time
-A selection-hx additive effect may result if nonequivalent groups also come from different settings and each group experiences a unique local hx |
|
|
Term
|
Definition
Extent to which the RS Design or strategy allows for a clear interpretation of the basis of the relationship among variables of interest
|
|
|
Term
Confounds of Construct Validity |
|
Definition
1. Reasons why inferences about the constructs that characterize study operations may be incorrect
2. Sources of secondary variance
3. Features within an experiment that interfere with interpretation and create alternative explanations for results which are different from the theoretical assumptions about the action of the IV or the presumed agents of change |
|
|
Term
Threats to Construct Validity |
|
Definition
1. Inadequate Explication of Constructs
2. Construct Confounding
3. Attention and/or Simple Contact with Participants (Hawthorne Effect)
4. Single Operations and Narrow Stimulus Sampling
5. Experimenter Expectancy Effects
6. Demand Characteristics or Cues in the Experimental setting |
|
|
Term
Inadequate Explication of Constructs |
|
Definition
**Threat to Construct Validity
Failure to adequately explicate a construct may lead to incorrect inferences about the relationship b/n operation and construct |
|
|
Term
|
Definition
**Threat to Construct Validity
Operations usually involve more than one construct; Failure to describe all the constructs may result in incomplete construct inferences |
|
|
Term
Attention and/or Simple Contact with Participants |
|
Definition
**Threat to Construct Validity
aka "Hawthorne Effect"
The intervention may impact participants simply because of the attention or human contact it provides, rather than the specific content or characteristics of the intervention
- Controlled for by the use of an Attention Control grp --> Giving the group attention but not the intervention
|
|
|
Term
Single Operations and Narrow Stimulus Sampling |
|
Definition
**Threat to Construct Validity
The way in which the intervention, tx, or program is operationalized or delivered may limit the RSer's ability to examine why the intervention affected the participants
-eg. Holding the therapist constant - but the therapist might be more adept or confident with one intervention than the other --> Therefore the results are due to the therapist, not the Tx |
|
|
Term
Experimenter Expectancy Effects |
|
Definition
**Threat to Construct Validity
The extent to which it is possible that the RS'ers beliefs, ideas, hopes, opinions, and hypotheses inadvertently affected participants responses
|
|
|
Term
Demand Characteristics or Cues in the Experimental Setting |
|
Definition
**Threat to Construct Validity
The extent to which extraneous cues in the intervention or experimental procedure may account for the results, rather than the intervention itself |
|
|
Term
|
Definition
aka "generalizability"
The extent to which the findings of a study are relevant to subjects and settings beyond those in the study |
|
|
Term
Threats to External Validity |
|
Definition
Reasons why inferences about how study results would hold over variations in persons, settings, txs, and outcomes may be incorrect
1. Inadequate explication of constructs
2. Construct confounding
3. Sample characteristics
4. Stimulus characteristics and settings
5. Reactivity of experimental arrangement
6. Multiple txs
7. Novelty Effects
8. Experimenter Expectancy
9. Reactivity of Ass't
10. Test Sensitization
11. Timing of Measurement |
|
|
Term
Inadequate explication of constructs |
|
Definition
**Threat to External Validity
Failure to adequately explicate a construct may lead to incorrect inferences about the relationship b/n the operation and construct |
|
|
Term
|
Definition
**Threat to External Validity
Operations usually involve more than one construct, and failure to describe all the constructs may result in incomplete construct inferences |
|
|
Term
|
Definition
**Threat to External Validity
The degree to which the results of the research may be generalized to others who vary from the particular characteristics of the selected sample
-eg. Demographics, age, gender, religion, ethnicity, nationality, ability, or disability, SES, or geography |
|
|
Term
Stimulus Characteristics & Settings |
|
Definition
**Threat to External Validity
The degree to which the conditions in the natural RS setting may impact the results in ways which may not generalize to situations or persons who are not involved in an experiment |
|
|
Term
Reactivity of Experimental Arrangement |
|
Definition
**Threat to External Validity
Participant awareness that they are participating in an experiment may impact the results in ways which may not generalize to situations or persons who are not involved in an experiment |
|
|
Term
|
Definition
**Threat to External Validity
When participants receive more than one experimental condition or tx, the results may not generalize to situations where only a single tx is given |
|
|
Term
|
Definition
**Threat to External Validity
The possibility that effects of an intervention are in part due to the novelty of the administration circumstances |
|
|
Term
|
Definition
**Threat to External Validity
The unintentional effect that the experimenter exerts on the study in the direction of the hypothesis |
|
|
Term
|
Definition
**Threat to External Validity
Participants may respond to tests or ass't measures differently when they are aware that they are being assessed, than if they are not aware --> Participants may not be aware of such meas't or ass't in other non-experimental situations --> the results would perhaps not be generalizable |
|
|
Term
|
Definition
**Threat to External Validity
- The effect produced by pretesting participants
- May make them more or less attentive or receptive to the intervention and limit generalizability
|
|
|
Term
|
Definition
**Threat to External Validity
The degree to which the results of the intervention or RS project vary as a result of the point in time which the post-intervention is given |
|
|
Term
|
Definition
aka "pre-experimental design"
-
Descriptive (Qualitative)
-
NO random assignment
-
Poor External Validity/Generalizability
-
Investigators observe subjects, but do NOT interact with them
*SUBTYPES:
1. Case Study
2. Case Control
3. Cross-Sectional
4. Retrospective Cross Sectional
5. Cohort Design
- Used to collect preliminary information that may lead to specific IVs, DVs, and hypotheses about relationships |
|
|
Term
|
Definition
**Observational Design
An in-depth study of a single individual, organization, family, or other social unit
ADVAN: Allows more intensive analysis of specific empirical details
DISADVAN:
1. Hard to use results to generalize to other cases
2. Purely descriptive |
|
|
Term
|
Definition
**Observational Design
Method of sampling cases with and without an outcome of interest and studying their backgrounds
-eg. In a study of lung cancer, the cases are individuals who have the disease; the controls are similar people who do not have it; The background of those with and without the disease are compared to understand the origins of the disease |
|
|
Term
|
Definition
**Observational Design
A study conducted at a single point in time by taking a "slice" (cross section) of a population at a particular time
INDIRECT evidence about the effects of time ONLY
Caution when drawing conclusions about change (eg. Just BC an older age group is more prejudiced than a younger age group does NOT mean that the younger grp will become more prejudiced as they grow older) |
|
|
Term
Retrospective Cross-Sectional |
|
Definition
**Observational Design
Draw inferences about an antecedent event that results in/is associated with an outcome
Observe the past to see what determined present-day outcomes
Attempts to ID the timeline b/n possible cause or antecedents (risk factors) and subsequent outcome of interest
Subjects ID'd who already show the outcome of interest (cases) and compared to those who do not show the outcome (control)
-eg. Rel. of attachment patterns to suicidal bx among adolescents |
|
|
Term
|
Definition
**Observational Design
aka "Cohort analysis" or "Prospective Longitudinal Study"
Study same cohort over time - Strength of design lies in establishing the relations b/n antecedent events and outcomes
Cohort: Group of indiv having a statistical factor (usually age) in common
PROBLEM: May confound results of cross-sectional comparisons, as differences may be due to the specific cohort and not the DV you are interested in
Difference from Case-Control: Cohort designs follow a sample over time to ID factors leading to an outcome of interest, and the grp is assessed before the outcome has occurred
a. Single Cohort: One grp that meets particular criteria is selected and followed over time in order to study the emergence of the outcome
b. Multiple-Group Cohort: 2+ groups are ID'd and followed over time to examine outcomes of interest
c. Accelerated Cohort: The study of multiple grps - cohorts who vary in age are included; ea grp covers only a portion of the total time frame of interest, and the grps overlap in ways that allow the investigator to discuss the entire developmental period; falls b/n longitudinal and cross-sectional designs; Requires less time than if a single grp were studied |
|
|
Term
|
Definition
REQUIREMENTS:
1. Random Assignment
2. IVs must be Manipulated
a. Manipulated: Differences chosen by researcher
b. Static: RSer cannot assign to subjects (age, gender, sex orient)
3. Control Group: Grp that experiences all the same things as the experimental grp but does not receive tx (eg. Hx, Maturation, demographics, selection, testing, etc); Accounts for spontaneous remission cases which affect internal validity
STRENGTH: Internal validity (due to Random Ass't) - more certain about attributing cause to the IV -->Greatest control of IVs and Error
WEAKNESS: External Validity - may be inappropriate to generalize results beyond lab
|
|
|
Term
List the Subtypes of Experimental Design |
|
Definition
1. Pretest-Posttest Control Group Design
2. Posttest Only Control Group Design
3. Solomon 4-Group
4. Factorial Design |
|
|
Term
Pretest-Posttest Control Group Design |
|
Definition
- At least 2 levels of IV with 1 receiving Tx
- All grps are pretested and posttested; the pretest allows a check that grps started out equivalent, but may introduce pretest sensitization
RA  Pretest  Tx  Posttest
R   O        X   O
R   O            O |
|
Term
Posttest Only Control Group Design |
|
Definition
- At least 2 levels of IV with 1 receiving Tx
- Only a posttest is given, which avoids pretest sensitization; however, we cannot be sure that our groups started out equivalent
RA  Tx  Posttest
R   X   O
R       O |
|
|
Term
|
Definition
Group  RA  Observ  Tx  Posttest
1      R   O       X   O
2      R   O           O
3      R           X   O
4      R               O
- **G2 v. G4: Assesses the effect of pretest on internal validity (neither grp got Tx)
- G1 v. G3: Assesses effect of pretest on external validity (both grps got Tx)
|
|
|
Term
|
Definition
|
|
Term
|
Definition
a. MAIN EFFECTS of each IV
b. INTERACTION b/n 2 variables
RA  Tx       F/U
R   X(a1b1)  O
R   X(a1b2)  O
R   X(a2b1)  O
R   X(a2b2)  O
STRENGTHS:
1. It can assess the effects of separate variables
2. Different variables can be studied with fewer subjects
3. Provides unique info about the combined effects of IVs
4. Ixns provide important info such as whether there may be variables that moderate the effects of other variables
WEAKNESSES:
1. # of grps multiplies quickly as new factors or new levels are added
2. Optimally informative when an investigator predicts an interactive relationship among variables, but with more variables, interactions become complicated and difficult to interpret |
|
|
Term
Quasi-Experimental Design |
|
Definition
aka "Combo design"
REQUIREMENTS:
1. At least one IV manipulatable
2. One IV is static: True Random Assignment is NOT possible
3. Must meet req'ts for causal relationships:
a. Cause precede effect
b. Cause covaries with effect
c. Alternative explanations for causal rel. are implausible
|
|
|
Term
Subtype of Experimental designs: Within-Subjects |
|
Definition
Increase statistical power by controlling individual differences b/n units within conditions; Use fewer units to test the same # of Tx
--> DRAWBACKS:
1. Fatigue effects
2. Practice effects
3. Carryover effects
4. Order effects
SOLUTION: Counterbalancing: Some units get Txs in one order, and others get them in another order so that order effects can be assessed |
|
|
Term
|
Definition
aka "Sequence effects" "Multiple Tx interference" "Carryover Effects"
The influence of the order in which subjects receive Txs on their responses (within-subject design)
PROBLEM: Confound Tx effects
SOLUTION: Counterbalancing -eg. Survey: Order of Qs |
|
|
Term
|
Definition
Designs that try to balance the order of Txs across subjects
In a within-subjects factorial experiment, presenting conditions (Txs) in all possible orders to avoid order effects
eg. Latin Square |
|
|
Term
|
Definition
Partway through the experiment (usually midway), all subjects cross over (are switched) to another experimental condition
- Both groups get both control and experimental conditions
- Increases POWER BC each group serves as its own control
|
|
|
Term
Multiple Tx Counterbalanced |
|
Definition
Controls for carryover (order) effects that may result with a within-subjects design
Administer levels of an IV to different subjects or groups in a different order (Balances the order of Txs) |
|
|
Term
|
Definition
A method of allocating subjects, in a within-subjects experiment, to Tx group orders -
- Number of rows and columns MUST be equal
|
|
|
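The Latin square card above can be made concrete with a short sketch. This is an illustrative construction, not from the original card: it builds an n x n square by cyclic rotation, with hypothetical treatment labels, so each Tx appears exactly once in every row (one subject group's order) and once in every column (a position in the sequence).

```python
# Minimal sketch: build an n x n Latin square of treatment labels by
# cyclic rotation. Rows = subject-group orders; columns = positions.
# The labels "A".."D" are hypothetical.
def latin_square(treatments):
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]

for row in latin_square(["A", "B", "C", "D"]):
    print(row)
```

Note that this simple rotation balances the position of each Tx but not which Tx precedes which; fully counterbalanced ("balanced") Latin squares are used when first-order carryover effects are a concern.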
Term
|
Definition
Describes what happens when many subjects in a study have scores on a variable that are at or near the possible upper limit
|
|
|
Term
|
Definition
Describes a situation in which many subjects in a study measure at or near the possible lower limit
- Makes analysis difficult BC it reduces the amount of variation in the variable
|
|
|
Term
|
Definition
aka "Single case design"
Compare the effects of different Tx conditions on performance of one individual over time
REQUIREMENTS:
1. Baseline Assessment:
a. Observe Bx for a period of time prior to the intervention to predict the level of performance
b. Data must be stable (absence of a slope)
2. Continuous Ass't: Performance is observed on several occasions prior to the intervention, then continuously observed during the period of time the intervention is in effect
3. Examination of Trends/Slope: Tendency for performance to decrease or increase systematically or consistently over time |
|
|
Term
Which design relates to Clinical Practice? |
|
Definition
Single-subject Design/Within-subject design (can be an indiv., family, organization, community, group)
CONTRAST TO GROUP EXPERIMENTAL DESIGN: B/n subject designs (Participant is either in the Tx or control grp) |
|
|
Term
Subtypes of Single-subjects designs |
|
Definition
1. ABAB
2. Multiple Baseline
3. Changing Criterion |
|
|
Term
|
Definition
aka "Withdrawal" or "Reversal Design"
Single-Subject Design
- Alternate baseline measures of a variable with measures of that variable after a Tx
- NO Control grp; "A": Baseline (Control); "B": Tx (Intervention)
- ABAB used when unethical to withhold a Tx from a control group
- Extraneous EVENTS are much better controlled when there are several shifts b/n baseline and intervention phases
PROBLEMS:
1. Carry-over Effects
2. Order Effects
3. Irreversibility of Effects
4. Ethical Problems
5. Feasibility Problems |
|
|
Term
|
Definition
Demonstrates effects of an intervention by showing that bx change accompanies the intro of the intervention at different pts in time
|
|
|
Term
|
Definition
Demonstrates the effect of an intervention by showing that bx changes incrementally to match a performance criterion (expected goal/outcome)
A --> B --> Criterion Reached and Reward Administered --> Set New Goal, Return to A --> B --> Goal reached and Reward Given... |
|
|
Term
Similarities b/n Single-Subject and Traditional Group Designs |
|
Definition
Both are longitudinal and concerned with:
1. Issues of control
2. Specifying targets of intervention in operational terms
3. Developing measurement and recording plans for assessing these targets
4. Can use a combination of process and outcome measures, though single subject designs rarely employ process measures, which try to assess the "black box" of Tx or what kinds of interactions go on b/n clients and therapists during the course of an intervention |
|
|
Term
Differences b/n Single-Subject and Traditional Group Designs |
|
Definition
1. Single-Subject designs typically use more repeated measures
2. Duration of RS is more variable
3. Participants are more actively involved in setting the goals and targets of interventions
4. The choice of the design is typically established by the worker rather than the RSer
5. Designs are more flexible, responding to the needs of the particular case rather than fixed
6. Findings have more direct and immediate impact on interventions at the individual case level
7. Less costly than group |
|
|
Term
|
Definition
Designs conducted to get an overview or a review of all literature in a specific area through the evaluation and combination of results from multiple studies
- GOAL: Provide estimate of the Effect Size for the particular area of RS
|
|
|
Term
RS Ethics: Institutional Approval |
|
Definition
Psychologists provide accurate info about their RS proposals and obtain approval prior to conducting the RS
|
|
|
Term
RS Ethics: Informed Consent to RS |
|
Definition
Psychologists inform participants about:
1. The purpose of the RS, expected duration, and procedures
2. Their right to decline to participate and to withdraw from the RS once participation has begun
3. The foreseeable consequences of declining or withdrawing
4. Reasonably foreseeable factors that may be expected to influence their willingness to participate such as potential risks, discomfort, or adverse effects
5. Any prospective RS benefits
6. Limits of confidentiality
7. Incentives for participation
8. Whom to contact for questions about the RS and RS participants' rights |
|
|
Term
RS Ethics: RS involving the use of experimental Tx should be clarified to participants at the outset of RS |
|
Definition
1. The experimental nature of the Tx
2. Services that will or will not be available to the control group(s) if appropriate
3. Means by which assignment to Tx and control grps will be made
4. Available Tx alternatives if an indiv does not wish to participate in the RS or wishes to withdraw once a study has begun
5. Compensation for or monetary costs of participating including, if appropriate, whether reimbursement from the participant or a third-party payor will be sought |
|
|
Term
RS Ethics: Informed Consent for Recording Voices and Images in RS |
|
Definition
Psychologists obtain informed consent from RS participants prior to recording their voices or images for data collection unless:
1. The RS consists solely of naturalistic observations in public places, and it is not anticipated that the recording will be used in a manner that could cause personal identification or harm
2. RS design includes deception, and the consent for the use of the recording is obtained during debriefing |
|
|
Term
RS Ethics: Client/Patient, Student, and Subordinate RS Participants |
|
Definition
Must take steps to protect the prospective participants from adverse consequences of declining or withdrawing from participation
- When RS participation is a course requirement or opportunity for Extra Credit, the prospective participant is given the choice of equitable alternative activities
|
|
|
Term
RS Ethics: Dispensing with Informed Consent for RS |
|
Definition
Psychologists may dispense with informed consent only:
1. Where RS would not reasonably be assumed to create distress or harm and involves:
a. The study of normal educational practices, curricula, or classroom management methods conducted in educational settings
b. Only anonymous questionnaires, naturalistic observations, or archival RS for which disclosure of responses would not place participants at risk for criminal or civil liability or damage their financial standing, employability, or reputation, and confidentiality is protected
c. The study of factors related to job or organization effectiveness conducted in organizational settings for which there is no risk to participants' employability, and confidentiality is protected
2. Where otherwise permitted by law or federal or institutional regulations |
|
|
Term
RS Ethics: Offering Inducements for RS Participation |
|
Definition
Psychologists make reasonable efforts to avoid offering excessive or inappropriate financial or other inducements for RS participation when such inducements are likely to coerce participation
- When offering professional services as an inducement for RS participation, psychologists clarify the nature of the services, as well as the risks, obligations, and limitations
|
|
|
Term
|
Definition
1. Psychologists do NOT conduct a study involving deception unless they have determined that the use of deceptive techniques is justified by the study's significant prospective scientific, educational, or applied value and that effective nondeceptive alternative procedures are not feasible
2. Psychologists do NOT deceive prospective participants about RS that is reasonably expected to cause physical pain or severe emotional distress
3. Psychologists explain any deception that is an integral feature of the design and conduct of an experiment to participants as early as is feasible, preferably at the conclusion of the data collection, and permit participants to withdraw their data |
|
|
Term
|
Definition
Psychologists provide a prompt opportunity for participants to obtain appropriate info about the nature, results, and conclusions of the RS, and they take reasonable steps to correct any misconceptions that participants may have of which the psychologists are aware
- If scientific or human values justify delaying or withholding this info, psychologists take reasonable measures to reduce the risk of harm
- When psychologists become aware that RS procedures have harmed a participant, they take reasonable steps to minimize the harm
|
|
|
Term
RS Ethics: Human Care and Use of Animals in RS |
|
Definition
|
|
Term
RS Ethics: Reporting RS Results |
|
Definition
Psychologists do not fabricate data
- If psychologists discover significant errors in their published data, they take reasonable steps to correct such errors in a correction, retraction, erratum, or other appropriate publication means
|
|
|
Term
|
Definition
Psychologists do not present portions of another's work or data as their own, even if the other work or data source is cited occasionally |
|
|
Term
RS Ethics: Publication Credit |
|
Definition
|
|
Term
RS Ethics: Duplicate Publication of Data |
|
Definition
Psychologists do NOT publish, as original data, data that have previously been published
|
|
|
Term
RS Ethics: Sharing RS Data for Verification |
|
Definition
After RS results are published, psychologists do not withhold the data on which their conclusions are based from other competent professionals who seek to verify the substantive claims through reanalysis and who intend to use such data only for that purpose, provided that the confidentiality of the participants can be protected and unless legal rights concerning proprietary data preclude their release
- Does NOT preclude psychologists from requiring that such individuals or grps be responsible for costs associated with the provision of such info
- Psychologists who request data from other psychologists to verify the substantive claims through reanalysis may use shared data only for the declared purpose
- Requesting psychologists obtain prior written agreement for all other uses of the data
|
|
|
Term
|
Definition
Psychologists who review material submitted for presentation, publication, grant, or RS proposal review respect the confidentiality of and the proprietary rights in such information of those who submitted it |
|
|
Term
Descriptive Statistics (Defn) |
|
Definition
Used to organize and describe the characteristics of a collection of data
1. Measures of Central Tendency (Mean, Median, Mode)
2. Measures of dispersion or variability (Range, Standard Deviation, Variance, Confidence Interval) |
|
|
Term
Measures of Central Tendency |
|
Definition
Grps of data can be summarized using averages
- Mean, Median, Mode (each provides a different type of info about a distribution of scores and is simple to compute and interpret)
|
|
|
Term
|
Definition
-aka "Typical" "average" "Most Central Score"
|
|
|
Term
|
Definition
Midpoint in a set of scores: 50% fall above, 50% fall below
- Measure of Central Tendency
|
|
|
Term
|
Definition
Most frequently occurring score
- Measure of Central Tendency
- List values in a distribution, tally # of times that values occurs
|
|
|
Term
|
Definition
- Difference b/n the minimum and maximum score
- Measure of dispersion/variability
- Purpose: get a general estimate of how wide or different scores are from one another --> how much spread in a distribution
- Most general measure of variablility: Subtract lowest score from highest
- Should NOT be used to reach any conclusions about how individual scores differ from one another
|
|
|
Term
|
Definition
Deviation from the standard: Average amt of variability in a set of scores / Distance from the mean
SD = sqrt( (Sum of (x - Mean) squared) / (n - 1) )
Square Root of variance
- Larger SD = more spread across distribution/more different from one another & mean of distribution
- Sensitive to extreme scores
|
|
|
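The SD formula on the card can be worked through in a few lines. This is an illustrative sketch with hypothetical scores; it follows the card's formula exactly (squared deviations from the mean, divided by n - 1, then the square root).

```python
import math

# Worked sketch of the SD formula, using hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n  # 5.0

# Sum of squared deviations from the mean, divided by n - 1 (sample variance)
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
sd = math.sqrt(variance)  # about 2.14

print(round(variance, 3), round(sd, 3))
```

Because SD is the square root of the variance, it comes back in the original score units, which is why it is usually easier to interpret than the variance itself.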
Term
|
Definition
The amount which scores vary or are different from each other & from the group mean
|
|
|
Term
How are SD and Variance the same? |
|
Definition
1. Both are measures of variability, dispersion, or spread
2. Formulas are similar (the difference is the square root: SD = sqrt of Variance, i.e., SD squared = Variance) |
|
|
Term
How are SD and Variance different? |
|
Definition
SD: (sqrt of average summed squared deviation) is stated in original units from which it was derived
Variance: stated in units squared (sqrt is never taken) |
|
|
Term
|
Definition
An estimated range of values which is likely to include an unknown population parameter
- Width gives idea about how uncertain we are about the unknown parameter --> The smaller the range, the better the estimate
|
|
|
Term
|
Definition
A desired percentage of the scores (often 95% or 99%) that would fall within a certain range of confidence limits
|
|
|
Term
Which is more informative: Confidence Interval or results from a Hypothesis Test? |
|
Definition
CI is more informative than the simple result of a hypothesis test BC it provides a range of plausible values for the unknown parameter |
|
|
Term
|
Definition
Lower and upper boundaries / values of a confidence interval and define the range of a confidence interval
Upper and lower bounds of a 95% CI are 95% confidence limits |
|
|
Term
Confidence Interval for a Mean |
|
Definition
Specifies a range of values within which the unknown population parameter (the mean) may lie
The (2-sided) CI for a mean contains all the values of μ0 (the hypothesized population mean) which would not be rejected in the 2-sided hypothesis test of H0: μ = μ0 against H1: μ not equal to μ0
Width of the CI gives us some idea about how uncertain we are about the population parameter (in this case, the mean) |
|
|
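The CI-for-a-mean card can be illustrated with the usual formula, CI = mean +/- t * (SD / sqrt(n)). The numbers below are hypothetical summary statistics, and 2.064 is the two-sided 95% t critical value for df = 24 as read from a standard t table.

```python
import math

# Sketch of a 95% CI for a mean. Summary statistics are hypothetical;
# t_crit = 2.064 is the two-sided 95% t value for df = 24 (t table).
n = 25
x_bar = 100.0
s = 15.0
t_crit = 2.064

se = s / math.sqrt(n)                 # standard error of the mean = 3.0
margin = t_crit * se                  # margin of error
ci = (x_bar - margin, x_bar + margin)

print(round(ci[0], 3), round(ci[1], 3))  # 93.808 106.192
```

A narrower interval (larger n or smaller SD) means a more precise estimate of the population mean, which is the "width gives an idea about uncertainty" point from the earlier CI card.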
Term
Confidence Interval for the Difference b/n 2 means |
|
Definition
Specifies a range of values w/in which the difference b/n the means of the two populations may lie
The confidence interval for the difference between two means contains all the values of μ1 - μ2 (the difference between the two population means) which would not be rejected in the two-sided hypothesis test of:
H0: μ1 = μ2 against H1: μ1 not equal to μ2
i.e. H0: μ1 - μ2 = 0 against H1: μ1 - μ2 not equal to 0
- Compare: one-sample t-test |
|
|
Term
What does it mean if the Confidence Interval includes 0? |
|
Definition
There is no significant difference b/n the means of the 2 populations at a given level of confidence
|
|
|
Term
|
Definition
|
|
Term
Standard Deviation squared |
|
Definition
Variance |
|
Term
|
Definition
Parameter statistic (used when generalizing about the population) |
|
|
Term
|
Definition
Mean of the whole population |
|
|
Term
x (with a bar on top) = ? |
|
Definition
Mean of the sample |
|
Term
Describe SD's special relationship to the normal curve |
|
Definition
68% of the curve includes +/- 1 standard deviation
95% of the curve includes +/- 2 standard deviations
99.7% of the curve includes +/- 3 standard deviations |
|
|
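The 68-95-99.7 rule can be verified directly from the standard normal CDF, which the Python stdlib exposes through math.erf. This sketch is an illustration only; the exact coverages are 68.3%, 95.4%, and 99.7%, so the card's figures are the conventional rounded rule.

```python
import math

# Check the 68-95-99.7 rule using the standard normal CDF:
# Phi(z) = (1 + erf(z / sqrt(2))) / 2
def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

for k in (1, 2, 3):
    coverage = norm_cdf(k) - norm_cdf(-k)  # area within +/- k SDs
    print(k, round(coverage * 100, 1))
# 1 -> 68.3, 2 -> 95.4, 3 -> 99.7
```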
Term
Why do you subtract 1 from "n" when computing the standard deviation? |
|
Definition
Using n - 1 (the degrees of freedom) corrects the bias of the sample estimate, giving a value that generalizes better to the population
-eg. df = n - 1 |
|
|
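The n - 1 correction can be demonstrated by simulation. This sketch uses a hypothetical population (normal, SD 10, so true variance 100) and shows that dividing the summed squared deviations by n systematically underestimates the population variance, while dividing by n - 1 does not.

```python
import random

# Simulation sketch of why n - 1 is used in the SD/variance formula.
# Population is hypothetical: normal with SD 10 (true variance = 100).
random.seed(0)

def average_variance_estimates(n, trials, pop_sd=10.0):
    biased = unbiased = 0.0
    for _ in range(trials):
        xs = [random.gauss(0.0, pop_sd) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)  # summed squared deviations
        biased += ss / n                    # divide by n
        unbiased += ss / (n - 1)            # divide by n - 1
    return biased / trials, unbiased / trials

biased, unbiased = average_variance_estimates(n=5, trials=20000)
print(round(biased), round(unbiased))  # roughly 80 vs 100
```

With samples of size 5, the /n estimate averages about 80 (biased low by the factor (n-1)/n), while the /(n-1) estimate averages near the true 100.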
Term
How does the bell curve relate to the SD? |
|
Definition
If you get a z score of +1, then you are 1 SD above the mean
-eg. IQ: mean = 100, SD = 15 --> a z score of +1 equals a score of 115
Large variance means ppl's scores are difficult to predict |
|
|
Term
These provide you with the best score for:
1. Describing a group of data
2. A measure of how diverse/different scores are from one another |
|
Definition
Descriptive Statistics
1. Central Tendency
2. Variability |
|
|
Term
Skewness is about the bx of the ___. |
|
Definition
|
|
Term
|
Definition
Measure of the lack of symmetry of a distribution
Positively Skewed: Points in the (+) direction --> mean > median > mode (mean pulled in direction of extreme scores)
Negatively Skewed: Points in the (-) direction --> Mean < Median < Mode |
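The mean > median > mode ordering for a positive skew can be verified on a small made-up sample (one extreme high score pulls the mean up):

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: the 20 drags the mean upward
positively_skewed = [1, 2, 2, 3, 4, 20]
assert mean(positively_skewed) > median(positively_skewed) > mode(positively_skewed)
```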
|
|
Term
Kurtosis: What is it? Name 4 types. |
|
Definition
How flat or peaked a distribution appears
1. LEPTOKURTIC: Data concentrated in the middle rather than the tails of the distribution --> Majority of scores in the middle; peaked shape
2. PLATYKURTIC: Looks flat BC scores are evenly spread out through the middle and tails; Probability of responding the same; Uniform distribution
3. BIMODAL: Mean and median fall at the same point; 2 Modes correspond to the two highest points in the distribution
4. MESOKURTIC (NORMAL): Mean = Median = Mode
Represented by a bell-shaped curve with most of the scores gathering in the middle and a few extreme scores pulling the tails out a bit |
|
|
Term
|
Definition
1. Nominal Variable
-Nominal Scale (level of meas't)
2. Discrete Variable
3. Ordinal variable
-Ordinal scale (level of meas't)
4. Interval scale (level of meas't)
5. Ratio scale (level of meas't) |
|
|
Term
Nominal Variable/Categorical Variable |
|
Definition
aka categorical/discrete/qualitative variable
NOMINAL SCALE: Numbers stand for names, but have no order value
-e.g. coding female = 1 and male = 2 would be a nominal scale |
|
|
Term
|
Definition
A way of measuring that ranks subjects (puts them in order) on some variable
- Differences b/n the ranks need not be equal (as they are in an interval scale)
-eg. Team standings; scores on an attitude scale, shirt sizes of small, medium, lrg, xtra lrg |
|
|
Term
|
Definition
Describes variables in such a way that the distance b/n any 2 adjacent units of meas't (or "intervals") is the same
- Scores can be added and subtracted, but NOT multiplied or divided
-eg. Fahrenheit Temperature scale: NO true 0 point, Equal distance b/n each # |
|
|
Term
Ratio Scale (or level of meas't) |
|
Definition
Any 2 adjoining values are the same distance apart AND there is a true 0 point, so scores can be multiplied and divided
-eg. Height: same distance b/n 70 & 71" and 20 & 21"; 70" is twice the size of 35"
- Such ratio statements canNOT be made about measures on an ordinal scale (eg. you canNOT say the 4th tallest person is twice as tall as the 2nd tallest person)
|
|
|
Term
|
Definition
- Measure of relative location in a distribution
- Most commonly used standard score
- In SD units: Gives the distance from the mean of a particular score
Mean = 0
SD = 1
eg. z-score of 1.25 = one and a quarter SD above the mean
- Useful for measuring performance (eg. on tests)
TO FIND Z: Take your score, subtract the mean of all the scores, and divide the result by the SD.
z = (x - μ)/σ or z = (x - M)/SD, where x is your score, M is the mean, and SD is the standard deviation |
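The formula above translates directly; the IQ numbers (mean 100, SD 15) come from the earlier card:

```python
def z_score(x, m, sd):
    """z = (x - M) / SD: distance from the mean in SD units."""
    return (x - m) / sd

# IQ score of 115 with mean 100, SD 15 -> z = +1 (1 SD above the mean)
assert z_score(115, 100, 15) == 1.0
```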
|
|
Term
|
Definition
It eliminates decimals and negative numbers
|
|
|
Term
Non-parametric Statistics |
|
Definition
aka "Distribution free stats"
Stat techniques for data that is NOT normally distributed
|
|
|
Term
|
Definition
Stat techniques for data that approximate a normal distribution
- Measurable with interval or ratio scales
|
|
|
Term
|
Definition
-"H"
- Nonparametric, one-way ANOVA for rank order data
- Based on MEDIANS, not means
- Nonparametric test of significance used when testing >2 Independent samples
- Extension of Mann-Whitney U test & of the Wilcoxon test to 3+ Independent samples
|
|
|
Term
|
Definition
Nonparametric test of statistical significance for use with ordinal data from correlated groups
- Nonparametric version of a one-way, repeated measures ANOVA
- Similar to Wilcoxon test, but can be used with more than 2 groups
- Extension of the sign test
|
|
|
Term
|
Definition
Test of the statistical significance for rank order data of differences between 2 independent grps
- Nonparametric equivalent of the t-test
- Assess whether or not the ranks of observations in one grp is the same as the ranks of observations in another group
- Similar to the Wilcoxon test
|
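A minimal sketch of the Mann-Whitney U statistic for two independent groups of rank-order data. This assumes no tied scores (ties require averaged ranks, which this sketch omits):

```python
def mann_whitney_u(group1, group2):
    """U statistic for 2 independent groups; assumes no tied scores."""
    ranked = sorted(group1 + group2)
    rank = {v: i + 1 for i, v in enumerate(ranked)}   # rank 1 = smallest score
    r1 = sum(rank[v] for v in group1)                 # rank sum of group 1
    n1, n2 = len(group1), len(group2)
    u1 = r1 - n1 * (n1 + 1) / 2
    return min(u1, n1 * n2 - u1)                      # report the smaller U
```

Complete separation of the groups gives U = 0 (maximal group difference); fully interleaved scores give a large U.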
|
|
Term
|
Definition
aka Wilcoxon "signed-rank" test for ordinal data (the Wilcoxon "rank-sum" test is the version for 2 independent samples)
Nonparametric test of statistical significance for use with 2 correlated samples and the data are rank ordered, such as the same subjects on a before-and-after measure |
|
|
Term
Kolmogorov Test
Kolmogorov-Smirnov Test |
|
Definition
- Nonparametric tests (for ordinal data) with 1 grp
- Assess probability that the distribution of sample observations is likely, given hypothesized sample distribution
K-S
- Used with ordinal data for studies that involve 2 groups
- Assesses the probability that the distribution of ordered observations of one group is the same as the other group
|
|
|
Term
Review Chart for Tests of Differences b/n grps (Indep samples) |
|
Definition
#Grps | Data | Parametric Test | Nonparametric Equivalent
2 | Rank | t-test (indep samples) | Mann-Whitney U
2 | Rank | t-test (indep samples) | Kolmogorov-Smirnov
>2 | Rank | ANOVA/MANOVA | Kruskal-Wallis |
|
|
Term
Tests of Differences b/n variables (dependent samples) |
|
Definition
#Grps | Data | Parametric Test | Nonparametric Equivalent
2 | Rank | t-test (dep samples) | Wilcoxon Matched Pairs
2 | Dichotomous (categorical) | " | Chi-Square
>2 | Rank | Repeated-Measures ANOVA | Friedman |
|
|
Term
Tests of relationships b/n variables |
|
Definition
#Grps | Data | Parametric Test | Nonparametric Equivalent
2 | Rank | Correlation | Spearman
2 | Dichotomous (categorical) | Chi-Square | Chi-Square |
|
|
Term
Univariate Parametric Studies |
|
Definition
Univariate Analysis: Studying the distribution of cases of one variable only (eg. studying the ages of welfare recipients, but not relating that variable to their sex, ethnicity, etc.)
1. Sampling Distributions:
2. Assumptions and their violations:
a. Indep of Observations
b. Normality of distribution
c. Homogeneity of Variance |
|
|
Term
Univariate Parametric Statistics: Sampling Distributions |
|
Definition
**Univariate Parametric Statistics
- Exist for each particular statistic: Mean, variance, correlation coefficient, F test, etc
- Each has a standard error
- Created with Monte Carlo study:
1. A lrg # of equal sized random samples are drawn from a population you wish to represent
2. Stat is computed for each sample
3. Stats are arranged on a freq distribution --> Done repeatedly, this gives you the sampling distribution of the statistic |
|
|
Term
Univariate Parametric Statistics: Normal Curve |
|
Definition
1. Bell-shaped theoretical distribution
2. Mean, median, mode are the same
3. Z-scores are used as the standard deviation unit
4. Tails of the curve are asymptotic (they approach but never touch the x axis)
5. 68%, 95%, 99% |
|
|
Term
Normality of Distribution |
|
Definition
- ANOVA assumes the population is normally distributed
- For a more conservative test, use a nonparametric test (Kruskal-Wallis, Mann-Whitney U, Wilcoxon rank-sum test) instead of ANOVA
|
|
|
Term
|
Definition
Variance w/in each condition/grp is similar to the other grps
- Within grp and b/n grp variances are similar
a. Levene: Assess the diff of scores from each of their MEANS
b. Brown Forsythe: Assess diff of scores from each of their MEDIANS
c. F Max: Biggest variance/smallest variance (if biggest is 4-10x greater, then you have a violation) -look up pg46 for ex |
|
|
Term
What tests are used to test the assumption, Homogeneity of Variance? |
|
Definition
1. Levene: Assesses differences of scores from each of their MEANS
2. Brown-Forsythe: Assesses differences of scores from each of their MEDIANS
3. F Max (if biggest variance is 4-10x greater = Major violation) |
|
|
Term
How can you correct for violations of homogeneity of variance? |
|
Definition
By "transforming the data" - using the sqrt transformation |
|
|
Term
____ is relatively insensitive to the presence of variance heterogeneity, except when unequal sample sizes are involved. |
|
Definition
ANOVA (the F test) |
|
Term
Univariate Parametric Statistics: T-test |
|
Definition
Used to test for significance b/n the means of 2 grps
-1 tailed: Directional BC tests the hypoth that one of the 2 grp averages is bigger
-2 tailed: Tests significance of a "nondirectional" hypothesis (a hypoth that says there is a difference b/n 2 averages w/out saying which of the 2 is bigger)
**If RSer is uncertain which is larger, the 2-tailed test should be used
t² = F; t = sqrt of F
If t is significant, F is significant
- Also used as a test statistic for correlation and regression coefficients
|
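A sketch of the pooled-variance independent-samples t, assuming the equal-variance assumption holds (stdlib only; the sample data are hypothetical):

```python
import math
from statistics import mean, variance

def t_independent(a, b):
    """Pooled-variance t for 2 independent groups."""
    na, nb = len(a), len(b)
    # pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
```

With 2 groups, squaring this t reproduces the one-way ANOVA F for the same data (t² = F, as the card states).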
|
|
Term
What are the formulas for t in a t-test? |
|
Definition
- The one you use depends on the nature of the data and the grp being studied, usually whether the grps are indep or correlated
a. One IV
b. One DV
c. Stats: t (t2 = F)
1. Single sample
2. Independent Sample
3. Paired (related)
**Assumption: Amt of variability in each grp is equal --> May be violated if sample size is big enough
- Small samples & violation of assumption may = Ambiguous results and conclusions
|
|
|
Term
|
Definition
**Single sample t-test: Used when you have one grp to compare to normative data |
|
|
Term
Used when you have one grp to compare to normative data |
|
Definition
Single sample t-test |
|
Term
Used when you compare groups (same as 1-way ANOVA) |
|
Definition
Independent Samples t-test |
|
|
Term
Used when your only IV is a repeated measure (subjects serve in every condition); same as 1-way ANOVA with 1 RM |
|
Definition
Paired (related) samples t-test |
|
Term
WHAT KIND OF TEST IS THIS AN EXAMPLE OF?
Researchers are interested in comparing the number of antibodies following an influenza shot in the corporate world. Corporate employees were randomized to receive Mindfulness Based Stress Reduction therapy or Music therapy. They were all given a flu shot and antibody levels were tested 3 months later.
300 participants are given a questionnaire to assess their level of emotional impact from 9-11. The scores for firemen were compared with the scores from policemen participants. |
|
Definition
Independent Samples T-Test Example |
|
|
Term
WHAT IS THIS AN EXAMPLE OF?
Students are compared before they finish their statistics project and after finish their project on depression by the BDI (Beck Depression Inventory).
You are investigating the sleep efficiency of women diagnosed with nonmetastatic breast cancer. You assign a sleep efficiency score before and after a sleep hygiene class of 4 weeks. |
|
Definition
Paired (related) samples t-test |
|
|
Term
What does it mean when something is robust? |
|
Definition
How sensitive a test is to violation
Very Robust = NOT sensitive = NOT biasing the data |
|
|
Term
What are 3 assumptions you make when running an ANOVA. |
|
Definition
1. Normality of distribution
2. Independence of sampling
3. Homogeneity of variance |
|
|
Term
What is "normality of distribution"? |
|
Definition
ANOVA is NOT very sensitive to violations of normality -ANOVA is robust to non-normality
- Test for a violation by eyeball - plot with histograms:
If UNIVARIATE NORMALITY: Use Z scores & charts
If BIVARIATE NORMALITY: Check through scatter plot matrix - look for elliptical shapes
If MULTIVARIATE NORMALITY: Assess through Mahalanobis distance - check for outliers
To correct a violation:
a. Transform the data
b. Use a nonparametric test |
|
|
Term
What is "Independence of sampling"? |
|
Definition
**Assumption of a t-test
- ANOVA is VERY sensitive to violations of independent sampling
- A violation of independence:
1. Design of experiment
2. Lack of random sampling
- Can NOT fix through POST-HOC
- Does NOT improve with increasing your sample size
|
|
|
Term
What is "Homogeneity of Variance"? |
|
Definition
**Assumption of ANOVA, ANCOVA
- It IS sensitive to violations of homogeneity
- You CAN fix a violation through transformations
- Test of the statistical significance of the differences among the mean scores of 2+ grps on 1+ variables or factors
- Extension of a t-test (which can handle only 2 grps at a time) to a larger # of grps
- Used for assessing the statistical significance of the rel b/n a categorical IV and a continuous dependent variable
- PROCEDURE (in ANOVA): Compute a ratio (F ratio) of the variance b/n the grps (explained variance) to the variance w/in the grps (error variance)
|
|
|
Term
ANOVA Summary Table: What do the following mean? 1. Source 2. B/n Grps 3. W/in Grps 4. SS 5. Df 6. MS 7. F |
|
Definition
1. Source of the variance
2. Explained Variance
3. Error Variance/Unexplained Variance
4. Sum of squares (Total of squared deviation scores)
5. Degrees of Freedom
6. Mean Squares: Calculated as SS/df
7. F: Ratio of MS b/n to the MS w/in |
|
|
Term
Univariate Parametric Statistics: ANOVA |
|
Definition
- Logic: Assesses the differences b/n grp means; Involves partitioning the total variance into components (e.g. within grp and b/n grp)
- Partitioning the Variance: Dividing up the different sources of variance into sums of squares
- Sums of Squares: A mz of variability around the mean of the distribution (SST = SSbg +SSwg)
- Sum of Squares converted into Mean Squares:
MST = SST/df
MSB = SSB/df
MSW = SSW/df
- Mean Square BG:
-An estimate of the variance b/n Tx grps
-Reflects the variability due to differences b/n grp means (around the grand mean)
-Represents the variability due to error + the effects of the IV
- Mean Square WG (Error Term):
-An estimate of the pooled variance within Tx grps
-Reflects the variability among subjects that are treated alike
-Only reflects variability attributed to error
- Omnibus F = Overall F
- F = MSB/MSW
- F = (Error + Tx)/Error
|
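The partitioning above (SST = SSbg + SSwg, MS = SS/df, F = MSB/MSW) can be sketched as a one-way ANOVA in a few lines; the groups passed in are hypothetical:

```python
from statistics import mean

def one_way_f(groups):
    """Omnibus F = MSB / MSW for a one-way ANOVA."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)                                        # grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)                       # a(n-1) for equal n
    return (ss_between / df_between) / (ss_within / df_within)
```

Identical group means give F = 0 (no treatment effect beyond error); larger between-group separation inflates the numerator and F.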
|
|
Term
F Ratio - Why is it important? |
|
Definition
ANOVA formula (ratio) compares the amt of variability b/n the grps to the amt of variability w/in the grps (which is due to chance)
- If Ratio = 1, the amt of variability due to w/in grp diff = amt of variability due to b/n grp differences
--> No sig of diff b/n the grps
- As average diff b/n grps gets larger (Numerator increases in value), F value increases
--> As F value increases, it becomes more extreme in relation to the distribution of all F values and is more likely due to something other than chance |
|
|
Term
What is the relationship b/n the t value and the F value? |
|
Definition
F = t2
(F value for 2 grps = t value for 2 grps squared)
- t values (always used for the test b/n the difference of the means for 2 grps) and F values (which can also handle more than 2 grps) are directly related
|
|
|
Term
What are the assumptions of ANOVA?
What tests can you use to test for HOV?
What tests do you use to correct for this violation? |
|
Definition
1. Independence of scores:
-Each observation is NOT related to the other
-Achieved through RA
-NO way to correct for violation of this
2. Normal Distribution (normality):
-Population is normally distributed
3. Homogeneity of Variance:
-Inflates Type I Error
4. F max
-Test for HOV
-Largest variance/Smallest variance
-Want <9
5. Tests for HOV:
-Brown-Forsythe
-Bartlett test
-Hartley
-Cochran
6. Correction for this violation:
-Welch W test
-Brown-Forsyth F* test |
|
|
Term
|
Definition
2+ IVs (2-way, 3-way, etc.)
- SSB = SS(A) + SS(B) + SS(AxB)
-SS(A) = Sum of squares for Factor A
-SS(B) = Sum of squares for Factor B
-SS(AxB) = Sum of squares for the AxB interaction
- 3 x 2: 3 levels of one grouping factor, 2 levels of another
--> 6 different possibilities: X Axis = IV1; Separate Lines = IV2; Y Axis = DV |
|
|
Term
|
Definition
An interaction is present when:
1. Simple Effects of one IV are NOT the same at all levels of the second IV
2. When one of the IVs does not have a constant effect at all levels of the other IV
|
|
|
Term
When do you use an ANOVA? |
|
Definition
When you are looking for difference b/n 2+ grps
- ONLY data you can use: 1. Interval 2. Ratio
- A within grp variable = Repeated Measure
|
|
|
Term
Nomenclature for ANOVAs: Formula for determining how many ways it is |
|
Definition
(# of IVs)-Way ANOVA with (# of within or b/n grp IVs) Repeated Measures |
|
|
Term
What does repeated measures mean for Factorial ANOVA? |
|
Definition
Subjects participate in EVERY condition
-eg. Effects of yoga on sleep using pretest, posttest, F/U:
IV#1 = Tx (yoga v. psychoeducation)
IV#2 = Time (pretx v. posttx v. F/U) - RM
DV = sleep inventory score
**2 WAY ANOVA w/1 RM |
|
|
Term
What does an ANOVA assess for? |
|
Definition
Group differences by comparing the means of each group
Involves spreading out the variance into different sources |
|
|
Term
What is the Sum of Squares? |
|
Definition
Measures variability
- Sum of the squared deviation of scores from the mean
-SS1: Sum of squared deviations of grp means from the grand mean for IV1
-SS2: Sum of squared deviations of grp means from the grand mean for IV2
-SS(1x2): Sum of Squares for the intx b/n IV1 & IV2
-Result of adding together the squares of deviation scores
To Calculate:
1. Subtract the average of the scores from each score
2. Square each answer
3. Add up the answers
-B/N GRPS SS: Sum of squared deviations of GROUP MEANS from the GRAND MEAN
-WITHIN GRPS SS: Sum of squared deviations of INDIVIDUAL SCORES from the GROUP MEAN
-TOTAL SS: SS b/n + SS w/in |
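The three calculation steps (subtract the mean, square, add up) are a one-liner in Python:

```python
from statistics import mean

def sum_of_squares(scores):
    """SS: subtract the mean from each score, square each deviation, add them up."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

# e.g. [2, 4, 6]: deviations -2, 0, 2 -> squares 4, 0, 4 -> SS = 8
```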
|
|
Term
What is the B/n Sum of Squares? |
|
Definition
A measure of b/n grp differences:
Used to compare within-grp differences to compute the F ratio in an ANOVA
To Calculate:
1. Square the deviation scores (each grp mean minus the grand mean)
2. Add them up (weighted by grp size)
|
|
|
Term
Define:
1. Factorial Design
2. Simple Effects of IV
3. Interaction of IV
4. Main effect of IV |
|
Definition
1. Factorial Design: Consists of a set of single-factor designs in which the same IV is manipulated but in combination with a second IV
2. Simple Effects of IV: Difference associated with the single-factor experiment involving factor A at level b1
3. Interaction of IV: Present when we find that the simple effects associated with one IV are not the same at all levels of the other IV
4. Main Effects: Overall or average effects of the variable, obtained by combining the entire set of component experiments involving that factor |
|
|
Term
How do you calculate the df for an ANOVA? |
|
Definition
a = grps; n = subjects
B/n Grps: a-1
W/in Grps: a(n-1) |
|
|
Term
How do you calculate the MS for an ANOVA? |
|
Definition
"purer" form of SS
B/n Grps: SS/Df ----> "Tx Effect"
W/in Grps: SS/Df ---> "Error Term" |
|
|
Term
|
Definition
Statistic that tells you if the groups are different
(MS b/n) / (MS w/in): "Tx effect + error/error"
- If F = 1, Null hypothesis is TRUE (error/error)
-If F > 1, Null hypothesis is FALSE (Tx effect + error/error) |
|
|
Term
How do you calculate Power? When do you have good power? |
|
Definition
(n(magnitude of effect)/within grp variance)
1. Large sample size
2. Large effect size
3. Low within grp variance --> Usually want .80 power |
|
|
Term
What do you check if the Omnibus F is significant? |
|
Definition
The Magnitude of Effect: To see how big the differences (Tx Effect) are
- Magnitude of effect = Effect Size
- Calculated using R2 or omega2
-Decimal format (example = .60 means that 60% of the variance in the DV is accounted for by the IV) |
|
|
Term
Give 3 examples of Planned Comparisons for ANOVA |
|
Definition
1. Simple (pair-wise) comparison:
-Comparison b/n 2 grps
2. Complex comparison:
-Comparisons b/n an average of 2+ grps compared to a single grp
3. Orthogonal Comparisons:
-Reflect independent or nonoverlapping pieces of info
-The outcome of one comparison gives no indication about the outcome of another orthogonal comparison
-Coefficient: Weights for the means
-ψ (psi): Difference b/n 2 means
|
|
|
Term
If there is no significant interaction in an ANOVA, what do you do? |
|
Definition
1. Check MAIN EFFECTS: To see if there are differences across levels of one IV
-If no ME, stop analyzing and check power (Power should be around .80)
2. If there is a SIGNIFICANT INTERACTION, look at SIMPLE EFFECTS: How one level of an IV varies across every level of another
3. Test SE within AxB interaction -
-> Compare As (A @ B1, A @ B2) & Bs (B @ A1, B @ A2)
4. If YES SE: Test for SIMPLE COMPARISONS (only when levels of a variable are >2): A1 v. A2 @ B2, A1 v. A3 @ B1, etc. (Comparisons made within one variable) |
|
|
Term
When do you run a planned comparison? |
|
Definition
ONLY if the Omnibus F is significant!
- BC: Know that the grps are significant, but don't know where the differences are --> Test for differences b/n levels of your IVs
|
|
|
Term
What is familywise or Experimentwise Error? |
|
Definition
"Family" means grp or set of "related" statistical tests: The probability that a Type I error has been committed in RS involving multiple comparisons
**If alpha = .05 & you make 3 comparisons of the same data, FWE ≈ .15; Could lower alpha to .01, but this increases the probability of Type II Error
--> Alternative is to use:
a. Scheffe Test: Adjusts alpha level for all possible comparisons
b. Bonferroni: Strict, Stringent, conservative
c. Sidak-Bonferroni: Not as stringent, more TI, less TII compared with Bonferroni |
|
|
Term
If you found that there is an overall difference among the means in an ANOVA, what do you do? |
|
Definition
-Important to control for TI Error for each comparison - (# grps -1): Shows you pairwise differences
-The more comparisons you run, the greater the chance for TI Error
--> FWise/ExperWise Error is HIGH
a. Scheffe:
-Adjusts alpha level for all possible comparisons
-Most stringent
b. Tukey:
-Test differences b/n all possible pairs of means
c. Fisher Hayter: Same as Tukey but uses (a-1)
d. Dunnet:
-Pairwise comparisons using a single group
-eg. Only Tx v. control
e. LSD:
-Least Stringent
**Degree of conservativeness of the correction BC of the likelihood of making an error:
Type I Error <----------> Type II Error
Scheffe > Tukey > FisherHayter > Dunnett > LSD |
|
|
Term
What is the general design of a RM ANOVA?
What are the advantages of a RM ANOVA?
The disadvantages?
The assumptions? |
|
Definition
- DESIGN: Doesn't matter how many IVs --> # of w/in grp IVs matters (do subjects serve in every condition?)
-Time = Most common w/in grp IV
ADVANTAGES:
1. Can use a smaller sample size BC have more control over subjects' variability
2. Comparing ea person's score against their previous score, not someone else's
DISADVANTAGES:
1. Practice Effects: Subjects show improvement over time or become bored/fatigued
2. Carryover Effects: Performance on one measure impacts the next (Counterbalancing takes care of this problem)
ASSUMPTIONS:
1. Same as ANOVA
2. Sphericity: Everyone stays in their relative rank (if you were the most anx on the measure, you will be the most anx in every other condition as everyone fluctuates)
-->Greenhouse-Geisser tests for this assumption; If violated, look at Huynh-Feldt values BC they correct slightly for violations |
|
|
Term
|
Definition
- Extension of ANOVA that provides a way of statistically controlling the linear effects of variables (covariates, control variables) one does not want to examine in a study
- Reduces experimental error by statistical means
- Subjects 1st measured on the covariate, then randomly assigned to grps without regard for their scores on the covariate
- Covariate should be correlated with the DV, but not with any of the IVs
- Scores on the covariate are used to:
1. Adjust estimates of experimental error
2. Adjust Tx Effect for any differences b/n the Tx grps that existed prior to the experimental Tx
3. Uses Linear Regression to remove covariates from the list of possible explanations of variance on the dependent variable, rather than direct experimental methods to control extraneous variables
4. Used when pretest scores are covariates in pre/posttest exper designs
5. Used in nonexper RS - Surveys, nonrandom samples, quasi-exper designs when RA is not possible |
|
|
Term
What are the assumptions of ANCOVA? |
|
Definition
1. Independence of Observation
2. Normality: Examine scatterplot matrices (bi-variate)
3. Homogeneity of Variance: Levene's & Brown-Forsythe
4. Linearity: visual examination of scatter plots
5. Homogeneity of Regression: Slope of the regression line (beta) is assumed to be the same for each group, condition, cell |
|
|
Term
|
Definition
Statistic (Cohen's d, D, delta) indicating the difference b/n grps, Tx, or conditions
- Magnitude of the difference b/n 2+ conditions expressed in standard deviation units: Association of strength or relation (Pearson'r r, eta)
- To Calculate: ES = (m1-m2)/s
1. Take the difference b/n the control and experimental grps' means
2. Divide that difference by the standard deviation of the control grp's scores - or by the pooled standard deviation of both grps combined |
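The ES = (m1 - m2)/s formula as code; the means and SD below are hypothetical (IQ-style numbers for illustration):

```python
def cohens_d(m1, m2, sd):
    """Effect size ES = (m1 - m2) / s, expressed in standard deviation units."""
    return (m1 - m2) / sd

# Hypothetical: Tx grp mean 110, control mean 100, pooled SD 15 -> d ~ 0.67
d = cohens_d(110, 100, 15)
```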
|
|
Term
|
Definition
- MOE is the size of the tx effect
-The proportion of the total variability in the experiment associated with the experimental tx
-Not affected by sample size: Small effect = .01, Large = .15
- R2 = SSA/SST
-R2 will always be larger than omega2 BC it does not take error into account |
|
|
Term
|
Definition
- Mz of the relative rank ordering of 2 sets of variables
- Mean cross-product of all z-scores
- Tells you the direction and the strength of the relationship
- Ranges b/n –1 and +1; 0=no relationship
- When perfect correlation, the scores for all the subjects in the X distribution have the same relative positions as corresponding scores on the Y distribution
- Correlation will only be high if you have a full range of scores; if not, your correlation will be attenuated (restriction of range)
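The "mean cross-product of all z-scores" description translates directly into a Pearson r sketch (population-SD form, so the cross-products are averaged over n; sample data hypothetical):

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson r as the mean cross-product of z-scores (population-SD form)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    cross = [((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)]
    return sum(cross) / len(cross)
```

When every X score holds the same relative position as its Y counterpart, r = 1; a perfectly reversed ordering gives r = -1.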
|
|
|
Term
1. Semi Partial Correlation
2. Partial Correlation |
|
Definition
1. Semi-Partial: corr b/n a specific predictor (x) and the criterion (y) when all other predictors in the study have been partialed out of X but not of Y
2. Partial: corr b/n a specific predictor (x) and the criterion (y) when all other predictors have been partialed out of both X and Y |
|
|
Term
Spearman Rank Correlation |
|
Definition
Non-parametric alternative to Pearson Product-moment correlation
- Used to determine the relationship b/n two rank ordered variables (Measured on an ordinal scale)
-eg. rel b/n knowledge of the political system and self-esteem
- Rank on 2 scales, Spearman's rho measures the association b/n the 2 sets of ranks
-Null hypothesis is that the 2 ranks are independent |
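A sketch of Spearman's rho via the classic formula rho = 1 - 6Σd²/(n(n² - 1)), assuming no tied ranks (ties need the Pearson-on-averaged-ranks form instead):

```python
def spearman_rho(x, y):
    """Spearman rank correlation; assumes no tied scores."""
    def to_ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for pos, idx in enumerate(order, start=1):
            r[idx] = pos                     # rank 1 = smallest value
        return r
    rx, ry = to_ranks(x), to_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```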
|
|
Term
Correlational Technique: Scatter plot |
|
Definition
The relationship between 2 variables is represented graphically
X variable on (abscissa) horizontal axis and Y on the vertical (ordinate) axis |
|
|
Term
Significance of Correlation |
|
Definition
-Has nothing to do with the magnitude of the correlation (i.e., just because a correlation is sig at the .001 level does not mean that it is a strong relationship, only that there is a 0.1% probability that the correlation occurred by chance)
- P value only tells you whether you can interpret the relationship
- Sample size affects p values - NOT the magnitude of the correlation
|
|
|
Term
Assumptions of Correlation |
|
Definition
1. Linearity
2. Bi-variate Normality
3. Full range of scores (no restriction of range) |
|
|
Term
|
Definition
- Used to determine the utility of a set of predictor variables for predicting an event or behavior (criterion variable)
- MR yields a weighted linear combination of predictors that provides the best prediction of the criterion
- Line of best fit: aka "regression line"
-There is always one straight line that has a smaller sum of squared deviations than any other straight line.
-Regression equation plots a line through the data points that minimizes the residuals (errors)
- Diff b/n the predicted Y score and the actual Y score
-Mean of the residuals always equals 0 |
|
|
Term
Multiple regression Simple regression equation |
|
Definition
- Y' = a + bx, where Y' = predicted Y
- a = y intercept (the constant)
-Indicates the criterion score when all of the predictors equal 0
-The Y value at which the line touches the vertical Y axis
- b = slope (regression coefficient)
-Indicates the effects of the predictor on the criterion
-The expected change in Y for each 1 unit change in X
- x = a score on the x variable
|
|
Term
|
Definition
A form of MR that examines the contributions of all of the predictors at the same time, rather than by adding or subtracting variables one at a time. |
|
|
Term
|
Definition
A form of MR that consists of a series of steps in which predictor variables are added (forward inclusion) or removed (backward elimination)
-The difference between R2 at each step
-Equals the squared semi-partial correlation coefficient b/n the criterion and the additional predictors that were included in that particular step |
|
|
Term
Hierarchical Regression (theory driven) |
|
Definition
Similar to Stepwise, however the decision about the order in which variables are determined by the experimenter. |
|
|
Term
|
Definition
The multiple correlation coefficient
- On a scale from 0 (no relationship) to 1 (perfect prediction), indicating the degree of linear relationship btw the criterion and the combo of predictors.
|
|
|
Term
|
Definition
The proportion of the total variability in the experiment associated with the experimental tx Not effected by sample size Small effect = .01, large = .15 |
|
|
Term
|
Definition
Symbol for the coefficient of multiple determination b/n a DV & 2+ IVs
A commonly used measure of the goodness-of-fit of a linear model -Sometimes written "R-squared." |
|
|
Term
|
Definition
A measure of how much the variance in a DV (measured at the interval level) can be explained by a categorical (nominal, discrete) IV
May be used as a measure of association b/n 2 interval variables
Estimate of the variance associated with all the IVs taken together
Eta squared in ANOVA is analogous to R2 in multiple regression |
|
|
Term
What does the Magnitude of Effect (effect size) tell us? |
|
Definition
How big the Tx effect (differences) are:
• Small = .01 --> 1% of the variance in scores is due to the treatment
• Medium = .06 --> 6% of the variance in scores is due to the treatment
• Large = .15 --> 15% of the variance in scores is due to the treatment
• Calculated as R2 or omega2 (ω²), with the former always being a little larger because it does not take error into account.
• Effect size is the proportion of total variability that is due to the treatment!
• Not affected by sample size. |
|
|
Term
What does it mean that there is a Correlation? |
|
Definition
-Possible for 2 variables to be related (correlated) without one variable causing the other
-Does not explain why something is the way it is |
|
|
Term
Description of Correlation |
|
Definition
Relationships b/n variables
-Ranges from -1 to 1, with 0 meaning the variables are unrelated
-Negative v. Positive tells us the direction of the relationship
a. Negative = One variable increases while the other decreases
b. Positive = Variables increase at the same time or decrease at the same time
-Closeness to -1 or 1 tells us the strength of the relationship: Farther from 0 = stronger
-Significance level tells us how sure we are that the relationship is real |
|
|
Term
Assumptions for Correlation |
|
Definition
1. Linearity: Data is linear, not curvilinear
2. Normality: Normal distribution curve for that sample --> NOT bimodal, kurtotic (leptokurtic v. platykurtic), or skewed (positive or negative)
3. Full Range of Scores: Restriction of range will cause your correlation to decrease BC you did not represent the full distribution of the population from which you sampled |
|
|
Term
|
Definition
-Shows the degree of linear rel b/n 2 variables that have been measured on interval or ratio scales (eg. rel b/n height and weight)
- Mz of the relative rank ordering of 2 sets of variables
- Mean cross-product of all z-scores
- Tells you the direction and the strength of the relationship
- Ranges b/n -1 and +1; 0 = no relationship
- When perfect correlation, the scores for all the subjects in the X distribution have the same relative positions as corresponding scores on the Y distribution
- Correlation will only be high if you have a full range of scores; if not, your correlation will be attenuated (restriction of range) |
|
|
Term
|
Definition
-Correlation that partials out (controls for) a variable, but only from one of the other variables being correlated
-Rel b/n the predictor and outcome when all other predictors are partialled out of only the original predictor
1. Gives the unique contribution of the predictor to the outcome
2. Pure rel b/n predictor and outcome
3. Smaller rel than the Partial gives us BC we are leaving all the variance of the outcome intact (CONTRAST Partial Corr: Takes variance out of both the predictor and outcome) |
|
|
Term
|
Definition
-R = Multiple Correlation Coefficient; R2 = Coefficient of Determination = proportion of variance in the DV that can be explained by the axn of all the IVs taken together
-Correlation with >2 variables, one of which is dependent, the others independent
-Goal: Measure the combined influence of 2+ IVs on a DV |
|
|
Term
A researcher is convinced that men are faster runners than women. The researcher takes the placements of competitors in a recent marathon and wants to see if there was a strong relationship between gender and placement in the race. What statistic should the researcher use? |
|
Definition
Spearman Rank Correlation |
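The rank logic can be sketched with Spearman's classic difference formula; a minimal Python sketch assuming no tied ranks (the example data are hypothetical):

```python
def ranks(values):
    # 1-based ranks; assumes no ties for simplicity
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(x, y):
    # Classic formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    # where d is the difference between each pair of ranks
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Identical orderings give rho = 1; reversed orderings give rho = -1
print(spearman_rho([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # -> 1.0
```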
|
|
Term
Point Biserial Correlation |
|
Definition
A type of correlation to measure the association b/n 2 variables, one of which is dichotomous (category) and the other continuous (score) |
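The dichotomous-by-continuous case above has a closed-form version; a minimal Python sketch using the mean-difference formula (the 0/1 groups and scores below are hypothetical):

```python
import math

def point_biserial(groups, scores):
    # groups: 0/1 dichotomy; scores: continuous
    # r_pb = (M1 - M0) / s_all * sqrt(p * q), s_all = population SD of all scores
    n = len(scores)
    g1 = [s for g, s in zip(groups, scores) if g == 1]
    g0 = [s for g, s in zip(groups, scores) if g == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    mean = sum(scores) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in scores) / n)
    p, q = len(g1) / n, len(g0) / n
    return (m1 - m0) / s * math.sqrt(p * q)
```

This equals the ordinary Pearson r computed with the dichotomy coded 0/1, which is why it is often described as a special case of Pearson.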
|
|
Term
We are interested in the relationship between clinical depression and coping among head injury patients. Head injury patients take the BDI; those scoring above 12 indicate clinical depression and lower scores indicate the patient is not depressed (categorical data!). The avoidant subscale (scores) is also used from the Coping Inventory. How do we analyze the relationship between depression diagnosis and avoidant coping?
If a researcher decided to group data from a satiation scale as full or not full (category) and used a mood inventory (score), what statistic should be used to explore their relationship to one another? |
|
Definition
Point Biserial Correlation |
|
|
Term
|
Definition
- A type of correlation or measure of association between two variables used when both are categorical and one or both are dichotomous
- Phi is a symmetric measure
- It is based on the chi-squared statistic (specifically, to get phi you divide chi-squared by the sample size and take the square root of the result)
- (Contrast: when one variable is nominal and the other is interval/ratio, use the point biserial correlation instead)
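The chi-squared route described above can be sketched for a 2x2 table; a minimal Python sketch (the counts are hypothetical):

```python
import math

def phi_coefficient(table):
    # table: 2x2 counts [[a, b], [c, d]]; phi = sqrt(chi2 / n)
    (a, b), (c, d) = table
    n = a + b + c + d
    # chi-square for a 2x2 table: n * (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.sqrt(chi2 / n)

# Perfect association across the diagonal gives phi = 1
print(phi_coefficient([[10, 0], [0, 10]]))  # -> 1.0
```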
|
|
|
Term
|
Definition
aka "Scatter diagram" or "Scattergram"
|
|
|
Term
Significance of a Correlation |
|
Definition
Indicates the probability that an observed non-zero relationship is real (not due to chance)
p values only tell you whether you can interpret the relationship
Sample size affects p values, NOT the magnitude of the correlation (just BC a corr is sig at the .001 level does NOT mean it is a strong relationship - it means there is a 0.1% probability that the correlation occurred by chance) |
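The sample-size point above can be illustrated with the usual t test for a correlation; a minimal Python sketch (the r and n values are hypothetical):

```python
import math

def t_for_r(r, n):
    # t statistic for testing H0: rho = 0, with df = n - 2:
    # t = r * sqrt(n - 2) / sqrt(1 - r^2)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# The same modest r becomes "more significant" purely by raising n
print(round(t_for_r(0.3, 12), 2))
print(round(t_for_r(0.3, 102), 2))
```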
|
|
Term
Assumptions of Correlations |
|
Definition
1. Linearity
2. Bi-variate Normality
3. Full range of scores (no restriction of range) |
|
|
Term
|
Definition
Any of several methods for examining multiple (three or more) variables at the same time:
Usually 2+ IVs and 1 DV
1. Stricter usage reserves the term for designs with 2+ IVs AND 2+ DVs
--> Allows RSer to examine the relation b/n 2 variables while simultaneously controlling for how each of these may be influenced by other variables |
|
|
Term
Multiple Regression and Path Analysis --> Multiple Regression Analysis (MRA) |
|
Definition
Any of several related statistical methods for evaluating the effects of more than one IV (or predictor) variable on a dependent (or outcome) variable
Answers 2 main questions:
1. What is the effect (as measured by a regression coefficient) on a DV of a one-unit change in an IV, while controlling for the effects of all other IVs
2. What is the total effect (as measured by R2) on the DV of all the IVs taken together |
|
|
Term
What is Multiple Regression? |
|
Definition
Used to determine the utility of a set of predictor variables for predicting an event or bx (criterion variable)
DV = Criterion IV = Predictor
- MR yields a weighted linear combination of predictors that provide the best prediction of the criterion
|
|
|
Term
|
Definition
A form of MR that examines the contributions of all the predictors at the same time, rather than by adding or subtracting variables one at a time |
|
|
Term
|
Definition
A form of MR that consists of a series of steps in which predictor variables are added (forward inclusion) or subtracted (backward elimination)
Variables are selected and eliminated until there are none left that meet the criteria for entry or removal
-The change in R2 at each step equals the squared semi-partial correlation coefficient b/n the criterion and the additional predictor that was included in that particular step
|
|
|
Term
MR: Hierarchical Regression |
|
Definition
Similar to stepwise, however the order in which the variables are entered is determined by the experimenter
Hierarchy (order of the variables) is determined by the RSer in advance, based on her understanding of the relations among the variables |
|
|
Term
MR Basics: Line of Best Fit |
|
Definition
aka "Least Sum of Squares"
The line that fits best through the data, minimizing the distance b/n all the pts and itself |
|
|
Term
|
Definition
Graphic representation of a regression equation
Line through the pts that best summarizes the relationship b/n the DVs and IVs: Computed by using the ordinary least squares criterion |
|
|
Term
MR: Simple Regression Equation |
|
Definition
aka "prediction equation": Y = a + bX + e; Y = DV
X = IV; b = Slope or regression coefficient; a = intercept; e = error term
Regression equation plots a line through the data pts that minimizes residuals (errors) |
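The equation Y = a + bX + e can be estimated with the least-squares formulas; a minimal Python sketch (the x/y data are hypothetical and exactly linear, so the residuals are zero):

```python
def simple_regression(x, y):
    # Least-squares estimates: b = cov(x, y) / var(x); a = mean(y) - b * mean(x)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# y = 1 + 2x exactly, so the fitted intercept and slope recover (1, 2)
print(simple_regression([1, 2, 3, 4], [3, 5, 7, 9]))  # -> (1.0, 2.0)
```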
|
|
Term
MR: Basic Design & Stats Used |
|
Definition
Used to predict an event or bx
- No IVs -> 2+ DVs (continuous score/interval data)
- DVs can be split up into one outcome (Criterion) and the rest are predictors
Stats Used:
1. Line of Best Fit
2. Residuals
3. Weights
4. R
5. R2 |
|
|
Term
Multiple Regression: Advantages and Disadvantages |
|
Definition
ADVANTAGES:
1. Statistical Control: Allows you to partial out (hold constant) all other predictors so you can focus on the unique contribution of each separately
2. Residuals aka "Error": Portion of the score on a DV not explained by IVs; Difference b/n the value observed and the value predicted by a model; "Error"; Degree of inaccuracy
3. Residual SS aka "Error SS":
-Sum of Squares not explained by the regression equation; Analogous to w/in grps SS in ANOVA
4. Multiple Regression Equation:
-Raw Weights: Y' = a + bx1 + bx2...
-Standardized Weights: Y' = Bx1 + Bx2... --> When using standardized Betas, the a intercept ALWAYS = 0 |
|
|
Term
|
Definition
A kind of multivariate analysis in which causal relations among several variables are represented by graphs (path diagram) showing the "paths" along which causal inferences travel
- Path coefficients: Computer-calculated
-Provide estimates of the strength of the rels in the RS's hypothesized causal system
-Use data to examine the accuracy of causal models
**ADVANTAGE: RSer can calculate direct and indirect effects of IVs |
|
|
Term
When do you use Structural Equation Modeling? |
|
Definition
Use when we want to identify causal rels b/n variables so you must first draw a path connecting all your variables (extension of MR that allows RSer to test a theory of causal ordering among variables)
-Most complex version of path analysis |
|
|
Term
'Relationships' in Path Analysis |
|
Definition
1. Direct: Variable that indicates causality directly to the DV; Uses Betas as the path coefficient
2. Indirect: Includes more than one variable as influencing causation to the DV - still uses path coefficient/aka betas
3. Spurious: Two variables are related BC they share a common cause; represented by a path that goes against the direction of the arrows in the model
4. Unanalyzed: Paths that we cannot conclude the direction of causality but are testing for a relationship as indicated by "r"
5. Moderators
6. Mediators |
|
|
Term
|
Definition
2 variables have a causal relationship but another variable can change that relationship
eg. Stress (A) --> Decline Immunity Fxning (C) -But, Social Support (B) may alleviate (C) |
|
|
Term
|
Definition
When a variable must be added in order for 2 variables to be causally related A -> B -> C (B is mediator)
eg. Cell phone (A) -> Radiation (B) -> Tumor (C) |
|
|
Term
|
Definition
1. Must be measured on interval/ordinal scale (not nominal categories)
2. Endogenous Variables: Variables explained by another variable (C, D, E - pg 65) --> The ones we are trying to explain!
3. Exogenous Variables: Variables that cause others, but are not explained by any of the others - "Predictor Variables" |
|
|
Term
Path Analysis: Statistics Used |
|
Definition
-Run a Multiple Regression and report Betas and Semipartials (these values are called "path coefficients") & indicate the amt of influence and unique contribution of the causal variable, respectively
-# of multiple regressions run = # of endogenous variables |
|
|
Term
What are Path coefficients for Path Analysis? |
|
Definition
1. Betas: Standardized - can compare them
2. Semi-partial: Actually indicates the % of variance uniquely accounted for by the variable with all others held constant |
|
|
Term
What is a goodness of fit WRT Path Diagrams? |
|
Definition
Results in a chi-square and several programs run the entire model to indicate how well the data fits in the model
-DO NOT want significance
-WANT small chi-square |
|
|
Term
What is the Reproduced Correlation Matrix? |
|
Definition
A way to check whether your data fits the model
-Advantages of Regression within path analysis = allows us to hold certain paths constant in order to analyze one path at a time |
|
|
Term
Multiple Regression: "R" Assumptions |
|
Definition
1. Linearity
2. Bi-variate Normality
3. Full range of scores (no restriction of range) |
|
|
Term
Multiple Regression: "R2" |
|
Definition
-Coefficient of determination
-The square of the multiple correlation coefficient; indicates the proportion of variance in the criterion that is shared by the combo of predictors
-The proportion of the variance of the DV “explained” by the IVs (R2 = Variance of Y explained by the IVs / Total variance of Y) |
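The "explained variance / total variance" definition above can be sketched from observed and predicted scores; a minimal Python sketch (the values are hypothetical):

```python
def r_squared(y, y_hat):
    # R^2 = 1 - SS_residual / SS_total
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return 1 - ss_res / ss_tot

# Perfect prediction explains all variance; predicting the mean explains none
print(r_squared([1, 2, 3], [1, 2, 3]))  # -> 1.0
print(r_squared([1, 2, 3], [2, 2, 2]))  # -> 0.0
```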
|
|
Term
|
Definition
Another term for intervening variable, that is, a variable that “transmits” the effects of another variable
- Indicate causation—compare both A (IV) ------- C (DV)
- Yields a path coefficient that may be significant
- If you plug a B (IV) and that Beta is significant then we can assume that B (IV) is a mediator in the model and shows causation regarding the DV Relationship between A and C or the amount of variance would go down .90 A ----------- C .20 A ----------- C A--B--- C
|
|
|
Term
|
Definition
A variable that influences (“moderates”) the relation between two other variables and thus produces a moderating effect or an interaction effect
- Influences the relationship between two variables -Alters or adjusts the relationship (cannot cause the relationship) but rather changes the way that A interacts w/ C
- The idea that a variable interacts w/ some other variable to influence the DV but is NOT directly causal (serves as a buffer)
|
|
|
Term
What is the difference b/n R and R2? |
|
Definition
Anytime you square an “r” or “R” it will give you the variance shared between the variables
Just a plain r gives you a correlation whereas squaring it gives you a variance |
|
|
Term
|
Definition
The degree to which a research finding is meaningful or important
Without qualification, the term usually means statistical significance, but lack of specificity leads to confusion (or allows obfuscation) |
|
|
Term
|
Definition
The risk associated with not being 100% confident that what you observe in an experiment is due to the treatment or what is being tested
If you read that significant findings occurred at the .05 level (or p < .05), the translation is that there is 1 chance in 20 (or .05 or 5%) that any differences found were not due to the hypothesized reason (whether mom works) but to some other, unknown reason(s) |
|
|
Term
|
Definition
is the degree of risk you are willing to take that you will reject a null hypothesis when it is actually true |
|
|
Term
Threats to finding significant differences |
|
Definition
1. Low Power (Due to small sample size and effect size and large error): want at least .80.
-Power = Sample Size(Effect Size) / Variance -->if you have a lot of within group variance (error) OR low sample size OR low effect size, power will be a problem.
-->Power is the SENSITIVITY of an experiment to find real differences between groups!!!
2. Subject Heterogeneity: When subjects are very different, you will see a lower effect size (tells us how big the difference is!) and subsequently decreased power
3. Unreliable Measures
4. Multiple Comparisons: Making numerous comparisons causes family wise error --> This type of error is based on the assumption that the more comparisons the greater chance for type I error (alpha of .05 says for every 100 comparisons 5 will be significant by chance alone) |
|
|
Term
Beta Weight aka "regression weights" |
|
Definition
Another term for standardized regression coefficients, or beta coefficients
- Beta weights enable researchers to compare the size of the influence of independent variables measured using different metrics or scales of measurement
- b raw: b weight is the unstandardized weight and gives us the amount of influence of that predictor
-CANNOT compare unstandardized beta weights across samples
- B Standardized: Uses z-scores
-These weights indicate the amount of influence each predictor has on the outcome
-Whichever one has a larger Beta weight can be considered a more valuable predictor |
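The standardization step that makes betas comparable can be sketched from the raw b weight and the SDs; a minimal Python sketch (the numbers are hypothetical):

```python
def standardized_beta(b_raw, sd_x, sd_y):
    # Beta = raw b * (SD of predictor / SD of outcome):
    # puts every predictor on a common z-score metric
    return b_raw * sd_x / sd_y

# A predictor re-expressed in different units (raw b changes tenfold,
# SD changes tenfold) still yields the same standardized beta
print(standardized_beta(2.0, 3.0, 6.0))
print(standardized_beta(0.2, 30.0, 6.0))
```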
|
|
Term
Incremental Variance and Significance |
|
Definition
The amount of variance in the criterion that a predictor explains, above and beyond the other predictors in the analysis
• Change in R2
• Best when each predictor correlates highly with the criterion but not with other predictors |
|
|
Term
Assumptions of Multiple Regression |
|
Definition
1. Independence of Observations: All scores are independent of each other
2. Normality: Can be corrected by transforming scores
3. Linearity: Multivariate normality and linearity is assessed using a scatter plot matrix; Make sure all blocks are relatively elliptical in shape.
4. Homoscedasticity: Evenness of Errors
-Parametric statistical tests usually assume homoscedasticity --> If that assumption is violated, results of those tests will be of doubtful validity.
5. Homogeneity of variances: A condition of substantially equal variances in the dependent variable for the same values of the independent variable in the different populations being sampled and compared in a regression analysis or an ANOVA. |
|
|
Term
Independence of Errors: Error Score Assumptions |
|
Definition
1. They have a mean of zero
2. They are uncorrelated with each other
3. They have equal variances at all values of the predictor (e.g., homoscedastic)
4. They are normally distributed |
|
|
Term
Independence of Errors: Specification Errors |
|
Definition
1. The relationship between variables must be linear
2. All relevant predictors must be included
3. No irrelevant predictors can be included |
|
|
Term
|
Definition
The tendency for the strength and accuracy of a prediction in a regression or correlation study to decrease in subsequent studies with new data sets
**Has to do with scores regressing toward the mean on retesting
- MRC derives a prediction equation from the Derivation Sample (the original sample that the regression equation is derived from)
- R2 is a maximizing procedure that yields an inflated estimate, because it takes advantage of sample specific error.
- Adjusted R2: A more accurate estimate of prediction
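The shrinkage correction mentioned above has a standard formula; a minimal Python sketch (the R2, n, and k values are hypothetical):

```python
def adjusted_r2(r2, n, k):
    # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
    # n = sample size, k = number of predictors
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With a small sample and several predictors, R^2 = .50 shrinks noticeably
print(adjusted_r2(0.5, 21, 4))  # -> 0.375
```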
|
|
|
Term
|
Definition
- MRC is a correlational technique that does not imply causality.
- Path analysis is an extension of MRC, which allows the researcher to test a theory of causal ordering among a set of variables
- Variables must be measured on interval or ordinal scale
- The number of regressions that need to be run = the number of endogenous variables in the model.
The number of cases required depends on the model’s complexity: Most require about 200-300 cases.
|
|
|
Term
|
Definition
1. Linearity – Straight line btw variables
2. Homoscedasticity: Residuals have equal variance at all values of the predictors
3. Normality: Can be corrected by transforming scores
4. Independence of Errors: Hardest to correct – the worst assumption to violate
5. Error Score Assumptions
a. Mean of zero
b. Uncorrelated with each other
c. Equal variances at all values of the predictors (I.e. homoscedastic)
d. Normally distributed
6. Specification errors:
a. The relationship btw variables must be linear
b. All relevant predictors must be included
c. No irrelevant predictors can be included
7. Mz errors: Measures should be RELIABLE and VALID |
|
|
Term
|
Definition
A causal model in which all the causal influences are assumed to work in one direction only, that is, they are asymmetric (and the error or disturbance terms are not correlated across equations)
-eg. A -> B |
|
|
Term
|
Definition
A causal model in which some causal influences are assumed to work in both directions (reciprocal causation) and/or the error or disturbance terms are correlated across equations |
|
|
Term
|
Definition
A variable that is caused by other variables in a causal system |
|
|
Term
|
Definition
aka “prior variables"
-If a variable does not have an arrow pointing at it, it is exogenous. |
|
|
Term
|
Definition
In the path diagram, direct effects are indicated by straight arrows from one variable to another |
|
|
Term
|
Definition
A numerical representation of the strength of the relations b/n pairs of variables in a path analysis when all the other variables are held constant.
Standardized regression coefficients (beta weights): Regression coefficients expressed as z-scores
Unstandardized path coefficients are usually called path regression coefficients |
|
|
Term
|
Definition
The product of two direct effects
The total causal impact of a variable on the criterion is the sum of the direct effects and the product of the indirect effects |
|
|
Term
|
Definition
-When two variables have a common cause
-Represented by a path that goes against the direction of the arrows in the model |
|
|
Term
|
Definition
aka "collinearity"
When two or more independent variables are highly correlated --> makes it difficult if not impossible to determine their separate effects on the dependent variable
When there is a lot of overlap between predictors (e.g., predictors are redundant) |
|
|
Term
|
Definition
**Test of multicollinearity**
The proportion of a predictor’s variance that is not shared by the other predictors
It should be as close to one as possible. |
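Tolerance and its reciprocal, the variance inflation factor, can be sketched from the R2 of a predictor regressed on the other predictors; a minimal Python sketch (the R2 value is hypothetical):

```python
def tolerance(r2_with_other_predictors):
    # Tolerance for predictor j: 1 - R^2 from regressing j on the other predictors
    # (close to 1 = little overlap; close to 0 = multicollinearity)
    return 1 - r2_with_other_predictors

def vif(r2_with_other_predictors):
    # Variance inflation factor: 1 / tolerance
    return 1 / tolerance(r2_with_other_predictors)

# A predictor 90% explained by the others has low tolerance and a high VIF
print(round(tolerance(0.9), 4))  # -> 0.1
print(round(vif(0.9), 4))        # -> 10.0
```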
|
|
Term
Standard Error of Estimate |
|
Definition
The “estimate” is a regression line
The “error” is how much you are off when using the regression line to predict particular scores
The “standard error” is the standard deviation of variability of the errors
- It measures the average error over the entire scatter plot
- The larger the SEE, the less confidence one can put in the estimate
- Symbolized s_yx to distinguish it from s_y (i.e., the standard deviation of scores - not the error scores)
- The standard deviation of the distribution of error
- Estimate of how far the average score varies from the regression line
- Can be used to calculate a confidence interval around the regression line
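The "standard deviation of the errors" idea above can be sketched from observed and predicted scores; a minimal Python sketch using the simple-regression degrees of freedom, n - 2 (the data are hypothetical):

```python
import math

def standard_error_of_estimate(y, y_hat):
    # s_yx = sqrt(sum of squared residuals / (n - 2))
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return math.sqrt(ss_res / (len(y) - 2))

# Residuals of +1, -1, +1, -1 give s_yx = sqrt(4 / 2) ~ 1.41
print(round(standard_error_of_estimate([2, 0, 4, 2], [1, 1, 3, 3]), 4))
```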
|
|
|
Term
Statistical Control in MRC |
|
Definition
Using statistical techniques to isolate or “subtract” variance in the dependent variable attributable to variables that are not the subject of study
1. Partialling
2. Holding Constant
3. Covarying |
|
|
Term
|
Definition
- Subject or other unit of analysis that has extreme values on a variable
- Important BC they can distort the interpretation of data or make misleading a statistic that summarizes values (such as a mean)
- May also indicate that a sampling error has occurred by including a case from a population different from the target population
- Some decision needs to be made as to what to do with outliers:
1. Delete case
2. Substitute mean score
3. Transform score |
|
|
Term
What are 4 different statistics you can run to determine if there are outliers? |
|
Definition
1. Mahalanobis distance: gives you a number for each subject telling you how far away they are from the theoretical center of the data
2. Leverage: gives you a number for each subject telling you how far they are from all other subjects
3. DfBeta: gives you a number for each subject on every predictor telling you how much that subject influenced each predictor’s regression weight
4. Cook’s: gives you a number for each subject telling you how much they influenced the regression equation as a whole |
|
|
Term
|
Definition
Enables RSers to reduce a large number of variables to a smaller number of variables, or factors, or latent variables
- Purpose: simplify the description of data by reducing the number of necessary variables, or dimensions
- A factor is a set of variables, such as items on a survey that can be conceptually and statistically related or grouped together
- Factor analysis is done by finding patterns among the variations in the values of several variables; a cluster of high intercorrelated variables is a factor
- Exploratory factor analysis was the original type
- Confirmatory factor analysis developed later and is generally considered more theoretically advanced
- Components analysis is sometimes regarded as a form of factor analysis, though the mathematical models on which they are based are different: While each method has strong advocates, the two techniques tend to produce similar results, especially when the number of variables is large.
|
|
|
Term
Factor Analysis: Basic Design |
|
Definition
This is a technique in which a large number of interrelated variables are reduced into a smaller number of latent dimensions
1. No IVs
2. More than 2 DVs that cannot be divided into predictors and outcomes
3. Communalities: value for each variable telling you how much of the variable was used by the components/factors/subscales
• Low communality = did not load highly on any factor and can be thrown out
• High communality = high loadings on one or more factors
• Must have 5-10 people per item! The more heterogeneous the sample, the more factors will emerge
4. Factor Loadings: Under each factor, every variable gets a loading that indicates how important that variable is (correlation between item and factor) |
|
|
Term
Difference b/n Factor Analysis & Principal Components? |
|
Definition
- Principal Components is used when we want 100% of the variance between items explained
- Factor Analysis only explains shared/common variance between the variables used (Communalities can be <1 in FA but must equal 1 in Principal Components)
|
|
|
Term
Rotation (Orthogonal & Oblique) |
|
Definition
- Any of several methods in factor analysis by which the RSer attempts (by transformation of loadings) to relate the calculated factors to theoretical entities
- Oblique: A rotation (transformation) of the ID'd factors that yield correlated (oblique) factors
- Orthogonal: A rotation of the ID'd factors that yields uncorrelated factors
-Orthogonal axes are at right angles to each other.
- The original correlation table determines only the position of the tests in relation to each other
-Position of the reference axes is not fixed by the data: the same points can be plotted with the reference axes in any position
|
|
|
Term
|
Definition
Rotations clear up the focus of the data by giving up some magnification and finding the best fit for all the retained factors, even if it means increasing or decreasing the importance of each factor |
|
|
Term
|
Definition
Orthogonal (independent, each factor has zero correlation with others)
Most common rotation that “wiggles around” after the first factor is set so the others can get a best fit |
|
|
Term
|
Definition
aka “characteristic root” “latent root”
Usually symbolized lambda (λ)
A statistic used in factor analysis to indicate how much of the variation in the original group of variables is accounted for by a particular factor
It is the sum of the squared factor loadings of a factor.
Eigenvalues of less than 1.0 are usually not considered significant.
Have similar uses in canonical correlation analysis and principal component analysis
Each factor (clustered items) gets an eigenvalue score that tells you the amount of variance among all items that this one factor accounts for
• Low values would mean that not many variables clustered together in this particular grouping/factor |
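The "sum of squared factor loadings" definition above can be sketched directly; a minimal Python sketch (the loadings are hypothetical):

```python
def eigenvalue(loadings):
    # Eigenvalue of a factor = sum of its squared factor loadings
    return sum(l ** 2 for l in loadings)

# Two items loading .8 and .6 give an eigenvalue of exactly 1.0 -- right at
# the usual Kaiser cutoff; a third strong item would push it clearly above
print(round(eigenvalue([0.8, 0.6]), 4))
print(round(eigenvalue([0.9, 0.7, 0.5]), 4))
```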
|
|
Term
|
Definition
1. Scree Plot: Plot of the eigenvalues for each factor/component that was created by lumping variables together
-The plot will be steep for the first few factors and then level off; you do not want to keep the factors that have leveled off
• As the plot levels, each factor is explaining less unique information
• Theory (research shows a certain number of factors is best!)
2. Kaiser Rule: Retain factors with eigenvalues > 1.0 |
|
|
Term
|
Definition
-FA conducted to discover what latent variables (factors) are behind a set of variables or measures
1. Not for hypothesis testing
2. Used in test construction (creating subscores on assessments)
3. Used in empirical exploration (study brand new areas to see what symptoms cluster together)
4. Used for data reduction/reduce # of DVs (if you have 12 measures of depression and several are highly intercorrelated, identify which ones cluster together so you can eliminate inventories) |
|
|
Term
|
Definition
-FA conducted to test hypotheses (or confirm theories) about the factors one expects to find
-A type of or element of structural equation modeling |
|
|
Term
|
Definition
The correlations between each variable and each factor in a factor analysis
-Analogous to regression (slope) coefficients
-The higher the loading, the closer the association of the item with the group of items that make up the factor
-Loadings of less than .3 or .4 are generally not considered meaningful |
|
|
Term
|
Definition
-Like MR, except you are predicting an event or bx that either occurs or not (CATEGORICAL), whereas MR predicts an outcome on a continuum
-eg. If you were to predict whether or not each subject would lose weight, you would run a LR BC the outcome is dichotomous; if you were to predict how much weight loss a person achieved based on the previous 3 variables discussed, you would use MR
1. No IVs
2. 2+ DVs
a. DVs can be split up into one outcome (Criterion) and the rest are predictors
b. Outcome is categorical/dichotomous and predictors can be continuous or dichotomous |
|
|
Term
What stats do you use with a Logistic Regression |
|
Definition
**Weights are assigned to each predictor to be put into a prediction equation
1. Chi-Square, -2 Log Likelihood, Significance Level: Tells us if the whole model (all predictors lumped together) is significantly predicting the outcome
-Should be significant
2. Cox & Snell "Pseudo R Squared": Gives a range of the variance in the outcome that is explained by our model (predictors)
3. Hosmer & Lemeshow "Goodness of Fit Chi Square": Tells if the predictions we are making fit the actual data we collected; if there are a lot of discrepancies, then you must rethink the model
-This should NOT be significant
4. Predicted Probability: Takes the score of each equation and plugs it into an equation in order to yield a probability value b/n 0 & 1; graph the predicted probability and compare it against the observed score for each individual
5. Classification Table: Includes Observed & Predicted values; goal is to correctly classify as many as possible |
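The predicted-probability step above uses the logistic function; a minimal Python sketch (the intercept, weights, and scores are hypothetical):

```python
import math

def predicted_probability(intercept, weights, x):
    # Logistic prediction: p = 1 / (1 + e^-(a + b1*x1 + b2*x2 + ...)),
    # which is always bounded between 0 and 1
    z = intercept + sum(w * xi for w, xi in zip(weights, x))
    return 1 / (1 + math.exp(-z))

# A linear combination of 0 sits exactly at p = .5
print(predicted_probability(0, [0], [5]))  # -> 0.5
```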
|
|
Term
What is "sensitivity" WRT Logistic Regression? |
|
Definition
-Ability to detect the presence of the outcome: TRUE POSITIVES
-Mistakes (missed cases) are false negatives, considered a TII Error
-Ability of a diagnostic test to correctly ID the presence of a disease or condition
-Conditional probability of the test giving a positive result if the subjects do have the condition or disease |
|
|
Term
|
Definition
-Ability to detect the absence of the outcome (True Negative)
-Mistakes (false alarms) are false positives, considered a TI Error
-Low specificity: trouble detecting absence -> results in false positives
-Conditional probability of a test giving a negative result when patients or subjects do not have a disease |
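Both quantities fall straight out of the classification table; a minimal Python sketch (the counts are hypothetical):

```python
def sensitivity(true_pos, false_neg):
    # True positives / all actual positives
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    # True negatives / all actual negatives
    return true_neg / (true_neg + false_pos)

# 8 of 10 actual positives detected; 9 of 10 actual negatives cleared
print(sensitivity(8, 2))  # -> 0.8
print(specificity(9, 1))  # -> 0.9
```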
|
|
Term
Assumptions of Logistic Regression |
|
Definition
**Almost impossible to violate assumptions -- very lenient!
1. Independence of Observations
2. Outcome must be dichotomous
3. Need large sample size (approx 20-50/predictor)
--> May run Discriminant Analysis instead of LR, but it requires meeting more assumptions than LR |
|
|
Term
Discriminant Analysis Assumptions |
|
Definition
1. Normality
2. Linearity
3. Homoscedasticity
4. Independence of Observations
--> If its assumptions are met, more powerful than LR |
|
|
Term
Contrast to Discriminant Analysis: ADVANTAGES of Logistic Regression |
|
Definition
1. Fewer assumptions
2. More flexible
3. No negative probabilities
4. Good for variables of all types
5. Fewer limitations
6. Good when you expect an IV is non-linear
**DISADVANTAGE: Lose a little power w/ dichotomous output |
|
|
Term
|
Definition
-Measure of association: Unlike others BC 1.0 means there is NO relationship b/n the variables
-Adjusted OR is an OR computed after having controlled for the effects of other predictor variables (Unadjusted would be a bivariate OR) |
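The unadjusted (bivariate) OR comes straight from a 2x2 table via the cross-product ratio; a minimal Python sketch (the counts are hypothetical):

```python
def odds_ratio(table):
    # 2x2 counts [[a, b], [c, d]]: OR = (a*d) / (b*c)
    # OR = 1 means no association; further from 1 = stronger association
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Even cells give OR = 1 (no relationship); a diagonal excess inflates it
print(odds_ratio([[10, 10], [10, 10]]))  # -> 1.0
print(odds_ratio([[20, 10], [10, 20]]))  # -> 4.0
```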
|
|
Term
|
Definition
-Extension of ANOVA with multiple dependent variables --> Lumps them together into one DV
-Looks for differences b/n 2+ grps on 2+ DVs
-Allows the simultaneous study of 2+ related DVs while controlling for the correlations among them
-If DVs are NOT related, DON'T do a MANOVA; do separate ANOVAs for each unrelated DV |
|
|
Term
What are the ADVANTAGES of a MANOVA?` |
|
Definition
1. Reduces TI Error, in comparison to ANOVA (in ANOVA, run several separate analyses - Familywise/Experimentwise error)
2. Increases Power
3. Takes into account the correlations b/n DVs |
|
|
Term
Multivariate Tests of Significance |
|
Definition
-Creates a new DV from the set of correlated DVs (lumps them all into one DV) --> Synthetic Variable --> Compares SV across all levels of the IV
-4 Tests:
1. Wilk's Lambda: Tells us the variance in the synthetic variable that is accounted for by the IV (more intuitive meaning)
2. Roy's Largest Root: Highly sensitive to only the most important synthetic factor, so don't use this if you are interested in other dimensions
3. Lawley-Hotelling
4. Pillai's Trace |
|
|
Term
Rough rules of thumb for selecting the most appropriate MANOVA test |
|
Definition
1. Roy's GCR: This test should be employed to confirm a hypothesis of one single dimension (or one predominant factor in the dependent variable set)
2. Wilk's Lambda: This test is maximally sensitive when two or more dimensions are contained in the set of dependent variables and are of relatively equal importance in accounting for the trace
3. Lawley-Hotelling trace & Pillai's trace: These two test criteria appear to be intermediate in sensitivity when compared with Roy's GCR and Wilk's Lambda. However, there is evidence that Pillai's trace criterion may be more robust to lack of homogeneity of dispersion matrices than the other 3 MANOVA criteria |
|
|
Term
4 Reasons to use a Multivariate Analysis |
|
Definition
1. Use of univariate tests (ANOVA) leads to an inflated overall TI Error rate and probability of at least one false rejection
2. Univariate tests ignore important information: the correlation among the variables --> multivariate tests incorporate the correlation right into the test statistic
3. Although the grps may not be statistically significant on any of the variables individually, jointly the set of variables may reliably differentiate the groups --> small differences on several of the variables may combine to produce a reliable overall difference -- multivariate tests may be more POWERFUL
4. First compare grps on total test score to see if there is a difference; then compare the grps further on subtest scores to locate the source responsible for the global difference --> If NO difference then STOP and use the MANOVA as a gatekeeping fxn; if significant, you can run univariate tests |
|
|
Term
What are the assumptions of a MANOVA? |
|
Definition
1. Normality: Observations are normally distributed on each DV in each group 2. Homogeneity of Variance: Population variances for the grps are equal 3. Independence of Observations 4. Homogeneity of Variance/Covariance Matrices: The covariance (variance shared b/n variables) for each pair of DVs is the same across levels of the IV; relationship b/n DVs stays the same across levels of the IV -Use Box's Test in SPSS |
|
|
Term
|
Definition
-Used when there is a covariate you need to control for (a variable that is correlated with the DV but not with the IV) -Considerations: Same as MANOVA and ANCOVA: a) Justification for use b) Synthetic variables c) Multivariate tests of significance d) Assumptions |
|
|
Term
As you increase sample size, which type of error decreases? |
|
Definition
Type II (failing to reject a false Null Hypothesis) |
|
|
Term
Power of a Statistical Test |
|
Definition
-Ability of a technique to detect relationships -Probability of rejecting a Null Hypothesis when it is false and should be rejected -To calculate: 1 − Probability of a Type II Error -Used to determine minimum sample size |
|
|
Term
|
Definition
-How sensitive the design is to the effects we want to find --> If true effects really do exist, will they be found? -Roughly, power increases with sample size and effect size and decreases with WG (within-group) variance -More POWER: 1. Large sample 2. Large effect size 3. Low error variance 4. Large magnitude of difference |
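The relationships above (power rises with sample size and effect size, falls with a stricter alpha) can be illustrated with a textbook two-sided one-sample z-test power calculation; this is a simplified sketch, not a substitute for a full power analysis:

```python
import math
from statistics import NormalDist

def power_z_test(d, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for standardized effect size d
    (Cohen's d) with n observations -- a textbook approximation."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n)  # how far the test statistic moves under H1
    # beta = P(statistic lands inside the non-rejection region given H1)
    beta = NormalDist().cdf(z_crit - shift) - NormalDist().cdf(-z_crit - shift)
    return 1 - beta           # power = 1 - P(Type II error)
```

With d = 0.5 and n = 50, power is roughly .94; halving n or tightening alpha lowers it, matching the rules of thumb on the card.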
|
|
Term
Interpretation of Significance Testing |
|
Definition
Significance: tells us how sure we are that the differences found between groups are real -p < .05 states that we are 95% sure the differences found are real -p < .01 states we are 99% sure the differences are real -Affected by sample size: the smaller the sample, the harder it is to reach significance |
|
|
Term
|
Definition
A range of values of a sample statistic that is likely (at a given level of probability, called a confidence level) to contain a population parameter. |
|
|
Term
|
Definition
Desired percentage of the scores (often 95% or 99%) that would fall within a certain range of confidence limits. It is calculated by subtracting the alpha level from 1 and multiplying the result by 100; e.g., (1 − .05) × 100 = 95%. |
|
|
Term
Scales of Measurement: Nominal/Categorical Variables |
|
Definition
-Categories with no inherent order or quantity -eg. Gender, Ethnicity, Marital Status |
|
|
Term
What are 4 scales of measurement? |
|
Definition
1. Nominal/Categorical 2. Ordinal 3. Interval 4. Ratio |
|
|
Term
Scales of Measurement: Ordinal |
|
Definition
-The rank order of anything -Ordering, ranking, or rank ordering; the ordinal scale of measurement represents the ranks of a variable's values. Values measured on an ordinal scale contain information about their relationship to other values only in terms of whether they are "greater than" or "less than" other values, but not in terms of "how much greater" or "how much smaller." -eg. Movie ratings (0, 1, or 2 thumbs up), SES, ratings (good, choice, prime) |
|
|
Term
Scales of Measurement: Interval |
|
Definition
This scale of measurement allows you to not only rank order the items that are measured, but also to quantify and compare the sizes of differences between them (no absolute zero is required). This is typically the type of data you use for dissertations in which the score is on a continuous scale. eg. Any scores found on the Beck Depression Inventory, Ratings on Stress Level, Hours of Sleep per Night, etc are all continuous scores. |
|
|
Term
Scales of Measurement: Ratio Variables |
|
Definition
The added power of a true (absolute) zero allows ratios of numbers to be meaningfully interpreted; i.e., the ratio of John's height to Mary's height is 1.32, whereas this is not possible with interval scales. eg. Degrees Kelvin, annual income in dollars, length or distance in centimeters, inches, miles, etc. |
|
|
Term
Interpretation of Measures: Transforming scores |
|
Definition
-BC raw scores are not helpful -Standardized Scores: Allow you to compare scores on different tests: T-scores & Z-scores -Percentiles: Percent of ppl who scored below you -->Advan: Indicates a person's relative position -->Disadvan: Cannot compare a 50-60th percentile difference to an 80-90th percentile difference BC a larger grp (more differences) falls in the 50-60th percentile range |
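A quick sketch of the standard-score conversions mentioned above (the test mean and SD below are illustrative, not from a specific instrument):

```python
def z_score(x, mean, sd):
    # Standardized score: distance from the mean in SD units
    return (x - mean) / sd

def t_score(x, mean, sd):
    # T-score convention: mean 50, SD 10
    return 50 + 10 * z_score(x, mean, sd)

# Illustrative: a raw score of 115 on a test with mean 100, SD 15
z = z_score(115, 100, 15)   # one SD above the mean
t = t_score(115, 100, 15)
```

Because both conversions fix the mean and SD, scores from different tests land on a common metric and become directly comparable.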
|
|
Term
Interpretation of Measures: Creation of Norms |
|
Definition
Created/established by administering tests to a sample that is representative of the population of interest |
|
|
Term
Interpretation of Measures: Appropriate Use of Norms |
|
Definition
Do not use test norms (at least be cautious) if the individual is not represented in the normative sample |
|
|
Term
Interpretation of Measures: Criterion-Referenced |
|
Definition
aka "content-referenced tests" -How each person performs, based on a criterion/outcome -Determines if they learn the material -Mastery: All or none score (comps/licensing exam) that assess a content area -Diff. from norm-referenced test: Measures absolute levels of achievement -Students' scores are not dependent upon comparisons with the performance of other students |
|
|
Term
Interpretation of Measures: Norm Referenced |
|
Definition
-Test in which the scores are calculated on the basis of how subjects did in comparison to (relative to) others taking the test (others' scores provide the norm or standard) -Score is relative to those in normative sample -Tests for individual differences -Alternative is some absolute standard or criterion |
|
|
Term
What are Polychotomous Scales Used for? What are some examples? |
|
Definition
-Usually used to assess attitudes 1. Thurstone 2. Guttman 3. Likert 4. Semantic Differential (Osgood) |
|
|
Term
Polychotomous Scales: Thurstone |
|
Definition
-Method of creating and scoring a questionnaire -Many statements (100, for example) are presented to a group of judges that express a range of attitudes about a certain subject -Then, the group of judges sorts the statements into 11 groups that classify them as similar attitudes (kind of creates subscales that lump certain questions together) -Subject’s score depends on the number (1-11) associated with the statement they endorse |
|
|
Term
Polychotomous Scales: Guttman |
|
Definition
-A set of statements about a topic from which you choose to endorse one statement -Endorsement of a statement implies that you would endorse all other milder statements -eg. Endorsement of "I have filed for divorce" implies you would endorse "I have occasionally thought of divorce" |
|
|
Term
Polychotomous Scales: Likert Scale |
|
Definition
-Opinion statement on how much you agree v. disagree -5-7 point continuum |
|
|
Term
Polychotomous Scales: Semantic Differential (Osgood) |
|
Definition
-Each concept is rated on a 7 pt scale indicating which opposite the construct is more closely related to a. Evaluative: Good v. Bad, Valuable v. Worthless, Clean v. Dirty b. Potency: Strong v. Weak, Large v. Small, Heavy v. Light c. Activity: Slow v. Fast, Active v. Passive, Sharp v. Dull |
|
|
Term
Confidence Intervals and Standard Error of Meas't (SEM) |
|
Definition
-Error is randomly and normally distributed, so we don't know where a person's "true" score falls based on the "obtained" score --> Must determine the reliability of the test on which the subject receives the score (SEM) -SEM = (SD)(sqrt of (1 − r)) where r = internal consistency (reliability of the measure) and SD = SD of the test scores -Want to be right 95% of the time (1.96 SD in either direction) so... SEM x 1.96 = Confidence Interval |
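The SEM and confidence-band formulas can be sketched directly (the SD and reliability values below are illustrative):

```python
import math

def sem(sd, reliability):
    # Standard error of measurement: SEM = SD * sqrt(1 - r)
    return sd * math.sqrt(1 - reliability)

def ci_95(observed, sd, reliability):
    # 95% confidence band around the obtained score: observed +/- 1.96 * SEM
    margin = 1.96 * sem(sd, reliability)
    return observed - margin, observed + margin

# Illustrative values: test SD = 15, internal consistency r = .89
lo, hi = ci_95(100, 15, 0.89)
```

The band shrinks as reliability rises: with r = 1.0 the SEM is 0 and the obtained score equals the true score.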
|
|
Term
Reliability of Measurement |
|
Definition
-Refers to the consistency of scores obtained by the same person when they are reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions 1. Test-retest reliability: Indicates the extent to which individual differences in test scores are attributable to "true" differences in the characteristics under consideration and the extent to which they are attributable to chance errors; makes it possible to estimate what proportion of the total variance of test scores is ERROR VARIANCE -Expressed as a correlation coefficient BC all types of reliability are concerned w/ the degree of consistency or agreement b/n 2 independently derived sets of scores --> Expresses the degree of correspondence, or relationship, b/n two sets of scores |
|
|
Term
|
Definition
-Used to interpret group scores by testing the consistency across the group/population, NOT individuals -Pearson r, Spearman-Brown, KR-20, Cronbach's Alpha, Cohen's Kappa |
|
|
Term
Reliability: Change/Difference Scores: Pre-Post |
|
Definition
1. Unreliable BC you are subtracting one imperfect score from another 2. If using this, use SEM or reliability calculations 3. Notoriously unreliable BC it doesn't account for error in both of the scores |
|
|
Term
What are 4 types of reliability? |
|
Definition
1. Stability: Test-retest 2. Equivalence: Parallel forms 3. Homogeneity: Internal consistency 4. Inter-rater |
|
|
Term
|
Definition
-Consistency of a measure over time -Administer the same test to the same grp twice -Correlation depends on: 1. Time b/n administrations (eg. should be long enough to avoid practice/carryover effects but short enough so nothing has happened to the construct) 2. Construct being measured (eg. if the construct is stable like IQ, use a longer interval, but if unstable like bx, use a shorter interval) |
|
|
Term
Reliability: Equivalence, Parallel forms |
|
Definition
-Use alternate forms of the test to avoid difficulties from test-retest reliability -The same persons can be tested with one form on the first occasion and with another, equivalent form on the second -Correlation b/n the 2 scores obtained on the 2 forms represents the reliability coefficient of the test -Measures: 1. Temporal stability 2. Consistency of response to different item samples/test forms
LIMITATIONS: 1. Reduces, but does not eliminate, practice effects 2. Motivation has influence 3. Few tests have alternate forms
**Alternate Form (equivalence): Consistency across forms of the same instrument 1. Used when you create 2 forms of the same test (prevents cheating and practice effects) 2. Tests are identical in format and construct but have different content 3. eg. One version is self-report and the other is standardized |
|
|
Term
Reliability: Inter-rater, Interval-Based Agreement |
|
Definition
-Interval-based: Type of inter-rater reliability (each piece of material to be coded is broken into intervals and then each interval is scored for either an occurrence or nonoccurrence of what you're interested in) -Overall Percent Agreement: Used for rating single observers -Easy to calculate, but inflated by chance agreement - % occurrence agreement = A/(A+B+C) = 1/3 = 33% - % nonoccurrence agreement = D/(B+C+D) = 7/9 = 78%
Cohen's Kappa: corrects for chance agreement, giving a smaller/more stringent estimate: kappa = (observed agreement − chance agreement)/(1 − chance agreement), where observed agreement = (A+D)/(A+B+C+D) and chance agreement is computed from the observers' marginal rates -Higher scores suggest stronger inter-observer agreement
Session Total (each piece of material to be coded is NOT broken into intervals. Instead, you get a total session score for the target behavior) --> Use Intraclass Correlation |
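A minimal sketch computing overall percent agreement and Cohen's kappa from a 2x2 occurrence/nonoccurrence table (the counts below are hypothetical):

```python
def agreement_and_kappa(a, b, c, d):
    """a = both observers scored occurrence, d = both scored nonoccurrence,
    b, c = disagreements, from a 2x2 interval-by-interval table."""
    n = a + b + c + d
    p_o = (a + d) / n  # overall percent agreement
    # chance agreement expected from each observer's marginal rates
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, kappa

# Hypothetical counts: 1 agreed occurrence, 7 agreed nonoccurrences, 2 disagreements
p_o, kappa = agreement_and_kappa(1, 1, 1, 7)
```

Here raw agreement is .80 but kappa drops to .375 once chance agreement is removed, showing why kappa is the more stringent estimate.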
|
|
Term
Reliability: Multiple Observers |
|
Definition
Multiple Observers: All data are coded by 2+ observers and the data are then averaged or summed 1. Calculate reliability of the average, not individual scores 2. Average correlation of all pairs of observers 3. A formula calculates how many raters you will need to get reliable results -Add items --> Increase r -Add observers --> Increase r -Use Spearman-Brown |
|
|
Term
Reliability: Split Half (homogeneity) |
|
Definition
aka "coefficient of internal consistency" BC only one administration of a single form is required -Consistency b/n 2 halves of the same instrument -Systematically (even v. odd, beginning v. end) or randomly split the test items -2 scores are obtained for each person by dividing the test into equivalent halves -If you run a Pearson r, you are correlating only half the items, which will decrease r --> Solution: Add items (longer = stronger) or use the Spearman-Brown formula BC it estimates the reliability of the entire test |
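The Spearman-Brown prophecy formula referenced above can be sketched as:

```python
def spearman_brown(r, factor=2):
    """Spearman-Brown prophecy: predicted reliability when test length is
    multiplied by `factor`. For split-half, r is the half-test correlation
    and factor=2 projects back up to the full-length test."""
    return factor * r / (1 + (factor - 1) * r)

# A half-test correlation of .70 projects to the full test's reliability
full_test_r = spearman_brown(0.70)
```

Doubling a half-test with r = .70 predicts a full-test reliability of about .82, consistent with "longer = stronger."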
|
|
Term
Reliability: Internal Consistency (homogeneity) |
|
Definition
-Consistency b/n items/content -Measures how the items on a measure are correlated with each other -Cronbach's Alpha (continuous items) or KR-20 (dichotomous items) -Average of all possible split halves --> keep averaging until you have removed or added enough items to maximize the alpha value
**INFLUENCED BY: 1. Magnitude (degree) of correlation among items 2. Length of test (longer = stronger) |
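Cronbach's alpha can be computed from the item variances and the variance of the total scores; a minimal sketch with hypothetical data:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents.
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)"""
    k = len(items)
    item_var_sum = sum(pvariance(it) for it in items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    return k / (k - 1) * (1 - item_var_sum / pvariance(totals))

# Three perfectly parallel items (hypothetical data) -> maximal consistency
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

When every item ranks respondents identically, alpha reaches 1.0; less correlated items shrink the total-score variance relative to the item variances and pull alpha down.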
|
|
Term
|
Definition
-Consistency across scorers -Used any time there is subjective human judgment 1. Consensual Drift: Observers talk and influence each other 2. Individual Drift: Individual interpretations influence observations over time --> Control drift by calculating reliability during training as well as throughout the study |
|
|
Term
List the Reliability models |
|
Definition
1. True-Score Theory - Classical Meas't Theory 2. Domain Sampling Model: Generalizability Theory |
|
|
Term
Reliability Model: True-score theory |
|
Definition
aka Classical Meas't Theory -Meas't of a psychological construct will yield a score on a measure that is reasonably stable and fixed -Observed Score = True Score + Error -Error is random and normally distributed (just as likely to overestimate or underestimate scores)
-True score: The observed score is thought to consist of the true score plus or minus random meas't error --> If errors are random, they will cancel each other out in the long run and yield the true score --> The true score can be assumed to be the mean of a large number of meas'ts |
|
|
Term
Reliability: Domain Sampling Model Generalizability Theory |
|
Definition
-IDs different sources of error in a measure rather than simply estimating total error -Alternative way to estimate reliability, suggested by Cronbach -Rather than having one imperfect observed score, recognize that we have a set of different observed scores obtained under different circumstances -Under the same circumstances you would expect similar results; under different circumstances, different results |
|
|
Term
Reliability: Domain Sampling Model Domain Sampling |
|
Definition
-Sampling items, such as Qs on a questionnaire, in a particular subject area or domain (use a sample of Qs in a domain, instead of all possible Qs) |
|
|
Term
Relationship of Reliability to Other Features |
|
Definition
1. Test Length: Longer = Stronger 2. Composite of Measures 3. Sample selection |
|
|
Term
Relationship of Validity to Reliability |
|
Definition
1. Validity will always be lower BC reliability sets limits (MAXIMUM VALIDITY COEFFICIENT) -If unreliable --> Validity = LOW 2. Standard Error of Meas't/Estimate -Used when interpreting individual scores -SEM: Estimates a band of error so that we can ID where a person's "true score" lies -Observed Score = True Score + Error (randomly and systematically distributed) -CI: Range around the observed score where the true score is likely to fall
3. Correction for Attenuation -Test Length: Longer = stronger -Restriction of Range: Occurs when you do not have the full range of scores for that population (eg. only get experts); the correlation is then attenuated (lower) |
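The maximum validity coefficient and the correction for attenuation can both be sketched from the two reliabilities (the reliability and correlation values below are hypothetical):

```python
import math

def max_validity(r_xx, r_yy):
    # A validity coefficient cannot exceed the sqrt of the product
    # of the two measures' reliabilities
    return math.sqrt(r_xx * r_yy)

def correct_for_attenuation(r_xy, r_xx, r_yy):
    # Estimated correlation between the constructs if both
    # were measured without error
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical: observed r = .42, test reliabilities .70 and .90
r_true = correct_for_attenuation(0.42, 0.70, 0.90)
```

With perfectly reliable measures the correction leaves the correlation unchanged; the less reliable the measures, the more the observed correlation understates the true relationship.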
|
|
Term
|
Definition
The extent to which a measure appears to assess the construct of interest. The more face valid, the more subject to malingering the test may be. -No Stats used, just "yes" and "no" |
|
|
Term
|
Definition
-Evidence that the content of the items measures the underlying construct of interest -Usually established by using expert raters and/or a systematic survey of the literature -Contributions to Poor Content Validity: 1. Putting in items irrelevant to the construct 2. Leaving out relevant items 3. Wrong balance of items -Content Validity of Behavioral Measures: In theory, behaviors can be fully operationalized, so content validity is crucial -Content Validity of Construct Measures: A construct can never be fully operationalized, so content must sample the entire domain |
|
|
Term
Criterion-Related Validity |
|
Definition
-Utility of the test; forecasting efficiency -Criterion validity concerns the utility of the test: How well can we use a test for a particular purpose? -Used for classification and prediction -Criterion measures against which scores are validated are either obtained at the same time, or after a stated interval -Ideally you want perfect prediction, but in reality you expect about a .6 correlation |
|
|
Term
|
Definition
-Criterion measure and your measure are given at approximately the same time -Relevant to tests made to diagnose or classify existing status, rather than predicting future outcomes -Interested in the utility of using your mz in place of the criterion |
|
|
Term
|
Definition
-Criterion mz and your mz are given at different times -Relevant to tests designed to predict an ind's future performance on the criterion mz -Typically used in the selection and classification of personnel |
|
|
Term
|
Definition
-Problem with criterion validation: Knowledge of test scores influences a person's criterion status -eg. A college professor's knowledge that a student scored poorly on an aptitude test influences the grade that the student is given |
|
|
Term
|
Definition
-There are a variety of criteria that a test can be validated against -The criteria depend on the purpose of the test -Strategies for assessing validity of IQ tests: Academic achievement -Strategies for assessing the validity of aptitude tests: Performance in specialized training (job performance, instructor ratings) -Validation by method of contrasted groups: Criterion is based on "survival" within a particular group vs. elimination from it -->Used in the validation of personality tests (eg. a test of social traits might be validated by comparing scores of salespersons and stocking clerks on the measure)
-Validation by diagnosis: Also used to assess validity of personality tests -Correlation b/n a new test and a previously available test: Used when the new test is an abbreviated or simplified version of a currently available test |
|
|
Term
|
Definition
-Overlap between different tests that presumably measure the same construct -Correlation between scores on 2 different measures that assess the same construct (want a high correlation) |
|
|
Term
|
Definition
-Correlation between scores on 2 different measures that assess 2 different constructs (eg. depression versus antisocial PD) -Want a low correlation |
|
|
Term
What do you need to do a MTMM Analysis? |
|
Definition
1. Unrelated constructs (we need at least 2) 2. To assess each of the constructs with different methods (ratings by others can count as different methods if you use different informants) 3. Each construct assessed by at least 2 methods |
|
|
Term
|
Definition
aka "Pearson chi-square, X2, chi2"
-Single-Sample test: Used when the study involves one variable with 2+ independent categories (goodness of fit)
-Multiple-sample test: Used when the study involves 2+ variables and multiple independent groups --> Evaluates whether 2+ variables measured on a nominal level are independent of one another, or whether one variable is contingent upon the other
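The multiple-sample (contingency table) version can be computed by hand from observed and expected counts; a minimal sketch without the Yates continuity correction, using hypothetical 2x2 counts:

```python
def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows.
    Expected counts come from the row and column marginals; no Yates
    continuity correction is applied."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical 2x2 table: group (rows) x outcome (cols)
chi2, dof = chi_square([[30, 10], [20, 20]])
```

In practice SPSS or a library routine would be used, but the hand calculation makes the independence logic explicit: the statistic grows as observed counts depart from the counts the marginals alone would predict.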
|
|
|