Shared Flashcard Set

Details

JA EPPP stats
47
Psychology
Post-Graduate
01/01/2014

Cards

Term
Attenuation
Definition
Attenuation correction formula: 1. r_x'y' = r_xy / SQRT(r_xx * r_yy). 2. Estimates true correlation between two variables measured without error. 3. Requires correlation coefficient (r_xy) and a reliability coefficient for each variable (r_xx and r_yy). 4. With correlation of .6 and reliability coefficient of .8 for both variables: r_x'y' = .6 / SQRT(.8 * .8) = .6 / .8 = .75. 5. If absolute value of result is greater than one, round to one (Pearson's r ranges from -1 to 1)
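The correction formula on this card is simple enough to compute directly. A minimal Python sketch (the function name and second example's values are illustrative, not from the card):

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the true correlation between two variables measured
    without error, given the observed correlation and each variable's
    reliability coefficient."""
    corrected = r_xy / sqrt(r_xx * r_yy)
    # Pearson's r ranges from -1 to 1, so round results past that limit back to 1
    return max(-1.0, min(1.0, corrected))

# Worked example from the card: r = .6, both reliabilities .8 -> .75
print(round(correct_for_attenuation(0.6, 0.8, 0.8), 2))  # 0.75
# A result whose absolute value exceeds one is clamped
print(correct_for_attenuation(0.9, 0.7, 0.7))  # 1.0
```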
Term
Demand Characteristics
Definition
Demand Characteristics: Cues associated with the manipulation or intervention that may affect the results of the experiment. 1. Cues that lead participants to guess the nature of the experiment and alter their actions according to their expectations: information provided to participants prior to arrival; experiment instructions or procedures; other features such as experimenter demeanor. 2. Possible reason: participants assume they are being evaluated, and shift their responses so as to obtain a positive evaluation based upon what they believe the experiment to be about. 3. May prove confounding if they influence participants in a manner unintended by the experimenter. 4. To reduce likelihood: standardize instructions; administer via a computer
Term
Mixed-Method Design
Definition
1) Combines qualitative and quantitative data. 2) Addresses limitations of one type of data with strengths of the other: limited generalizability of qualitative data covered by external validity of quantitative data (e.g., larger, more representative sample; allows adjusting for extraneous variables); narrowed focus of quantitative data complemented by "big picture" of qualitative data. 3) Four general designs: explanatory (qualitative data used to clarify quantitative data); exploratory (quantitative data used to clarify qualitative data); triangulation (strengths of both methods used to compare, contrast, and validate findings); embedded (one method has priority over the other)
Term
Structural Equation Modeling (SEM)
Definition
Structural Equation Modeling (SEM): Technique for building and testing statistical models. 1. Uses factor analysis, path analysis, and regression. 2. Tests a hypothesized model. 3. Evaluates the causal/predictive influence of manifest (directly observed) and latent (indirectly observed) independent variables. 4. Two-step process: validates measurement model with confirmatory factor analysis; tests structural model with path analysis. 5. More powerful than multiple regression because it incorporates: multiple latent and manifest independent and dependent variables (perhaps measured with multiple indicators); measurement error and correlated error terms; interactions and nonlinearities. 6. Particular strengths: models latent variables; flexible assumptions; can test overall model rather than focusing on individual coefficients
Term
Sampling Error
Definition
Sampling Error is the difference between the sample statistic and the corresponding population parameter 1) Samples do not present a complete picture of the population they are meant to represent (e.g., a disproportionate number of women or men could be obtained in a sample). 2) It is very likely that a sample statistic under- or overestimates the population parameter rather than hitting it exactly (with a normal distribution, there's a 50 percent chance that a sample mean overestimates the population mean and a 50 percent chance it actually underestimates the population mean). 3) Larger sample sizes reduce Sampling Error by including more information about the population and reducing uncertainty
Term
Standard Error
Definition
1. Measures the average or standard distance between a sample mean and the population mean. 2. Defines and measures sampling error. 3. Error is expressed in standard deviation units. 4. Calculated by dividing the standard deviation by the square root of the sample size. Standard Error of the Mean. 1. Estimates how much the sample mean will deviate from the population mean due to sampling error. 2. As sample size goes up, the Standard Error of the Mean goes down. Standard Error of Measurement. 1. Used to construct confidence intervals for test scores. 2. How much a person's score is expected to vary from the score the person is capable of receiving based on actual ability. Standard Error of the Estimate. 1. Standard deviation of observed values around predicted values on a regression line (measure of prediction error). 2. The greater the spread of observed values around the regression line, the higher the SEE
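The calculation in point 4 can be sketched in a couple of lines of Python (the SD and sample sizes below are made-up illustrative values):

```python
from math import sqrt

def standard_error_of_mean(sd, n):
    """Standard deviation divided by the square root of the sample size."""
    return sd / sqrt(n)

# Larger samples shrink the standard error: same SD, bigger n
print(standard_error_of_mean(15, 25))   # 3.0
print(standard_error_of_mean(15, 100))  # 1.5
```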
Term
Multiple-Baseline Design
Definition
Multiple-Baseline Designs attempt to replicate treatment effects across different behaviors, people, or settings. 1. If treatment is effective, change occurs only when it is introduced. 2. Differs from ABAB design in that ABAB examines one behavior while introducing, removing, and reintroducing an intervention to one person in one setting. 3. May be used when a return to baseline is not possible (e.g., can’t unlearn how to ride a bike). 4. Provides flexibility. Extension of findings to other people, places, and behaviors is a necessary component of the design. If intervention is not effective, researchers may make efforts to improve before extending to other people, behaviors, or settings
Term
Solomon Four-Group Design
Definition
Solomon Four-Group Design: Controls for practice effects by randomly assigning participants to four groups. 1) Experimental groups. pre-/post-test: participants complete pre-test, receive intervention, and take post-test. post-test only: participants receive intervention and take post-test. 2) Control groups. pre-/post-test: participants complete both pre- and post-tests (but do not receive intervention). post-test only: participants complete post-test (but do not receive intervention). 3) May determine whether practice effects exist by searching for differences across testing levels (i.e., post-test only versus pre-/post-test). 4) Rich results from replicating experimental and control conditions. 5) Requires extra time and money for more groups
Term
Meta-Analysis
Definition
Developer: Karl Pearson developed effect sizes and meta-analysis. Key Figures: Smith, Glass, and Miller (1980). Purpose: To analyze results of multiple studies in order to decrease the subjectivity of individual narrative reviews. First study applying meta-analysis to psychology: Determined psychotherapy more effective than no treatment. First step in meta-analysis: Thorough literature review (summary of the prior research). Second step: Put results into a common scale (effect size). Central problem: Which studies are well-controlled and should be included, and which should not? Key critics: O'Leary and Wilson (1987).
Term
Single-Subject Design
Definition
Single-Subject Design: Used in behavior modification studies in which the intervention may be evaluated with a single subject. 1) Establishes causality, minimizes threats to validity. 2) Large amount of variation in the dependent variable is a potential threat. 3) Preferred for ethical reasons (conclusions may be drawn without removing treatment). 4) Common applications: ABAB design (baseline established, then treatment introduced, removed, and reintroduced); multiple baselines (treatment effect examined with different behaviors, people, or settings). 5) Four key characteristics: continuous assessment (measured at several different times before and after treatment is introduced); baseline assessment (measurements taken before treatment is introduced in order to establish a pre-existing trend); stability of performance (measurements made until stable levels are obtained); different phases (baseline, treatment, and possibly additional treatment phases are used to examine efficacy of an intervention). 6) Nonstatistical evaluation: mean changes; level changes (change from the last measurement of one phase to the first measurement of the next); slope (or trend) changes; latency of change (speed with which change occurs upon phase change). 7) Problems: autocorrelation (with repeated measurements of the same variable with the same participant, observations drawn at different times may be correlated); practice effects; lengthy time requirement; potential lack of generalizability (conclusions drawn from one person may not apply to another)
Term
Counterbalanced Design
Definition
Counterbalanced Design: Administer multiple treatments to each participant in different sequences to prevent order effects (e.g., treatment A looks to have greater effect because it is always administered first). 1) For two treatments A and B: AB, BA. 2) For three treatments A, B and C: ABC, ACB, BAC, BCA, CAB, CBA. 3) With larger number of treatments, number of possible sequences increases. 4) With small sample, treatments might not appear in a given position an equal number of times. 5) Latin square design: select a smaller number of sequences in which a given treatment appears only once in each position. for three treatments A, B, and C: ABC, CAB, BCA (A, B, and C each appear only once in first, second, and third positions). 6) Advantage. may analyze between-subject variable of treatment sequence. 7) Disadvantages. may require more participants. may require more sophisticated analysis
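The sequence counts in points 1-3 and the Latin square in point 5 can be generated mechanically. A short Python sketch (the rotation used here builds one simple Latin square; other valid squares exist):

```python
from itertools import permutations

def all_sequences(treatments):
    """Full counterbalancing: every possible ordering of the treatments."""
    return ["".join(p) for p in permutations(treatments)]

def latin_square(treatments):
    """One simple Latin square: rotate the treatment list so each
    treatment appears exactly once in each position."""
    n = len(treatments)
    return ["".join(treatments[(i + j) % n] for j in range(n)) for i in range(n)]

print(all_sequences("AB"))        # ['AB', 'BA']
print(len(all_sequences("ABC")))  # 6 sequences for three treatments
print(latin_square("ABC"))        # ['ABC', 'BCA', 'CAB']
```

Note the Latin square uses only 3 of the 6 possible sequences while still placing A, B, and C once in each position, which is the card's point about needing fewer participants.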
Term
Matched-Subjects Design
Definition
1. Each participant in one sample is matched with a participant in another sample with respect to a specific variable (e.g., socioeconomic status). 2. Renders conditions nearly equivalent on matched variable. 3. Increases power of study by reducing error variance (so long as matching variable is associated with the dependent variable). 4. Might decrease external validity by not generalizing to others not demonstrating the matching variable. 5. Process of matching could tip participants off to research hypothesis
Term
Variables
Definition
Unlike constants, which remain the same, variables take on many values. 1) Independent or predictor Variable (IV). manipulated by researcher, presumed agent of change (i.e., affects dependent Variable). 2) Dependent or criterion Variable (DV). measured by researcher to determine if IV has an effect. 3) Confounding Variable. extraneous variable varying systematically with the IV and reducing internal validity (i.e., the ability of the researcher to claim differences in DV are due to the IV). 4) Quasi-independent Variable. IV in quasi-experiment (i.e., experiment using existing groups rather than random assignment in determining condition). 5) Measurement scales: nominal Variable: a label or category (e.g., political party). ordinal Variable: data are ranked, or possess order (e.g., class rank). interval Variable: ranked, meaningful differences between values (e.g., Fahrenheit temperature scale). ratio Variable: ranked, meaningful differences between values, and the value of zero signifies absence of what is measured (e.g., Kelvin temperature scale)
Term
Validity
Definition
Construct Validity. 1) Convergent Validity: test correlated with other tests of same or similar trait. 2) Divergent Validity: test not correlated with tests of unrelated traits. 3) Factorial Validity: using factor analysis to show validity; factors that should correlate do, and factors that should not correlate do not. Criterion-related Validity. 1) Concurrent Validity: test correlated with criterion variable measured at same time (e.g., SAT score and high school GPA). 2) Predictive Validity: test correlated with criterion variable measured at a future time (e.g., SAT score and college GPA). Content Validity: Adequacy of test in measuring all facets of a construct or trait. Face Validity: Gauged by those taking the test, the extent to which a test seems to measure what it is meant to measure. A valid measure is reliable, but a reliable measure isn't necessarily valid (e.g., a broken watch is perfectly consistent but wrong)
Term
External Validity
Definition
External Validity: the extent to which research findings may be extended to other people, places, and situations. Five main threats: 1. Interaction of different treatments: the interaction effect between factors differs from the main effects; example: a new instructional method promotes greater learning, but only among those receiving a motivational incentive; examine interaction and main effects. 2. Interaction of testing and treatment: measurement sensitizes participants to, or inoculates them against, the treatment; include a post-test-only control condition. 3. Interaction of selection and treatment: different effects of a treatment for different types of people; example: an experimental drug works for women, but not for men; obtain as heterogeneous a sample as possible. 4. Interaction of setting and treatment: different effects of a treatment in different settings (e.g., classroom versus combat zone); perform study in all relevant settings. 5. Interaction of history and treatment: different effects of a treatment at different times (e.g., pre- vs. post-9/11); replicate study at different times; perform literature review for earlier findings. Related concepts: 6. Reactivity: changes in behavior as a result of a person being observed. 7. Demand Characteristics: when a participant behaves according to what they think is expected. 8. Carryover Effects: when the effects of one treatment carry over to the next. 9. Sequence or Order Effects: when the order of treatments in a series influences participant response. 10. Counterbalancing: a control measure for order effects, presenting treatments in a variety of sequences (within-subject and within-group).
Term
Internal Validity
Definition
Internal Validity: The extent to which a research study rules out alternative explanations and establishes causality. 1. If not seeking to establish that A causes B, then Internal Validity is not a concern. 2. If seeking to establish that A causes B, then Internal Validity is of paramount importance (more of a concern than external validity). 3. Supports: random assignment is the most effective manner of obtaining Internal Validity; matching: participants are matched across conditions on a characteristic related to the outcome variable; blocking: a characteristic related to the outcome variable is incorporated into the design as an additional independent variable. 4. Threats to Internal Validity: history: an extraneous event occurs during the study; maturation: respondents change systematically over the course of a study; testing: initial measurement affects subsequent measurements; instrumentation: changes in the measure may cloud results; regression to the mean: extreme measures tend to be less extreme on subsequent measurements; selection: participants differ systematically by experimental group before the intervention; mortality (or dropouts): different dropout rates across conditions render experimental groups non-equivalent; interactions with selection: several of the above threats may interact with selection and be mistaken for treatment effects; ambiguity about the direction of causation: non-experimental studies leave debatable whether A causes B, B causes A, or a third variable C causes both A and B. 5. Ways to control extraneous variables: a) Elimination: complete removal of the variable. b) Constancy: the variable is experienced equally by all participants. c) Balancing: matched-pairs design used to evenly distribute the variable between groups.
Term
Reliability
Definition
1. Extent to which a measure or test is consistent and repeatable. a. Necessary, but not sufficient, for validity. Determination of the reliability of a test measure uses several methods: 1. Internal consistency (all items measure the same thing): a. Reliability coefficient b. Split-half c. Cronbach's alpha (coefficient alpha) d. Kuder-Richardson Formula (KR-20). 2. Consistency between alternative forms: a. Coefficient of equivalence. 3. Test-retest consistency: a. Coefficient of stability. Other forms of reliability: 1. Inter-rater reliability: a. Kappa statistic is used with nominal or ordinal data
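Cronbach's alpha (item 1c) has a direct formula: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). A self-contained sketch using population variances (the data values are made up for illustration):

```python
def cronbach_alpha(items):
    """Cronbach's alpha. `items` is a list of columns, one per test
    item, each holding that item's scores across respondents."""
    k = len(items)            # number of items
    n = len(items[0])         # number of respondents

    def var(xs):              # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(var(col) for col in items) / var(totals))

# Three perfectly consistent items yield the maximum alpha of 1.0
print(cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))
```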
Term
Measures of Central Tendency
Definition
Measures of Central Tendency (i.e., where the bulk of a distribution is centered). Mode (most frequently occurring value): 1) A distribution may feature one, two, or more modes. 2) Advantages: applicable to all measurement scales (e.g., nominal, ratio); unaffected by extreme values/outliers. 3) Disadvantages: does not lend itself to algebraic manipulation (i.e., being placed in equations). Median (middle value): 1) Advantages: applicable to all but nominal data; resistant to extreme values/outliers. 2) Disadvantages: does not lend itself to algebraic manipulation. Mean (average of all values): 1) Advantages: lends itself to algebraic manipulation; more stable estimate of central tendency (i.e., less sample-to-sample variation). 2) Disadvantages: requires interval or ratio variable (i.e., needs meaningful differences between values); affected by extreme values/outliers. Cross-distribution comparisons: 1) Normal distribution: mode = median = mean. 2) Positively skewed: mode < median < mean. 3) Negatively skewed: mean < median < mode
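The "mode < median < mean" ordering for positive skew is easy to see on a small made-up data set (the scores below are illustrative):

```python
from statistics import mean, median, mode

# A positively skewed set: a few high outliers pull the mean upward
scores = [1, 2, 2, 2, 3, 3, 4, 10, 20]

print(mode(scores))            # 2
print(median(scores))          # 3
print(round(mean(scores), 2))  # 5.22
# mode < median < mean, the ordering expected with positive skew
```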
Term
Constants
Definition
A Constant is a fixed, unvarying value (e.g., 5). 1) When a Constant is added to or subtracted from a variable: measures of central tendency (e.g., median, mean) change similarly (adding five to every value results in a mean five higher than the original mean); measures of variability (e.g., range, standard deviation) remain the same (adding five to every value results in a standard deviation no different from the original standard deviation). 2) When you multiply or divide by a Constant: measures of both central tendency and variability change (multiplying each value by five results in a mean and a standard deviation five times larger than the original statistics). 3) When you add, subtract, multiply, or divide by a Constant: the shape of the distribution remains the same (adding five to every value in a skewed distribution results in a similarly skewed distribution); correlations with other variables remain the same (adding five to every value in one variable does not alter its relationship with another variable)
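Points 1 and 2 can be verified directly (the data set is an arbitrary example):

```python
from statistics import mean, pstdev

data = [2, 4, 6, 8]
shifted = [x + 5 for x in data]  # add a constant
scaled = [x * 5 for x in data]   # multiply by a constant

# Adding a constant shifts the mean but leaves the spread unchanged
print(mean(shifted) - mean(data))        # 5
print(pstdev(shifted) == pstdev(data))   # True

# Multiplying by a constant changes both the mean and the spread
print(mean(scaled) / mean(data))                    # 5
print(round(pstdev(scaled) / pstdev(data), 6))      # 5.0
```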
Term
Standard Score (z-score)
Definition
Z-scores (Standard Scores) are obtained by transforming raw scores to obtain a distribution with a mean of 0 and a standard deviation of 1 1. Z-score indicates how many standard deviations from the mean a score is (e.g., a z-score of -1.5 indicates a raw score 1.5 standard deviations below the mean) 2. Permits comparisons across different measures and tests (e.g., students in different classes performed equally well relative to other students on an exam if both have the same z-score) 3. Transforming raw to z-scores does not alter the shape of the new distribution from that of the original distribution (i.e., the distribution of z-scores will look the same as that of the raw scores) 4. To either side of the mean: a. 50 percent of the distribution falls to either side of the mean  b. 34 percent between the mean and one standard deviation  c. 14 percent between one and two standard deviations from the mean d. 2 percent between two and three standard deviations from the mean Other Standard Scores 1. T-scores (raw scores standardized to a distribution with a mean of 50 and a standard deviation of 10) 2. ETS scores (raw scores standardized to a distribution with a mean of 500 and a standard deviation of 100)
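The transformations among raw scores, z-scores, T-scores, and ETS scores are all linear; a small sketch (the raw score, mean, and SD in the example are made up):

```python
def z_score(raw, mean, sd):
    """How many standard deviations a raw score sits from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """T-score: distribution with mean 50 and SD 10."""
    return 50 + 10 * z

def ets_score(z):
    """ETS score: distribution with mean 500 and SD 100."""
    return 500 + 100 * z

z = z_score(85, 100, 10)  # 1.5 standard deviations below the mean
print(z)             # -1.5
print(t_score(z))    # 35.0
print(ets_score(z))  # 350.0
```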
Term
Percentile Rank
Definition
Percentage score: Number of items out of a total number (e.g., 93 out of 100 = 93 percent). Percentile Rank: Percentage of scores in a distribution falling below a particular raw score (e.g., 84 percent of a distribution is below the 84th percentile) 1. Uniform distribution  a. equal number of values are to be expected for any given Percentile Rank 2. Changes in scores in the middle of a distribution (where most values are "clumped") are associated with larger changes in Percentile Rank than at extreme ends  a. Percentile Rank increases from the 50th to the 84th percentile when going from the mean to one standard deviation above the mean  b. Percentile Rank increases from the 84th to the 98th percentile when going from one to two standard deviations above the mean 3. Can determine Percentile Rank from standard score and vice-versa  a. 84 percent of a distribution falls below a z-score of 1 or a T-score of 60 (i.e., one standard deviation above the mean); as such, either of those two standard scores may also be referred to as the 84th percentile b. 84th percentile falls one standard deviation above the mean; as such, one may simply convert to a standard score (e.g., z-score = 1, T-score = 60)
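The z-score/percentile correspondence in point 3 follows from the normal cumulative distribution function, which the standard library can express via the error function (the helper name here is my own):

```python
from math import erf, sqrt

def percentile_from_z(z):
    """Percentage of a normal distribution falling below z (normal CDF)."""
    return 100 * 0.5 * (1 + erf(z / sqrt(2)))

print(round(percentile_from_z(0)))  # 50: half the distribution below the mean
print(round(percentile_from_z(1)))  # 84: one SD above the mean
print(round(percentile_from_z(2)))  # 98: two SDs above the mean
```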
Term
Frequency Distribution
Definition
Frequency Distribution: Graph presenting variable values on x (horizontal) axis and frequency of those values on y (vertical) axis; takes various shapes 1. Normal distribution a. bell-shaped  b. unimodal (one peak) c. symmetric (mirror images to either side of mean) d. mean = median = mode e. 68 percent of values within one standard deviation of mean, 95 percent within two SDs, >99 percent within three SDs 2. Skewed  a. values bunched on one end, tapering off to the other side b. positively skewed if trailing off to right or high side, negatively skewed if to the left or low side  c. mean pulled away from mode in direction of skew  d. avoid mean as measure of central tendency for skewed distributions 3. Bimodal a. two modes or peaks, one on either side of distribution 4. Uniform a. equal frequencies across distribution (i.e., a block) 5. J-shaped a. skewed, but without a tail on side of distribution with mode
Term
ANOVA (Analysis of Variance)
Definition
One-Way Analysis of Variance (ANOVA) 1. Tests for differences in one dependent variable (DV) across multiple levels (i.e., conditions) of one independent variable (IV) 2. Omnibus null hypothesis: all of the means are equal (IV has no effect on DV) 3. If rejecting null, concluding: not all of the means are equal (IV has an effect on DV) 4. Does not state which means differ from one another (must run post-hoc tests for that) Assumes 1. Independence of observations 2. Homogeneity of variances 3. Normality 4. Relatively robust to violations of the latter two F ratio 1. Variance between groups (error plus treatment) divided by variance within groups (error) 2. Mean square between (MSB)/mean square within (MSW) 3. Ratio of 1 signifies the lack of a treatment effect (i.e., with no treatment effect, no treatment variance; with no treatment variance, the F ratio is formed from error/error, yielding a value of 1) 4. If ratio is significantly larger than 1, may conclude means are farther apart than expected from sampling error: IV affects DV 5. If not, null hypothesis is retained: variability of sample means is the result of sampling error: IV does not affect DV Post-hoc tests 1. Significant ANOVA indicates mean differences exist; post-hoc tests tell specifically where the differences occur 2. Examination of which means differ from which 3. Examples: Scheffe (conservative: provides more protection against Type I errors and increases Type II likelihood), Tukey's HSD (strong protection against Type I errors), Fisher's LSD (liberal) Factorial (or n-way) ANOVA 1. n represents number of IVs, or factors 2. Used when examining effects of two or more IVs 3. Example: 2x2 design with two IVs, each with two levels (e.g., sex: female/male; instruction method: novel/traditional); main effects: difference across sexes, difference across instructional methods; interaction: differing effect of one IV at different levels of the other (e.g., women achieve higher scores than men, but only with the novel method of instruction); with an interaction, interpret main effects with caution (i.e., the story is more nuanced than a simple "women score higher than men"); if graphing means using separate lines for different levels of one IV, lack of an interaction is reflected in parallel lines Mixed-design ANOVA 1. Multiple IVs including both within-subjects (e.g., time) and between-subjects (e.g., condition) factors. Example: pre-/post-tests with control condition Analysis of covariance (ANCOVA) 1. Covariate (extraneous variable) is used to account for a portion of the variation in the DV 2. Covariate is continuous 3. Must be measured prior to the IV to ensure independence of treatment 4. Must be correlated with the DV 5. Ultimate goal: reduce error variation Multivariate analysis of variance (MANOVA) 1. Includes more than one DV 2. If obtaining a significant effect, usually follow with univariate ANOVAs for each DV to interpret 3. Advantages: protects against inflated alpha from numerous ANOVAs; with multiple DVs, might reveal effects not noted by separate ANOVAs 4. Disadvantages: more complicated design; ambiguous as to which IV affects which DV; increases in power perhaps offset by loss in degrees of freedom Multivariate analysis of covariance (MANCOVA) 1. As with ANCOVA, one or more covariates are added in order to reduce error variation Discriminant Function Analysis 1. Used to classify individuals into groups based on variables such as age, sex, and level of education.
Term
Variability
Definition
Variability (or spread) measures: 1) Range: difference between highest and lowest values; affected by extreme values/outliers 2) Variance: sum of squared deviations from the mean, divided by N-1; less susceptible to extreme values/outliers 3) Standard deviation: square root of variance. Measures of variation accounted for: 1) r-squared (single predictor), R-squared (multiple predictors): proportion of variation accounted for in one variable through linear relationship with another (or others) 2) Eta-squared: proportion of variation accounted for in one variable through relationship (not necessarily linear) with another (or others) 3) Squared factor loading: proportion of variation accounted for in one variable by a factor
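The variance definition above (squared deviations divided by N-1) translates directly to code; the data set below is an arbitrary example:

```python
from math import sqrt

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by N - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]
var = sample_variance(data)
print(round(var, 2))        # 4.57
print(round(sqrt(var), 2))  # 2.14: SD is the square root of variance
```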
Term
Cluster Sampling
Definition
Cluster Sampling: Sampling technique involving naturally occurring groups (clusters) 1. Population divided into clusters and some clusters randomly selected for inclusion in the sample 2. Study information collected from all elements within clusters included in the sample 3. Clusters should be internally heterogeneous yet relatively homogeneous among themselves (i.e., variation should be more within than between clusters) 4. Differences between cluster and stratified sampling: stratified sampling: a sample is drawn from every stratum; main objective is improved precision; cluster sampling: only elements of randomly selected clusters are studied; main objectives are to improve sampling efficiency and reduce cost. Multistage sampling: more complex form of cluster sampling; population divided into clusters at the highest level, a sample of clusters is drawn, then sampling is repeated within the selected clusters at the next level; procedure repeated until reaching the lowest hierarchical level
Term
Random Assignment & Random Selection
Definition
Random Selection: Drawing a sample from a population in such a way that each member has an equal probability of being selected 1) Supports external validity - findings from a sample representative of a population more generalizable Random Assignment: Assigning participants to experimental condition in such a way that there is an equal chance of appearing in any given condition (e.g., flip a coin to determine assignment to control or experimental condition) 1) Supports internal validity - renders conditions equivalent (i.e., similar), meaning independent variable should be the only factor varying among conditions 2) If random assignment not possible due to pre-existing groups (e.g., classrooms, schools), quasi-experiment is in order
Term
Chi-Square
Definition
Chi-Square test: Examines frequency distribution of categorical variables such as political party affiliation or eye color 1. Non-parametric test that does not require normality 2. Goodness-of-fit: one-way Chi-Square test for examining the frequency distribution of one variable a. May use expected frequencies given by an even split of sample size N across categories (e.g., with four categories, expect five observations in each given 20 people) b. May also use expected frequencies derived from knowledge of a comparison population (e.g., working on the assumption of 50 percent of people having brown eyes, 30 percent blue eyes, and 20 percent green or other, one would expect a sample of 50 people to produce 25 people with brown eyes, 15 with blue eyes, and 10 with green or other-color eyes) 3. Test for independence: two-way Chi-Square test examining a contingency table for two variables to determine whether they are independent (unrelated) a. If a relationship does exist, the frequency distribution of one variable will depend upon the other (e.g., more ill people among those exposed to a certain risk factor) b. Expected values determined by multiplying column total (e.g., number of ill people) by row total (e.g., number of people exposed to risk factor), then dividing by the total number of people observed (N) c. Assumes independent observations; each person appears once and only once in the table (e.g., a blue-eyed person would be counted once in the category "blue eyes," whereas an ill person who had been exposed to a risk factor would be counted only in the cell formed by the intersection of the ill and exposed categories) d. Requires counts, not percentages e. Requires expected counts of at least five 4. Degrees of freedom: number of categories minus one for the goodness-of-fit test; (rows - 1) x (columns - 1) for the test of independence.
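The goodness-of-fit statistic itself is just the sum of (observed - expected)^2 / expected across categories. A sketch using the card's even-split example of N = 20 over four categories (the observed counts are made up):

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Even split of N = 20 across four categories: expect 5 per cell
observed = [8, 6, 3, 3]
expected = [5, 5, 5, 5]
print(round(chi_square_gof(observed, expected), 2))  # 3.6
# Degrees of freedom = number of categories - 1 = 3
```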
Term
Moderating & Mediating Variables
Definition
Moderating Variable: A variable that affects the magnitude or direction of the relationship between the independent variable and the dependent variable 1. With correlations, a moderator is a third variable that affects the correlation between the independent variable and the dependent variable (e.g., a therapy has greater effect on depression as age increases) 2. With ANOVAs, the interaction between a moderator and the independent variable affects the dependent variable (e.g., a therapy has an effect on depression, but only among men) Mediating Variable: A variable explaining the process by which the independent variable affects the dependent variable (e.g., a therapy affects depression by creating a more positive self-image, which then lessens depression)
Term
Pooled Variance
Definition
Pooled Variance: Weighted average of two sample variances 1) Each sample variance multiplied by sample size minus one, then added to create weighted variance 2) Weighted variance divided by total sample size minus two (degrees of freedom for independent samples t-test) to obtain Pooled Variance 3) Assumes equal population variances 4) Provides better estimate of population variance than either sample variance alone 5) Each sample variance may be expected to differ from the corresponding population variance 6) By assuming equal population variances, both sample variances cast votes on (i.e., estimate) the population variance, then a weighted average of both obtains a pooled estimate based on more information than either sample possessed alone
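Points 1 and 2 above describe a weighted average with (n - 1) weights and an (n1 + n2 - 2) denominator; a small sketch (the variances and sample sizes are illustrative values):

```python
def pooled_variance(var1, n1, var2, n2):
    """Weighted average of two sample variances: each variance is
    weighted by its sample size minus one, then the sum is divided
    by the total degrees of freedom (n1 + n2 - 2)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# (9 * 12 + 5 * 8) / 14 = 148 / 14
print(round(pooled_variance(12.0, 10, 8.0, 6), 2))  # 10.57
```

Note the result falls between the two sample variances, closer to the variance from the larger sample, as a weighted average should.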
Term
Variance
Definition
1. A measure of variability 2. The square of the standard deviation 3. Larger for wider distributions, smaller with tighter spreads 4. With a sample size N, (N-1) appears in the denominator of the sample Variance in order to correct for bias
Term
Statistical Significance
Definition
Statistical Significance: Results of an analysis reflect more than chance variation, so an effect (relationship, mean difference) most likely exists 1) p-value is the conditional probability of obtaining the results if the null hypothesis of no effect is true 2) p-value is compared to a conventionally determined alpha to determine statistical significance. alpha is the maximum acceptable likelihood of making a Type I error (mistakenly declaring an effect). If p > alpha, results are too likely to be due to chance, and no effect is concluded. If p is less than or equal to alpha, results are relatively unlikely to be attributable to chance alone, and the existence of an effect is concluded 3) Example: Does our test prep course affect math SAT scores? alternative, or research, hypothesis is that there is an effect; the mean of those taking the prep course is not equal to 500. null hypothesis is that there is no effect; the mean of those taking the prep course is the same as for those not taking it, or 500. entertain the null hypothesis for a moment: if the null were true, what is the probability (p-value) of having obtained results at least as extreme as those we obtained? If the p-value (e.g., .34) is greater than alpha (e.g., .05), the obtained results are not remarkable enough to conclude the difference is attributable to anything more than chance variation. If the p-value (e.g., .034) is less than or equal to alpha (e.g., .05), the obtained results are remarkable enough to conclude the difference is more than just a matter of chance variation (i.e., there is an effect) 4) Type I error: Concluding an effect exists when the obtained results were actually due to chance alone. probability of committing a Type I error is termed alpha 5) Type II error: Concluding no effect exists when there actually is an effect. probability of committing a Type II error is termed beta 6) Power, the probability of concluding an effect exists given that it does, is (1 – beta); it can be increased by: increasing effect magnitude. decreasing error variation. increasing sample size. increasing alpha (i.e., running a less conservative test). power analysis: calculating the sample size required to capture an effect. running a one-tailed, or directional, test (i.e., focusing specifically on higher or lower) rather than a two-tailed, or nondirectional, test looking more broadly for any effect 7) Familywise alpha: running multiple tests, each with its own alpha, results in a familywise alpha for the entire set approximately equal to the sum of all the alphas. example: running five t-tests, each with an alpha of .05, yields a familywise alpha of approximately .05 times 5, or .25. may correct (e.g., Bonferroni adjustment) or run an alternative test (e.g., MANOVA to replace multiple ANOVAs)
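The familywise alpha arithmetic in point 7 can be sketched in a few lines. A minimal illustration (function names are my own); note the sum of the alphas is only an approximation of the exact rate for independent tests:

```python
# Familywise alpha for k independent tests, each run at the same alpha.
# Exact rate: 1 - (1 - alpha)^k; the sum k * alpha is a rough upper approximation.

def familywise_alpha(alpha, k):
    """Probability of at least one Type I error across k independent tests."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha, k):
    """Per-test alpha that keeps the familywise rate near the nominal alpha."""
    return alpha / k

# Five t-tests at alpha = .05: the sum approximation gives .25;
# the exact familywise rate is a bit lower.
print(round(familywise_alpha(0.05, 5), 3))  # 0.226
print(bonferroni_alpha(0.05, 5))            # 0.01
```

With the Bonferroni adjustment, each of the five tests would be run at alpha = .01, keeping the familywise rate near .05.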
Term
The t-test
Definition
The t-test determines whether one mean equals a hypothesized value or whether two means are equal 1) The t statistic is calculated by dividing the difference between one sample mean and a hypothesized value, or the difference between two sample means, by the standard error of that particular difference statistic. If the t statistic is large enough, one may declare the difference significant (the mean is different from the hypothesized value, or the two means are different) 2) The t-test is more powerful (i.e., more likely to reject the null hypothesis) with: larger sample size(s). larger mean difference. smaller sample variation Specific t-tests 1) One-sample t-test: tests the hypothesis that a single sample mean is different from a specific hypothesized value 2) Independent-samples t-test: tests the hypothesis that two unrelated samples are different from each other 3) Related- or dependent-samples t-test: tests the hypothesis that the difference between two related samples (e.g., pre-/post-scores, scores of siblings) is not equal to 0 (i.e., the samples have different means)
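The one-sample t statistic can be computed directly from its definition. A minimal sketch with hypothetical prep-course scores (in practice the resulting t would be compared to a critical value from the t distribution, e.g., via a table or a stats library):

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic: (sample mean - hypothesized value) / standard error of the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se

# Hypothetical math SAT scores after a prep course; is the mean different from 500?
scores = [520, 540, 510, 530, 550, 500, 525, 535]
t = one_sample_t(scores, 500)
print(round(t, 2))  # positive t: sample mean is above 500
```

A larger mean difference, a larger n, or a smaller sample standard deviation would each make this t statistic larger, matching the power points above.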
Term
Sample Size
Definition
Sample: A subset of a population meant to provide a small snapshot representing the whole body 1) Random sampling (i.e., every member of the population has an equal probability of being selected) affords the best opportunity of obtaining representative samples (i.e., samples matching the characteristics of the population) Sample size: Number of observations (e.g., people) in a sample, denoted by N. Larger sample sizes are preferred in order to: 1) Ensure the sample adequately represents the population. example: if sampling two people from a neighborhood with an equal mix of women and men, one is just as likely to grab two women or two men as to grab one of each. if sampling 25 people from that neighborhood, it is far less likely the sample will be made up entirely of women or of men 2) Reduce sampling error. more representative samples provide statistics closer to the corresponding population parameters 3) Increase statistical power. with less error (i.e., noise), differences or relationships are more likely to be deemed statistically significant
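The neighborhood example works out as simple probability arithmetic. A minimal sketch (the function name is my own):

```python
# Probability that a simple random sample from a 50/50 population is all one sex.
def prob_all_same_sex(n):
    """P(all women) + P(all men), assuming equal proportions and independent draws."""
    return 2 * 0.5 ** n

print(prob_all_same_sex(2))   # 0.5 -- as likely as getting one of each
print(prob_all_same_sex(25))  # ~6e-08 -- essentially never
```

With N = 2, a single-sex sample is as likely as a mixed one; with N = 25 it is vanishingly rare, which is why larger samples better represent the population.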
Term
Central Limit Theorem
Definition
The Central Limit Theorem: increasing the size N of random samples drawn from a population will cause the distribution of the sample means to: 1. Form a more normal distribution (regardless of the shape of the population distribution) 2. With a mean equal to the population mean 3. And a standard deviation (i.e., the standard error of the mean) equal to the population standard deviation divided by the square root of N
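The theorem's second and third claims can be checked by simulation. A minimal sketch using a uniform (decidedly non-normal) population, with a fixed seed so the run is reproducible:

```python
import math
import random
import statistics

random.seed(0)

# Population: uniform on [0, 1]; population mean 0.5, population sd 1/sqrt(12).
POP_SD = 1 / math.sqrt(12)
N = 30        # size of each random sample
REPS = 5000   # number of sample means collected

sample_means = [statistics.mean(random.random() for _ in range(N))
                for _ in range(REPS)]

# CLT predictions: mean of the sample means ~= population mean (0.5),
# sd of the sample means (the standard error) ~= population sd / sqrt(N).
print(round(statistics.mean(sample_means), 2))   # ~0.5
print(round(POP_SD / math.sqrt(N), 3))           # 0.053 (predicted standard error)
print(round(statistics.stdev(sample_means), 3))  # observed sd, close to 0.053
```

The histogram of these 5,000 means would look approximately normal even though the underlying population is flat.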
Term
Correlation
Definition
Correlational (or observational) study: Examines the relationship between unmanipulated variables 1) Measures association; does not establish cause and effect. x may cause y, y may cause x, or a third variable z may drive both 2) Pearson's r: measure of the linear relationship between two variables. ranges from -1 (perfect negative relationship) through 0 (no relationship) to 1 (perfect positive relationship) 3) Assumptions: independent observations. linear relationship. bivariate normality (joint distribution of the two variables is normal) 4) Graphically presented with a scatterplot 5) Weakened by restriction of range 6) Susceptible to bivariate outliers (observations far from the means of both variables) 7) Suppressor variable may hide a correlation 8) Partial correlation: correlation between x and y after removing variation in each shared with a third variable z 9) Semipartial correlation: correlation between x and y after removing variation in x (and only x) shared with a third variable z 10) Coefficient of determination: r-squared. proportion of variation in one variable shared with another 11) Other correlation coefficients: point-biserial: one continuous variable, one dichotomous variable possessing only two values (e.g., female/male) -- alternative or supplement to t-test. biserial: two continuous variables, one made into an artificial dichotomy (dichotomized). phi: two dichotomous variables. tetrachoric: two artificially dichotomized variables. contingency: two nominal variables. Spearman's rho: two ordinal variables. eta: nonlinear relationships between two variables. canonical: two sets of variables, one representing multiple independent variables, the other multiple dependent variables; examines many-to-many rather than one-to-one or many-to-one relationships; produces multiple canonical correlation coefficients (the first accounting for the largest portion of the relationship). Logistic regression is used to predict discrete outcomes.
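Pearson's r can be computed directly from its definition (covariation divided by the product of the variables' variation). A minimal sketch with made-up data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 2))       # 0.77 -- strong positive linear relationship
print(round(r ** 2, 2))  # 0.6  -- coefficient of determination
```

Squaring r gives the coefficient of determination from point 10: here 60 percent of the variation in one variable is shared with the other.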
Term
Autocorrelation
Definition
1. Relationship between two values of the same variable measured at different times 2. Correlation between measurements of a dependent variable taken at different times from the same subject(s) 3. In regression analysis (analysis of the relationship between the independent variable and the dependent variable), autocorrelation tends to underestimate error terms, inflating t-values and reducing p-values (i.e., results are more likely to be deemed statistically significant) 4. If the researcher can determine the prediction error in one observation, then a good guess about the error in a similarly linked observation can be made 5. Determining those links also determines autocorrelation and thus allows more information to be extracted from the data 6. In a time-series design, autocorrelation serves two purposes: a. To detect non-randomness in the data b. To identify an appropriate time-series model if the data are not random 7. Consistent relationships in time-series data can be used to predict future values in the series -- the series has some ability to forecast itself
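A lag-1 autocorrelation (each value correlated with the next value in the same series) can be sketched as follows; the function name and data are my own illustration:

```python
def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one time step."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

# A steadily rising series is highly autocorrelated: each value predicts the next,
# which is what gives a time series some ability to forecast itself.
trend = [1, 2, 3, 4, 5, 6, 7, 8]
print(lag1_autocorrelation(trend))  # well above 0 -- strongly non-random
```

A value near zero would suggest the series is random; a large value flags the non-randomness mentioned in point 6a.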
Term
Effect Size
Definition
Effect-Size measures determine practical rather than statistical significance: "Is the effect large enough to matter?" versus "Does the effect exist?" 1. Used in meta-analyses to combine findings from multiple research studies because they are independent of sample size 2. Specific effect-size measure varies by context. Correlation: r-squared: proportion of variation in one variable accounted for by the linear relationship with another. Chi-square: Cramer's phi: strength of relationship between two variables in a contingency table. t-test: Cohen's d: difference between two group means in terms of a standard deviation (control group or pooled). ANOVA: eta-squared, omega-squared: proportion of variation in the DV accounted for by the IV
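Cohen's d with a pooled standard deviation can be computed directly from its definition. A minimal sketch with hypothetical group scores:

```python
import math
import statistics

def cohens_d(group1, group2):
    """Mean difference expressed in pooled-standard-deviation units."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
print(round(cohens_d(treatment, control), 2))  # 1.26 -- conventionally a large effect
```

Because d is expressed in standard-deviation units rather than raw units, effects from studies with different sample sizes and scales can be combined in a meta-analysis.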
Term
Confidence Interval
Definition
Confidence Interval: range of values centered at a sample statistic used to estimate the population parameter with a confidence of 100(1 – α) percent 1) If 100 confidence intervals for a population mean were created from 100 samples, 95 would be expected to contain the population mean when using an α of .05 2) Example: To create a 95 percent confidence interval for a sample mean: center at the sample mean. add/subtract from that point estimate the critical values of the test statistic multiplied by the standard error of the sample statistic. sample mean = 20, standard error = 0.1, using critical z-values (rounded to 2 for simplicity) given a large sample size and an α of .05: 20 + 2 * 0.1 = 20.2. 20 – 2 * 0.1 = 19.8. the 95 percent confidence interval ranges from 19.8 to 20.2 3) The sample has produced an estimate of the population mean that may be used instead of, or in addition to, hypothesis testing (i.e., determining whether the population mean is a particular value)
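The example above can be reproduced in a couple of lines (a minimal sketch; the exact large-sample critical z for α = .05 is 1.96, which the card rounds to 2):

```python
def confidence_interval(mean, standard_error, critical_value=1.96):
    """Point estimate +/- critical value * standard error."""
    margin = critical_value * standard_error
    return mean - margin, mean + margin

# The card's example, with the critical z rounded to 2 for simplicity.
low, high = confidence_interval(20, 0.1, critical_value=2)
print(low, high)  # 19.8 20.2
```

Using the unrounded default of 1.96 instead would give the slightly narrower interval (19.804, 20.196).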
Term
Coefficient of Determination (R-Squared)
Definition
Coefficient of (Multiple) Determination, R-Squared 1) Expresses the proportion of variation in a dependent variable accounted for by the independent variable(s) 2) Reflects the reduction in error achieved by using one or more independent variables to predict the dependent variable, as compared to using only the dependent variable's mean in making estimates 3) Not good for sample-to-sample comparison because of different variances in the dependent variable with each sample (i.e., differing values of total variation) 4) Higher with more variables. corrected with adjusted R-squared, which gives a more conservative (lower) estimate of the variation in the dependent variable accounted for 5) No matter how high the R-squared value, correlations are a matter of association, not necessarily causation
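Point 2's "reduction in error" reading of R-squared translates directly into code. A minimal sketch; the data and fitted predictions are my own illustration:

```python
def r_squared(y_actual, y_predicted):
    """1 - (error left after the model) / (error from using only the mean)."""
    mean_y = sum(y_actual) / len(y_actual)
    ss_total = sum((y - mean_y) ** 2 for y in y_actual)        # baseline error
    ss_error = sum((y - yhat) ** 2
                   for y, yhat in zip(y_actual, y_predicted))  # model error
    return 1 - ss_error / ss_total

y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # predictions from a least-squares line fit to x = 1..5
print(round(r_squared(y, y_hat), 2))  # 0.6 -- 60 percent of variation accounted for
```

When ss_error equals ss_total, the model does no better than the mean and R-squared is 0; when ss_error is 0, the model predicts perfectly and R-squared is 1.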
Term
Proband
Definition
Proband (aka index case): The first family member to seek professional attention for a disorder 1) Family research method focuses on patterns of disorder among probands and their relatives 2) Probands share a known proportion of genes with family members of varying degrees of relationship. if a genetic predisposition for the disorder exists, then relatives of probands will have the disorder at a higher rate than the general population (i.e., concordance between shared genes and disorder). example: 1 percent of the general population is diagnosed with schizophrenia; 10 percent of first-degree relatives of probands are diagnosed with schizophrenia
Term
Actuarial Data
Definition
Actuarial data: prediction based on statistical information (objective) rather than clinical judgment (subjective). related historically to actuarial risk calculations of births and deaths. Key researchers: Grove and Meehl, who compared actuarial prediction to clinical judgment. Predictions from actuarial data are equal to or better than predictions from clinical judgment; clinical judgment rarely outperformed actuarial data. Meta-analysis: of 136 studies, eight favored clinical judgment, 64 favored actuarial prediction, and 64 found them equivalent. Later meta-analyses: effect sizes for actuarial data about 10 percent better than clinical judgment; actuarial predictions of delinquent and criminal behaviors more accurate than clinical judgments. Ruled out alternative explanations such as the examiner's field of training, length of experience, and task-related experience
Term
Protocol Analysis
Definition
Protocol Analysis: Qualitative data analysis method involving verbalization of thoughts occurring while completing a given task 1) Assumes verbally expressing thoughts does not alter the sequence of thoughts required to perform a task 2) Obtained reports (protocols) are analyzed to gain an understanding of how participants solve problems 3) Other possible indicators in addition to verbal reports: reaction times. error rates. brain activation patterns. eye fixation sequences 4) High correspondence between thoughts and information/objects examined 5) One of the principal research methods in cognitive psychology, cognitive science, and behavior analysis
Term
Data Reduction Techniques
Definition
Data Reduction Techniques: methods by which the interrelationships between a set of variables are analyzed to produce a smaller number of dimensions (or factors) 1) Principal component analysis: analyzes all the variation in a set of observed variables to produce a smaller number of components (or factors). factor loading is the correlation between an observed variable and a given factor. eigenvalue is the amount of variation in the observed variables accounted for by a given factor. eigenvalue is the sum of the squared factor loadings for a given factor. preferred over principal axis factoring for data reduction: PAF is similar to PCA, but removes variation unique to the individual observed variables and instead analyzes only common variation in producing factors; it is preferred over PCA for finding underlying structure 2) Cluster analysis: "eyeballs" groups (or clusters) from correlation matrix two most highly correlated variables form nucleus of first cluster. other variables correlated with nucleus added to cluster. two highly correlated variables with low correlations for first cluster form nucleus for second cluster. variables correlated with second nucleus added to second cluster (and so on). 3) Latent trait analysis: form of factor analysis for categorical data frequently used in educational testing and psychological measurement. categorical variables (e.g., responses to a multiple-choice exam such as the SAT) are reduced to latent traits (e.g., academic ability)
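The relationship between eigenvalues and factor loadings in point 1 can be checked in the simplest possible case. For two standardized variables with correlation r, the correlation matrix [[1, r], [r, 1]] has first eigenvalue 1 + r, and each variable loads sqrt((1 + r) / 2) on the first component; a minimal sketch (the example r is my own):

```python
import math

# Two standardized variables with correlation r: first principal component.
r = 0.6
eigenvalue_1 = 1 + r                   # variation accounted for by component 1
loading = math.sqrt(eigenvalue_1 / 2)  # correlation of each variable with component 1

# Check: the eigenvalue equals the sum of the squared factor loadings.
print(round(loading, 3))                      # 0.894 -- each variable's loading
print(round(loading ** 2 + loading ** 2, 3))  # 1.6 -- recovers eigenvalue_1
```

Of the two units of total standardized variation, the first component here accounts for 1.6, i.e., 80 percent, which is why PCA can summarize correlated variables with fewer components.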
Term
Developmental Research
Definition
Developmental Research assesses changes over a period of time and consists of three designs 1) Longitudinal: one group is followed for an extended period of time (e.g., first-graders tracked for 12 years). lengthy time requirement a drawback. subject mortality (attrition) due to a number of factors (illness, relocation, etc.) a concern. lack of randomization another potential issue. history a primary threat to internal validity. can provide valuable qualitative and quantitative data 2) Cross-sectional: groups at each level are measured at the same time (e.g., 12 groups of students, one at each grade level, are observed at one time). assumes differences reflect natural development (i.e., a longitudinal study would produce similar findings). requires much less time than a longitudinal study. differences may be due to a cohort effect (i.e., group differences reflect different experiences rather than natural development) 3) Cross-sequential: combination of longitudinal and cross-sectional designs. different groups assessed repeatedly over time (e.g., groups of first-, fourth-, seventh-, and 10th-grade students are measured over three years, thus providing coverage from first through 12th grade). reduces the time required and minimizes assumptions/cohort effects
Term
ABAB Design
Definition
ABAB Design: a single-subject design in which a baseline measure of the dependent variable (e.g., depression) is obtained (A) before treatment is introduced (B), removed (A), and reintroduced (B) 1. If treatment has an effect, then the dependent variable will: deviate from baseline when it is introduced. return to baseline when removed. deviate again from baseline when reintroduced 2. If dependent variable does not return to baseline after removal of treatment, initial deviation may be attributable to a confounding variable 3. Treatment is reintroduced after second baseline in order to: further establish treatment effect. restore benefit of treatment to subject 4. ABA design common in basic research does not reintroduce treatment after second baseline 5. ABAB design may be altered to include different treatment: following baseline and treatment for therapy 1 (A1B1), study may be continued with a subsequent baseline and treatment series for therapy 2 (A2B2) and so on
Term
Double-Blind Design
Definition
Double-Blind Design: Participants and experimenters are blind (i.e., naïve) to experimental condition 1. Experimenters may unconsciously influence participant behavior to fit research hypotheses (expectancy effects) 2. Subtle cues such as tone of voice or posture may bias research results to fit expectations 3. An experimenter ignorant of experimental condition should be unable to influence in a manner fitting expectations 4. Variations single-blind study: participant is naïve to condition, experimenter is not. triple-blind study: in addition to participants and experimenter, others involved in the research (e.g., pharmacists, statisticians) are naïve as well
Term
Latin-Square Design
Definition
Latin-Square Design: counterbalancing technique used for multiple-treatment designs when order effects are possible and there are too many treatments for complete counterbalancing 1) Create a reduced number of treatment sequences equal to the number of different treatments (e.g., with four treatments: ABCD, DABC, CDAB, and BCDA) from the set of all possible treatment sequences. each treatment appears in each serial position only once. 2) Randomly assign participants to these treatment sequences 3) With an even number of treatments, a balanced Latin-Square Design may be used. each treatment appears before and after other treatments an equal number of times (e.g., treatment A appears twice before, and twice after, B) 4) May then analyze treatment and order effects 5) Does not completely control for order (not all possible treatment sequences used)
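A basic (unbalanced) Latin square like the card's four-treatment example can be built by cyclic rotation, so each treatment appears in each serial position exactly once. A minimal sketch; it produces the same four sequences as the card's example, though listed in a different order, and a *balanced* square would need a different construction:

```python
def latin_square(treatments):
    """Cyclic Latin square: each treatment appears once in each serial position."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

# Four treatments -> four sequences instead of all 24 possible orderings.
for sequence in latin_square(["A", "B", "C", "D"]):
    print("".join(sequence))
```

Participants would then be randomly assigned to these four sequences, letting order effects be analyzed without running every possible ordering.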