Term
The goal of factor analysis |
|
Definition
to summarize the interrelationships among variables (items) in a concise but accurate manner as an aid in conceptualization |
|
|
Term
What does factor analysis do? |
|
Definition
It analyzes the relationships among a set of variables across a set of individuals, and assumes there are more important, but hidden, variables (factors) causing these relationships |
|
|
Term
Factor |
Definition
an area of generalization that is qualitatively distinct from an area represented by any other factor |
|
|
Term
Separate factors |
Definition
areas that are qualitatively different where relatively little generalization can be made from one area to another |
|
|
Term
What does factor analysis allow us to do? |
|
Definition
Factor analysis allows finite humans to analyze numerous variables at a time, to unravel relationships among variables correlated in highly complex ways, to report graduated relationships of variables to factors, and to arrive at parsimonious solutions |
|
|
Term
Constructs built on factor analysis are.... |
|
Definition
conceptually clearer than the a priori ideas, and these constructs are integrated through the development of theories; these theories are generally interested in summarizing major variations in the data, and minor variations are generally excluded |
|
|
Term
Plasmodes |
Definition
demonstration problems that check the validity and robustness of factor analysis; they are often from the physical world (Thurstone’s box problem) where the factor structure is universally known |
|
|
Term
Uses of factor analysis |
Definition
• Search data for possible qualitative and quantitative distinctions; this is particularly useful when the sheer amount of available data exceeds comprehensibility (exploratory factor analysis)
• Test hypotheses regarding certain qualitative and/or quantitative distinctions in the data (confirmatory factor analysis)
• Analyze items for scale/subscale development
• Minimize the number of variables for further research, while maximizing the amount of information in the analysis; reduces the set of variables to the most important ones
• Orthogonalize (force to be uncorrelated) variables, typically the independent variables in a multivariate analysis, making the results easier to interpret |
|
|
Term
Paradigms in Factor Analysis |
|
Definition
these operate as philosophies guiding a person's approach to factor analysis
Mathematical vs. Scientific • Factors are inventions vs. Factors are real |
|
|
Term
Mathematical approach |
Definition
the stress is upon the derivation and exact computation of all the procedures; procedures which only give estimates are much less valued than procedures which can be exactly calculated |
|
|
Term
Scientific approach |
Definition
concerned with factor analyzing data in order to build and test scientific theories; more concerned about the congruence of the results with future factor analyses than whether or not the procedures can be derived or calculated directly, and estimates which are consistently given across several sets of data may be more highly valued than an exact computation which does not generalize as well |
|
|
Term
Factors are inventions |
Definition
all scientific constructs, including factors, are human inventions to help us organize our thoughts about data
• We cannot know underlying realities, but only their surface manifestations. • We integrate these in many ways; since all of these integrations are philosophically equal, we are free to pick the one that is the simplest for us to understand and use |
|
|
Term
Factors are real |
Definition
factors are expressions of underlying realities with the factors waiting to be discovered
• Cattell notes that in our data it is not appropriate to say that a person’s response to one item influences their response to another item • There must be a third, latent (but real), variable that is influencing them to answer two items the same way |
|
|
Term
|
Definition
data that has good content validity can be designed so that it determines the outcome rather than the investigator, and if the procedures are applied objectively, the results will be independent of the investigator and the particular variables |
|
|
Term
Those who hold that factors are only human constructs tend to.... |
|
Definition
emphasize the mathematical approach |
|
|
Term
those who emphasize that factors are real... |
|
Definition
emphasize the scientific approach because these are the most replicable |
|
|
Term
Criteria for Choosing in Factor Analytic Decisions |
|
Definition
Generalizability • Simplicity |
|
|
Term
Generalizability |
Definition
o Robustness across random samplings of subjects or variables (replication), and invariance across theoretically relevant samples of both subjects (e.g., gender) and variables (e.g., multi-method measures)
o The prime function of science is to find such generalizability and to develop theories that interrelate and explain it |
|
|
Term
Simplicity (parsimony or elegance) |
|
Definition
o Whenever there are two explanations for a given phenomenon, the simpler of the two is accepted o UNLESS the more complex one has a clear advantage in terms of explaining more of the data |
|
|
Term
DECISION POINTS IN EXPLORATORY FACTOR ANALYSIS |
|
Definition
Number of Factors • Variables • Subjects • Analysis Matrix • Factor Extraction Methods • Factor Loading • Communalities • Factor Rotation |
|
|
Term
Number of Factors |
Definition
Usually the first question in the factor analyst’s mind
The most subjective decision, but very important • When too few factors are extracted, they are likely to include considerable error • Too many factors is not as serious a problem, but over-extraction of more than a couple of factors also leads to error • When in doubt, extract an extra factor and then see if it generalizes in new studies |
|
|
Term
Variables |
Definition
• Variables with good reliabilities and high interrelationships are best, but factor analysis will also work with items • It is best to have at least 5 variables/factor • You can often get away with as few as 3 variables/factor |
|
|
Term
Subjects |
Definition
• It is best to have at least 20 subjects/variable • You can often get away with as few as 10 subjects/variable • 100 total subjects is the minimum
However, IF the variables have good reliability AND there are low correlations between the factors, then even as few as 2 variables/factor and 5 subjects/variable are OK |
|
|
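For concreteness, a tiny Python sketch of these rules of thumb; the helper name and structure are my own encoding of the guidelines above, not anything from a package or the course:

```python
# Hypothetical helper encoding the rules of thumb above:
# 20 subjects/variable is best, 10 is often acceptable, 100 total is the floor.
def minimum_subjects(n_variables, strict=True):
    per_variable = 20 if strict else 10
    return max(100, per_variable * n_variables)

print(minimum_subjects(15))         # 300 subjects preferred for 15 variables
print(minimum_subjects(15, False))  # 150 may be acceptable
print(minimum_subjects(4, False))   # 100: the absolute floor applies
```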
Term
Analysis Matrix |
Definition
Only product moment correlations (SPSS default) necessarily give a legitimate correlation matrix, and these are the only correlation coefficients recommended for exploratory factor analysis; do not use the covariance matrix |
|
|
Term
Factor Extraction Methods |
|
Definition
• (Mathematical/Invented) Principal Components (SPSS default): a linear transformation of the data that perfectly reproduces the variance of each variable across all components o Assumes perfect measurement, no error term. o Easier for pre-computer calculations. o Only use with variables having strong reliability. o Especially bad for items. • (Scientific/Real) Common factors (principal axes): a linear combination of variables that reproduces the significant variance of the matrix and assumes leftover variance is trivial and can be ignored o Assumes that the variables are fallible, has error term. o Works well on items and practically all variables. o Much preferred.
The standard practice in principal component analysis is not to keep all of the components for analysis, so Mathematical/Invented researchers also end up ignoring variance. |
|
|
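To make the components idea concrete, here is a minimal numpy sketch (not from the course materials): with unities on the diagonal of a correlation matrix, component loadings are the eigenvectors scaled by the square roots of their eigenvalues. The matrix R below is an invented toy example.

```python
import numpy as np

# Toy correlation matrix; values are invented for illustration only.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

vals, vecs = np.linalg.eigh(R)          # eigenvalues in ascending order
order = np.argsort(vals)[::-1]          # reorder largest-first
vals, vecs = vals[order], vecs[:, order]
loadings = vecs * np.sqrt(vals)         # principal component loadings
print(loadings[:, 0])                   # loadings on the first component
```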
Term
Factor loading |
Definition
the measure of the degree of generalizability between each variable and each factor; the farther the factor loading is from 0, the more one can generalize from that factor to the variable. Loadings can be negative or positive. |
|
|
Term
The usual procedure in science (in relation to factor loading) is |
|
Definition
to maximize fidelity with the data and, if any error might be made, to make that error in the conservative direction; only common factors fit the scientific paradigm |
|
|
Term
Common factors are also best because |
|
Definition
they are more accurate with fewer than 40 variables; however, once you have more than 40 variables, the extraction method does not matter. |
|
|
Term
Communalities (the reliability of the variables) |
|
Definition
• (M/I) Components use 1.0. This made pre-computer calculations easier. In modern software, usually mathematical iterations are used to increase fit to the data, again a concession to the real-world lack of perfect reliability (SPSS default for components). • (S/R) Common factors (principal axes) typically use the variable's squared multiple correlation with all the other variables, and then use mathematical iterations to increase accuracy (SPSS default for principal axes). • Gorsuch strongly recommends using no more than 3 iterations because otherwise the communality estimates can become artificially inflated (SPSS default is 25). |
|
|
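A hedged numpy sketch of the common-factor procedure described above: start from squared multiple correlations and refine with at most 3 iterations, per Gorsuch's recommendation. The function name is mine, and it assumes R is an invertible correlation matrix.

```python
import numpy as np

def principal_axes(R, n_factors, n_iter=3):
    # SMC of each variable with all the others: 1 - 1/diag(inv(R)).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    Rr = R.copy()
    for _ in range(n_iter):                        # Gorsuch: no more than 3
        np.fill_diagonal(Rr, h2)                   # reduced correlation matrix
        vals, vecs = np.linalg.eigh(Rr)
        idx = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))
        h2 = (loadings ** 2).sum(axis=1)           # updated communalities
    return loadings, h2
```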
Term
Factor Rotation |
Definition
Always rotate your factors if you have more than one; rotation appropriately spreads the variance among the extracted factors (SPSS default is no rotation). • (M/I) Orthogonal: forces uncorrelated factors; varimax is best because it produces the clearest factor structure. (Boyd prefers this for building scales with items; the summed item scales will then revert to their natural correlations) • (S/R) Oblique: factors allowed to be correlated; promax (kappa = 3) is best because it most closely reproduces the actual correlations between the variable scales (Boyd prefers this for exploratory factor analysis) |
|
|
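To show what "spreading the variance" looks like in practice, here is a compact sketch of the classic varimax algorithm (Kaiser's criterion maximized through SVD updates); `loadings` is assumed to be a variables x factors matrix from the extraction step, and the code is an illustration rather than SPSS's implementation.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        L = loadings @ rotation
        # SVD step that increases the varimax criterion.
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))))
        rotation = u @ vt
        if s.sum() < var * (1 + tol):   # stop when the criterion plateaus
            break
        var = s.sum()
    return loadings @ rotation
```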
Term
Orthogonal factors |
Definition
are the most conceptually clear and distinct |
|
|
Term
Oblique factors |
Definition
are the most realistic, generalize the best, and can be either correlated or uncorrelated depending upon the data
allow the possibility of a higher-order factor analysis |
|
|
Term
Higher order factor analysis |
|
Definition
make the primary factors into variables, and then factor these variables; always do a higher order factor analysis when there are substantial correlations among the oblique factors |
|
|
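A small sketch of the higher-order step: treat the intercorrelation matrix of the oblique primary factors (often called Phi) as the data and extract from it. The Phi values below are invented, and the sign of an eigenvector is arbitrary.

```python
import numpy as np

# Hypothetical correlation matrix among three oblique primary factors.
Phi = np.array([[1.0, 0.5, 0.4],
                [0.5, 1.0, 0.6],
                [0.4, 0.6, 1.0]])

vals, vecs = np.linalg.eigh(Phi)
top = np.argmax(vals)
# Primary factors' loadings on the single higher-order factor.
second_order = vecs[:, top] * np.sqrt(vals[top])
print(second_order)
```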
Term
How to interpret the higher order factor |
|
Definition
The correlation of the higher order factors with the original variables should then be computed to help interpret the higher order factor |
|
|
Term
Rotated Factor Structure (NOT Factor Pattern) Matrix |
|
Definition
the variables’ correlations with the rotated factors • By examining which variables load high with the factor and which load low, it is possible to draw some conclusions as to the nature of the factor • The factor can then be named |
|
|
Term
Salient loading |
Definition
one that is sufficiently high to assume that a relationship exists between the variable and the factor; .3 or higher has come to be seen as the absolute lower bound for a salient loading, and many factor analysts use .4 |
|
|
Term
Cross-loading |
Definition
When a variable saliently loads on more than one factor |
|
|
Term
Eigenvalues |
Definition
a measure of the variance accounted for by all possible factors; in principal components the sum of the eigenvalues equals the number of the extracted components (found in the Total Variance Explained SPSS output table) |
|
|
Term
Roots greater than 1 (Kaiser criterion) |
Definition
the most popular criterion; the principal components eigenvalues are examined, and the researcher counts the number of those greater than one, and that number is the number of factors; also called the Kaiser criterion, or eigenvalues > 1 • NEVER USE as it is only accurate by accident of the variables/factors ratio and the variable intercorrelations • It is especially poor with items |
|
|
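Purely to make the (discouraged) count concrete, a two-line sketch over the toy correlation matrix used earlier; see the caveat above before using it.

```python
import numpy as np

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
eigvals = np.linalg.eigvalsh(R)
print(int((eigvals > 1.0).sum()))   # the roots > 1 count
```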
Term
|
Definition
various numbers of factors are extracted, with different rotations, and the researcher selects the one that makes the most theoretical sense • Better than roots > 1, which is accurate by chance • Subjective, susceptible to researcher bias |
|
|
Term
Scree test |
Definition
the principal component eigenvalues are plotted in order of size: • The correct number of factors coincides with the point on the eigenvalues plot where the downward descending curve straightens (or breaks) into a much gentler and even slope • This gentle even slope (the scree) represents trivial/random variance factors • Requires the use of a straight edge or ruler |
|
|
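A minimal matplotlib sketch of a scree plot; the eigenvalues are invented toy values chosen to show a break after the third component.

```python
import numpy as np
import matplotlib.pyplot as plt

eigvals = np.array([4.2, 2.1, 1.1, 0.5, 0.45, 0.4, 0.35, 0.3])  # toy values
plt.plot(np.arange(1, len(eigvals) + 1), eigvals, "o-")
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```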
Term
Rules for the scree test |
Definition
1. The scree must contain ≥ 3 points in a row 2. There can be a little variance of the scree points about the line, but not much (Boyd says all points should touch the line) 3. The slope of the scree must not approach the vertical 4. There is generally a relatively sharp, even if small, break in vertical level between the last point on the scree and the next point above |
|
|
Term
Boyd/Gorsuch Salient Loadings criterion: |
|
Definition
• More factors are extracted than will likely be kept (scree criterion + ~50% more will usually give a safe starting point) |
|
|
Term
Boyd/Gorsuch Salient Loadings criterion: • A SIGNIFICANT factor has: |
|
Definition
o At least 3 variables that load highest on it at .40 or greater o Alternatively, at least 2 variables that load highest on it at .50 or greater o Alternatively, at least 1 variable that loads highest on it at .60 or greater o In addition, none of the above loadings may have cross-loadings nearer than .10 |
|
|
Term
Boyd/Gorsuch Salient Loadings criterion: A TRIVIAL factor |
|
Definition
does NOT have at least 3 (≥ .40), 2 (≥ .50), or 1 (≥ .60) variable(s) that load highest on it without being negated by cross-loadings nearer than .10 |
|
|
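A hedged sketch encoding the significant/trivial check above for one column of a rotated loading matrix; this is my own translation of the rules, and it assumes at least two factors so that cross-loadings exist.

```python
import numpy as np

def is_significant_factor(loadings, j):
    """loadings: variables x factors array of rotated loadings; j: factor index."""
    counts = {0.40: 0, 0.50: 0, 0.60: 0}
    for row in np.abs(loadings):
        target = row[j]
        next_best = np.delete(row, j).max()
        # Loads highest on factor j with no cross-loading nearer than .10.
        if target - next_best >= 0.10:
            for cut in counts:
                counts[cut] += target >= cut
    # At least 3 at .40, or 2 at .50, or 1 at .60.
    return counts[0.40] >= 3 or counts[0.50] >= 2 or counts[0.60] >= 1
```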
Term
The scale-builder's addendum |
Definition
for a factor to be significant, its salient items must also form a scale with a reliability of ≥ .60, indicating it could likely be developed into a scale |
|
|
Term
When a factor has a single salient loading: |
|
Definition
it is called a singlet factor • The item can be viewed as one that is reliable enough to be a single-item measure of the construct, but generally needs additional items to be developed into a good measure • The scale-builder’s addendum for singlet factors: o Check the item’s extracted (not initial) communality o If it is ≥ .60, it meets the criterion o Recall that the communalities are the estimates of the items’ reliabilities in the factor analysis |
|
|
Term
Other criteria beyond the scope of Psychometric Foundations |
|
Definition
parallel analysis, minimum average partial correlation, and factor replication/invariance |
|
|
Term
Factor Scores (creating a factor variable or scale) |
|
Definition
• (M/I) Components: regression scores are calculated (SPSS default for components) • (S/R) Common factors: regression scores are estimated (SPSS default for principal axes); in terms of generalization to a new sample, factor score estimates clearly work as well as or better than calculated factor scores • (S/R) Unit weights are actually better than regression weights unless your N is about 500 or more o They allow for less capitalization upon chance and are more stable, thus generalizing to new samples better o 1’s for items that load saliently, 0’s for items that don’t |
|
|
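A small sketch of unit-weight scoring as described above (the function name is mine): salient items get a weight of 1 (or -1 for reversed items), everything else 0, and the raw items are summed.

```python
import numpy as np

def unit_weight_scores(X, loadings, salient=0.40):
    """X: subjects x items data; loadings: items x factors rotated loadings."""
    weights = (np.abs(loadings) >= salient) * np.sign(loadings)
    return X @ weights   # subjects x factors unit-weight scale scores
```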
Term
Unit weight scales |
Definition
will contain the natural correlations between the items, whether they are created from varimax or promax factor loading tables |
|
|
Term
Promax factor regression scores |
Definition
scores with kappa = 3 tend to intercorrelate at levels similar to the unit weight scales (SPSS default kappa = 4, which tends to create correlations higher than the unit weight scales) |
|
|
Term
Varimax factor regression score |
|
Definition
variables will not be correlated |
|
|
Term
Validity |
Definition
does the measure assess the intended attribute well? |
|
|
Term
Construct validity |
Definition
measuring the psychological attribute adequately; the overall validity |
|
|
Term
Criterion-related (predictive) validity |
Definition
the measure shares a statistical relationship with a particular criterion |
|
|
Term
Content validity |
Definition
the items are a representative sampling from the pool of required content |
|
|
Term
Internal structure (factorial) validity |
|
Definition
the factor structure matches the theory |
|
|
Term
Response processes validity |
|
Definition
the process the respondents use when filling out the measure matches what the psychologist thinks they are using |
|
|
Term
Trait/state change validity |
|
Definition
a trait measure is stable, a state measure is changeable |
|
|
Term
Consequential validity |
Definition
the measure is used in an ethically-appropriate manner |
|
|
Term
Validation is based on... |
Definition
intuition, theory, and data; it is a matter of degree rather than an all-or-nothing property, and is an unending process |
|
|
Term
Content validity |
Definition
The adequacy with which a specified domain of content is sampled |
|
|
Term
Content validity rests upon |
|
Definition
a theoretical appeal to the correctness of content and the way it is presented |
|
|
Term
|
Definition
the extent to which one can generalize from a particular collection of items to all possible items in the construct |
|
|
Term
Face validity |
Definition
the extent to which the test taker, or a casual observer, feels the instrument measures what it is intended to measure; this is after an instrument is constructed, and has only minor relevance to content validity (done before construction)
Often underestimated; probably the most important problematic aspect of IQ tests |
|
|
Term
Predictive validity |
Definition
Using an instrument to estimate some criterion behavior that is external to the measuring instrument itself; the size of the correlation directly indicates the predictive validity |
|
|
Term
Predictive validity is determined by |
|
Definition
the correlation between predictor and criterion |
|
|
Term
connection between predictor and criterion |
|
Definition
Courts of law are increasingly demanding some logical connection between predictor and criterion because of the issue of cultural bias that has been raised; however, no amount of theory can substitute for a lack of correlation between predictor and criterion |
|
|
Term
|
Definition
something may occur to eliminate or minimize relevant differences on the predictor or criterion |
|
|
Term
A test may have good ______ validity but not have good _____ validity |
|
Definition
|
|
Term
Predictive validity coefficient |
|
Definition
the correlation between the predictor test and the criterion variable, which specifies the degree of validity of that generalization |
|
|
Term
Predictive validity coefficients based on |
|
Definition
a single predictor rarely exceed .3 to .4; people are too complex to permit a highly accurate estimation of their proficiency in most performance-related situations, and it's actually remarkable that predictor tests correlate as highly with criteria as they do |
|
|
Term
Things affecting predictive validity |
|
Definition
• Reliability of measures creating attenuated correlations • Restriction of range, often on criterion • Proportion of dichotomous criterion variable, or having only a few target cases • Time: correlations decrease as time elapses between the measures of the predictor and the criterion. • Types of events: single events difficult to predict, aggregated events easier |
|
|
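The first bullet (attenuation) has a simple classical-test-theory form worth seeing in code: an observed correlation is the true correlation shrunk by the square root of the product of the two measures' reliabilities. The function below is only an illustration of that formula.

```python
import math

def attenuated_r(true_r, rxx, ryy):
    # Classical test theory: r_observed = r_true * sqrt(rxx * ryy).
    return true_r * math.sqrt(rxx * ryy)

print(round(attenuated_r(0.50, 0.8, 0.7), 2))   # 0.37: a .50 relation observed as ~.37
```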
Term
Predictive validity for diagnostic tests |
|
Definition
Sensitivity and specificity • Measures are often better at one than the other o A highly sensitive measure, when negative, rules OUT a disorder o A highly specific measure, when positive, rules IN a disorder |
|
|
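A minimal sketch of the two diagnostic quantities from confusion-matrix counts, illustrating the rule-out/rule-in logic above; the counts passed in are invented.

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)   # proportion of actual cases the test detects

def specificity(tn, fp):
    return tn / (tn + fp)   # proportion of non-cases the test clears

print(sensitivity(tp=45, fn=5))    # 0.9: a negative result helps rule OUT
print(specificity(tn=80, fp=20))   # 0.8: a positive result helps rule IN
```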
Term
concurrent vs predictive validity |
|
Definition
Concurrent validity is not the same as predictive validity because there is no future prediction of a criterion; in concurrent validity the correlation is between the predictor test and another measure of the same construct in the same data collection |
|
|
Term
Construct |
Definition
something that psychologists theorize as existing that does not exist as an observable dimension of behavior; it is abstract and latent (hidden) rather than concrete and observable |
|
|
Term
Three aspects of construct validation |
|
Definition
• Specifying the domain of observables related to the construct (outlining the construct) • Determining the extent to which observables tend to measure the same thing, several different things, or many different things from empirical research and statistical analyses • Performing subsequent individual differences studies and/or experiments to determine the extent to which supposed measures of the construct are consistent with "best guesses" about the construct |
|
|
Term
The adequacy of a construct's outline is tested by |
|
Definition
determining how well the measures of observables "go together" empirically (intercorrelate); if you are investigating one thing, you want your items to vary in the same pattern |
|
|
Term
Internal consistency |
Definition
is necessary, but not sufficient, for construct validity; Cronbach's alpha is our best measure of internal consistency among items |
|
|
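Since alpha comes up here, a hedged numpy sketch of the standard coefficient alpha formula for a subjects x items score matrix: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score).

```python
import numpy as np

def cronbach_alpha(X):
    """X: subjects x items matrix of item scores."""
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)       # variance of each item
    total_var = X.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```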
Term
|
Definition
The degree to which two measures are affected similarly by a variety of experimental treatments defines their similarity; when a variety of measures behave similarly over a variety of experimental treatments, it becomes meaningful to speak of them as measuring the same construct; this involves both theory and correlations |
|
|
Term
Factor analytic/reliability process of construct validity |
|
Definition
• If all the proposed measures or items correlate highly with one another, it can be concluded that they all measure the same thing; keep investigating the construct • If the measures or items tend to split up into clusters such that the members of a cluster correlate highly with one another and correlate much less with the members of other clusters, they are measuring a number of different things; split the construct • A third possibility is that the correlations among the measures or items are all near zero, so that they measure different things and there is no meaningful construct unity; consider a new construct |
|
|
Term
The danger of construct validity |
|
Definition
is circular logic 1. Tests 1 and 2 correlate (a validity coefficient) 2. According to widely accepted theory, Constructs A and B correlate 3. Test 1 is a measure of Construct A 4. Test 2 is a measure of Construct B Only Hypothesis 1 can be tested directly; it is necessary to infer the truth of the other hypotheses from this test; the paradigm for determining construct validity is invalid from an inductive standpoint; it cannot be completely proven with data |
|
|
Term
|
Definition
Constructs should correlate highly with similar constructs, but it is not automatic that the construct name that motivated the research is appropriate; these correlations are validity coefficients, sometimes called concurrent validity. What is done in practice is to assume that at least two of hypotheses 2 through 4 are correct; an empirical test of hypothesis 1 then allows for a valid inference about the remaining hypothesis
Alternatively, Tests 1 and 2 may have both been designed to measure Construct A |
|
|
Term
The danger of circular logic can be lessened by |
|
Definition
• Restricting investigations of construct validity to those situations in which some of the hypotheses are very probable • Correlating supposed measures of constructs where the domain of one has previously been both well defined and highly restricted |
|
|
Term
Campbell and Fiske's Construct Validation: Multitrait-Multimethod Matrix |
|
Definition
Views reliability and validity as points along a continuum rather than as sharply distinguished ideas, since each involves degrees of agreement between measures • Validation is typically convergent because it is concerned with demonstrating that two independent methods of inferring an attribute lead to similar ends • A measure should have divergent validity in the sense of measuring something different from existing measures, and should not correlate to an extremely high degree with other measures, especially of different constructs • A measure is defined by both the content of its measured attribute (what it's measuring), and its method (how it's measuring it) • At least two attributes, each measured by at least two methods, are required (a minimum of 4 measures) |
|
|
Term
Reliability correlations |
Definition
the extent to which a measure is internally consistent. o F & B call these the monotrait-monomethod correlations o In parentheses |
|
|
Term
Trait correlations |
Definition
the relationship between two measures of the same attribute using different methods. o F & B call these the monotrait-heteromethod correlations o In bold |
|
|
Term
Method correlations |
Definition
the relationship between two measures that share a common method but assess different attributes o F & B call these the heterotrait-monomethod correlations o In the solid-line triangles |
|
|
Term
“Neither” correlations |
Definition
the relationship between different attributes using different methods o F & B call these the heterotrait-heteromethod correlations o In the dashed-line triangles |
|
|
Term
reliability vs trait vs method vs neither correlations |
|
Definition
One expects the reliability correlations to have the highest values, and the “neither” correlations to have the lowest; construct validation demands that trait correlations be relatively high to reflect convergent validity, and that method correlations are relatively low to reflect divergent validity |
|
|
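To make the four correlation types concrete, a sketch of the smallest possible MTMM layout: 2 traits x 2 methods, rows and columns ordered (T1M1, T2M1, T1M2, T2M2). All values are invented to show the expected ordering: reliability > trait > method > neither.

```python
import numpy as np

mtmm = np.array([
    # T1M1  T2M1  T1M2  T2M2
    [0.90, 0.30, 0.60, 0.15],   # T1M1 (reliabilities on the diagonal)
    [0.30, 0.85, 0.20, 0.55],   # T2M1
    [0.60, 0.20, 0.88, 0.25],   # T1M2
    [0.15, 0.55, 0.25, 0.86],   # T2M2
])

reliability = mtmm[0, 0]   # monotrait-monomethod
trait       = mtmm[2, 0]   # monotrait-heteromethod: same trait, different method
method      = mtmm[1, 0]   # heterotrait-monomethod: different trait, same method
neither     = mtmm[3, 0]   # heterotrait-heteromethod
print(reliability > trait > method > neither)   # True for a well-behaved measure
```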
Term
Reification |
Definition
treating a term as if it denotes a real entity or process; this has caused many problems in science |
|
|
Term
interpretation of constructs |
|
Definition
It would be good if words denoting constructs were altered as evidence is obtained about the relevant sets of observables, but this is unfortunately not done as frequently as it should be |
|
|
Term
Nunnally & Bernstein’s commonsense point of view of interpretation of constructs |
|
Definition
• If a measuring instrument produces interesting findings over the course of numerous investigations, and fits the construct name applied to it: o Continue using the instrument. o Continue using the construct name referring to it. • If the resulting evidence is poor, ask: o Is the instrument worth fixing? o Does the instrument really fit the name of the attribute used to describe it? • If it is not possible to find sets of variables or items that relate to the construct, the construct itself should be questioned |
|
|