Term
Face Validity |
Definition
The items in the scale appear to be valid “on the face of it”: they seem relevant to the phenomenon of interest, and the items make sense “on the surface.” |
|
|
Term
Content Validity (AKA Object Validity) |
|
Definition
The extent to which items adequately reflect/represent all relevant facets of a concept. The content domain must be known or pre-specified, and items should cover all concepts defined in the conceptual definition. Used especially for skill-based tests (achievement, SAT, knowledge) and for assessments in the workplace. |
|
|
Term
Criterion Validity |
Definition
The key to criterion validity is (of course) a clear and unambiguous criterion, preferably one that is unimpeachable. The higher the correlation the better, the main requirement being statistical significance. Criterion validity is often (but not invariably) predictive, and it is typically atheoretical. Usually criterion validity reflects a test’s practical value as a tool for classification. Some examples: an employment test tracks job status (fired vs. promoted), an SAT score tracks college status or grades, a measure of health status. |
|
|
Term
Construct Validity |
Definition
The degree to which a test measures the theoretical construct (a concept with a proven track record in the literature) that it is designed to measure. Evidence is developed from multiple relationships with other variables, not just a single variable or a single theory. Evaluates how well the test fits into a pre-specified network of theories, aka a nomological network. To validate a construct one must specify which other known measures should, and should not, be correlated with the test or measure. Addresses the measure’s meaningfulness, its connection to other theories, and its relationship to other variables, other tests, and other measures. |
|
|
Term
Discriminant Validity |
Definition
The test of interest should differentiate between sub-groups within a relevant category. Note the relation between discriminant validity and both construct validity and criterion validity. Different measures should make important distinctions, OR a single measure should allow groups to be classified appropriately (e.g., with a Duncan range test or similar tool). |
|
|
Term
Convergent Validity |
Definition
Different measures of the same construct should be correlated, and members identified by related tests should overlap. |
|
|
Term
Threats to Internal Validity |
|
Definition
History, Maturation, Testing effects, Instrumentation, Reaction measures, Selection of participants, Sample attrition, Regression effects, Compensatory rivalry, Resentful demoralization |
|
|
Term
Levels of Measurement |
Definition
Nominal, Ordinal, Interval, Ratio |
|
|
Term
Test-Retest (Temporal) Reliability |
|
Definition
The measure is administered to someone and then readministered at a later time; correlations test how well the two sets of scores are related. It is important to consider the length of time between measurements (too short and people might remember and repeat their responses; too long and real changes may have occurred that affect their responses). |
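The correlation step described above can be sketched in Python. The data here are hypothetical (six respondents measured twice, weeks apart); the Pearson formula itself is standard.

```python
from math import sqrt

def pearson(x, y):
    # Pearson product-moment correlation between two lists of scores
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# Hypothetical scores: the same 6 respondents, two administrations
time1 = [12, 15, 11, 18, 14, 16]
time2 = [13, 14, 10, 19, 15, 17]
print(round(pearson(time1, time2), 3))  # high value -> good temporal stability
```

A correlation near 1 suggests the measure is stable over the retest interval; a low value suggests memory effects, real change, or an unreliable instrument.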
|
|
Term
Split-half Reliability Coefficient |
|
Definition
Cut the measure in half and see how similar the responses are in the two halves using a correlation. Measures can be split by even- vs. odd-numbered items or by 1st vs. 2nd half. |
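A minimal sketch of the odd/even split, using hypothetical 6-item data. The Spearman-Brown correction at the end is a standard refinement (not named on the card) that projects the half-test correlation up to full-test length.

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation between two lists of scores
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def split_half_reliability(responses):
    # responses: one list of item scores per respondent
    odd = [sum(r[0::2]) for r in responses]   # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in responses]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    # Spearman-Brown correction: estimated full-length reliability
    return 2 * r_half / (1 + r_half)

# Hypothetical 6-item scale answered by 5 respondents
data = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
]
print(round(split_half_reliability(data), 3))
```

Splitting odd vs. even is usually preferred over 1st vs. 2nd half because fatigue and item-difficulty ordering affect both halves equally.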
|
|
Term
Item-Total Reliability (Internal Consistency) |
Definition
Correlate each item with the entire scale excluding that item, then obtain the average item/scale correlation. The conventional floor is 0.7 in peer-reviewed journals. |
|
|
Term
Alternate Forms Reliability |
|
Definition
Develop two versions of the same measure and correlate respondents’ scores on the two forms. |
|
|
Term
Inter-Rater Reliability |
Definition
Measures the extent to which two or more observers, interviewers, or coders get equivalent results using the same instrument. |
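One common index of inter-rater agreement is Cohen's kappa, which corrects raw agreement for chance; the card does not name a specific statistic, so treat this as an illustrative choice, with hypothetical codes assigned by two coders.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    # kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    p_chance = sum(ca[c] * cb[c] for c in ca) / (n * n)
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical codes assigned by two coders to 10 interview excerpts
a = ["pos", "pos", "neg", "neg", "pos", "neu", "neg", "pos", "neu", "neg"]
b = ["pos", "pos", "neg", "pos", "pos", "neu", "neg", "pos", "neg", "neg"]
print(round(cohens_kappa(a, b), 3))
```

Kappa is preferred over simple percent agreement because two coders who guess at random will still agree some of the time.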
|
|
Term
Conceptual Definition |
Definition
Describes what a term means. |
|
|
Term
Operational Definition |
Definition
Describes how you will measure the term; used to make sure terms have meaning in the sense of verifiability. |
|
|
Term
|
Definition
3 conflicting goals: generalizability; precision in control and measurement of variables; realism with respect to context |
|
|
Term
Deduction |
Definition
Spelling out the logical implications of what is already known or assumed. |
|
|
Term
Induction |
Definition
The process of developing a hypothesis by generalizing from specific instances. |
|
|
Term
Hypothesis testing may be done for: |
|
Definition
Discovery, Demonstration, Refutation, Replication, Amelioration |
|
|
Term
A well-formed hypothesis must be: |
|
Definition
Testable, Relevant, Verifiably Predictive, Parsimonious |
|
|
Term
Type I Error |
Definition
The incorrect rejection of a true null hypothesis; a false positive. A Type I error usually leads one to conclude that a supposed effect or relationship exists when in fact it doesn’t. The significance level (α) is the probability of making a Type I error; the p value is the probability, if the null hypothesis were true, of obtaining a result at least as extreme as the one observed. Note that α is set by the researcher, so increasing N (sample size) does not by itself reduce the probability of a Type I error; larger samples instead reduce the risk of a Type II error. |
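A quick simulation makes the role of α concrete. This is an illustrative sketch (not from the card): samples are drawn under a true null hypothesis, so every rejection is by definition a Type I error, and the rejection rate hovers near α whatever the sample size.

```python
import random

random.seed(42)

def false_positive_rate(n, trials=20000, z_crit=1.96):
    # The null is TRUE here: data really come from N(0, 1), so
    # every rejection is a Type I error; alpha, not n, sets the rate
    rejections = 0
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(n)]
        z = (sum(xs) / n) * (n ** 0.5)  # z-statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate_small = false_positive_rate(10)
rate_large = false_positive_rate(100)
print(rate_small, rate_large)  # both hover near alpha = 0.05
```

The critical value 1.96 corresponds to a two-sided test at α = 0.05; changing α, not n, is what moves the false-positive rate.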
|
|
Term
Type II Error |
Definition
The failure to reject a false null hypothesis; a false negative. A Type II error leads one to conclude that a supposed effect or relationship doesn’t exist when in fact it does. β is the probability of making a Type II error, and statistical power (1 − β) is the probability of avoiding one. A large N increases statistical power and reduces the probability of a Type II error. |
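The link between N and power can be simulated the same way. In this illustrative sketch the null is false (the true mean is a hypothetical effect of 0.5), so every failure to reject is a Type II error, and the hit rate (power) climbs with sample size.

```python
import random

random.seed(7)

def power_estimate(n, delta=0.5, trials=5000, z_crit=1.96):
    # The null is FALSE here: the true mean is delta, so every
    # non-rejection is a Type II error; power = rejection (hit) rate
    hits = 0
    for _ in range(trials):
        xs = [random.gauss(delta, 1) for _ in range(n)]
        z = (sum(xs) / n) * (n ** 0.5)  # z-test with known sigma = 1
        if abs(z) > z_crit:
            hits += 1
    return hits / trials

power_small = power_estimate(10)   # modest power at n = 10
power_large = power_estimate(50)   # much higher power at n = 50
print(power_small, power_large)
```

Note the contrast with the Type I simulation: increasing n leaves the false-positive rate at α but sharply shrinks β, the false-negative rate.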
|
|