Term
reliability coefficient |
|
Definition
an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance |
|
|
Term
variance |
|
Definition
standard deviation squared |
|
|
Term
true variance |
|
Definition
variance from true differences |
|
|
Term
error variance |
|
Definition
variance from irrelevant, random sources |
|
|
Term
reliability |
|
Definition
proportion of the total variance attributed to true variance |
|
|
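Worked relation (a sketch, not part of the original cards): the variance cards above fit together in one identity from classical true score theory. The symbols are chosen here for illustration: $\sigma^2$ is total variance, $\sigma^2_{tr}$ is true variance, $\sigma^2_{e}$ is error variance, and $r_{xx}$ is reliability.

```latex
\sigma^2 = \sigma^2_{tr} + \sigma^2_{e},
\qquad
r_{xx} = \frac{\sigma^2_{tr}}{\sigma^2}
```

For example, if 80 of 100 units of total variance are true variance, the reliability is .80.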
Term
item sampling / content sampling |
|
Definition
variation among items within a test as well as variance among items between tests |
|
|
Term
test-retest reliability |
|
Definition
an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test |
|
|
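Worked example (a minimal sketch, not part of the original cards): correlating pairs of scores from two administrations can be done with a Pearson correlation. The function name `pearson_r` and the score lists are hypothetical illustration data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on two administrations
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]
test_retest_estimate = pearson_r(first_administration, second_administration)
```

The closer the estimate is to 1, the more consistently people kept their relative standing across the two administrations.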
Term
coefficient of stability |
|
Definition
estimate of test-retest reliability obtained from tests separated by more than 6 months |
|
|
Term
coefficient of equivalence |
|
Definition
reliability coefficient expressing the degree of relationship between various forms of a test, evaluated by an alternate-forms or parallel-forms estimate of reliability |
|
|
Term
parallel forms |
|
Definition
different tests for which the means and variances of the observed scores are equal |
|
|
Term
alternate forms |
|
Definition
different versions of a test that have not been constructed to be parallel |
|
|
Term
internal consistency estimate of reliability / estimate of inter-item consistency |
|
Definition
estimate of reliability obtained without alternate forms or a test-retest administration, by measuring the internal consistency of the test items |
|
|
Term
split-half reliability |
|
Definition
obtained by correlating two pairs of scores from equivalent halves of a single test administered once |
|
|
Term
odd-even reliability |
|
Definition
a way of splitting a test in which all of the even numbered items comprise one set of scores and the odd numbered items comprise the other. |
|
|
Term
Spearman-Brown formula |
|
Definition
allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test |
|
|
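Worked example (a minimal sketch, not part of the original cards): an odd-even split followed by the Spearman-Brown correction. The function names and the item scores are hypothetical, and the half-test correlation of .70 is an assumed value rather than one computed from the sample data.

```python
def odd_even_totals(item_scores):
    """Split each person's item scores into odd- and even-numbered half totals."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd_totals, even_totals


def spearman_brown(r_half, n=2.0):
    """Estimated reliability of a test n times as long as the one that gave r_half."""
    return (n * r_half) / (1 + (n - 1) * r_half)


# Hypothetical item scores: one row per person, one column per item
scores = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
odd, even = odd_even_totals(scores)

# Assumed correlation of .70 between the two halves, stepped up to full length
full_test_estimate = spearman_brown(0.70)
```

With `n=2` the formula reduces to the familiar split-half form, 2r / (1 + r).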
Term
inter-item consistency |
|
Definition
the degree of correlation among all the items on a scale |
|
|
Term
|
Definition
an index of internal consistency. Tests are said to be homogeneous if they contain items that measure a single trait |
|
|
Term
heterogeneity |
|
Definition
the degree to which a test measures different factors |
|
|
Term
Kuder-Richardson formula 20 (KR-20) |
|
Definition
named because it is the 20th formula in a series; statistic of choice for determining inter-item consistency of dichotomous items |
|
|
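Worked example (a minimal sketch, not part of the original cards): KR-20 for dichotomous items, computed as (k / (k - 1)) * (1 - sum(pq) / total-score variance). The function name `kr20` and the response matrix are hypothetical, and the use of the population variance (dividing by n) is an assumption; some texts divide by n - 1.

```python
def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per person)."""
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 people and 2 items")
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores (assumption: divide by n)
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    # Sum over items of p * q, where p is the proportion answering correctly
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: four people, three dichotomous items
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
estimate = kr20(responses)
```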
Term
coefficient alpha |
|
Definition
may be thought of as the mean of all possible split-half correlations, corrected by the Spearman-Brown formula. |
|
|
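Worked example (a minimal sketch, not part of the original cards): coefficient alpha computed from item variances and total-score variance, (k / (k - 1)) * (1 - sum of item variances / total variance). The function name and the data are hypothetical, and population variances are assumed throughout.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for rows of item scores (one row per person)."""
    k = len(scores[0])
    if len(scores) < 2 or k < 2:
        raise ValueError("need at least 2 people and 2 items")
    # Variance of each item across people, then variance of total scores
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: four people, three items
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = coefficient_alpha(scores)
```

Unlike KR-20, the same function also accepts non-dichotomous item scores such as ratings on a 1-to-5 scale.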
Term
inter-scorer reliability |
|
Definition
the degree of agreement or consistency between two or more scorers |
|
|
Term
coefficient of inter-scorer reliability |
|
Definition
correlation coefficient for inter-scorer reliability |
|
|
Term
dynamic characteristic |
|
Definition
trait, state, or ability presumed to be ever-changing as a function of situational and cognitive differences |
|
|
Term
static characteristic |
|
Definition
trait, state, or ability presumed to be relatively unchanging |
|
|
Term
restriction of range / restriction of variance |
|
Definition
If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower |
|
|
Term
inflation of range / inflation of variance |
|
Definition
If the variance of either variable in a correlational analysis is inflated by the sampling procedure, then the resulting correlation coefficient tends to be higher |
|
|
Term
power test |
|
Definition
when enough time is given but items are too difficult for one to get a perfect score |
|
|
Term
speed test |
|
Definition
contains items of uniform difficulty but time is restricted so that no one can obtain a perfect score by answering all the items |
|
|
Term
criterion-referenced test |
|
Definition
designed to provide an indication of where a testtaker stands with respect to some variable or criterion |
|
|
Term
generalizability theory |
|
Definition
an extension of true score theory wherein the concept of a universe score replaces that of a true score |
|
|
Term
|
Definition
test situations that influence scores |
|
|
Term
facets |
|
Definition
things like the number of test items, which comprise the universe |
|
|
Term
generalizability study |
|
Definition
examines how generalizable scores from a particular test are if the test is administered in different situations |
|
|
Term
coefficients of generalizability |
|
Definition
representation of the influence of particular facets on the test score |
|
|
Term
decision study |
|
Definition
study wherein developers examine the usefulness of test scores in helping the test user make decisions |
|
|
Term
item response theory (IRT) |
|
Definition
provides a way to model the probability that a person with X ability will be able to perform at a level of Y |
|
|
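Worked example (a minimal sketch, not part of the original cards): one common way to model this probability is a two-parameter logistic function. The choice of that model, the function name, and the parameter names (`theta` for ability, `a` for item discrimination, `b` for item difficulty) are assumptions made here for illustration.

```python
from math import exp


def prob_keyed_response(theta, a=1.0, b=0.0):
    """Two-parameter logistic model: probability of a keyed response.

    theta: person's ability level
    a: item discrimination (steepness of the curve)
    b: item difficulty (ability at which the probability is .50)
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


# A person whose ability equals the item difficulty has a 50% chance
p_at_difficulty = prob_keyed_response(theta=0.0, a=1.5, b=0.0)

# A more discriminating item separates the same two ability levels more sharply
gap_high_a = prob_keyed_response(1.0, a=2.0) - prob_keyed_response(-1.0, a=2.0)
gap_low_a = prob_keyed_response(1.0, a=0.5) - prob_keyed_response(-1.0, a=0.5)
```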
Term
discrimination |
|
Definition
the degree to which an item differentiates among people with higher or lower levels of the trait, ability, or whatever it is that is being measured |
|
|
Term
dichotomous test items |
|
Definition
test items that can be answered with one of two responses (yes/no, true/false) |
|
|
Term
polytomous test items |
|
Definition
test items with three or more alternate responses where only one is scored correct |
|
|
Term
standard error of measurement (SEM) |
|
Definition
the tool used to estimate the extent to which an observed score deviates from a true score |
|
|
Term
standard error of a measurement |
|
Definition
the tool used to estimate the extent to which an observed score deviates from a true score |
|
|
Term
confidence interval |
|
Definition
a range or band of test scores that is likely to contain a true score |
|
|
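Worked example (a minimal sketch, not part of the original cards): the standard error of measurement is commonly computed as the standard deviation times the square root of one minus the reliability coefficient, and a confidence interval is the observed score plus or minus a z multiple of it. The function names and the numbers (SD of 15, reliability of .91, observed score of 100) are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = sd * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def confidence_interval(observed, sd, reliability, z=1.96):
    """Band around an observed score; z=1.96 gives roughly 95% confidence."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD of 15, reliability of .91, observed score of 100
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(100, 15, 0.91)
```

The more reliable the test, the smaller the SEM and the narrower the band around the observed score.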
Term
standard error of the difference |
|
Definition
a statistical measure that can determine how large a difference should be before it is considered statistically significant |
|
|
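Worked example (a minimal sketch, not part of the original cards): when two scores are on the same scale, the standard error of the difference can be written as sd * sqrt(2 - r1 - r2), where r1 and r2 are the reliability coefficients of the two tests. The function names, the shared standard deviation, and the sample numbers are assumptions for illustration.

```python
from math import sqrt


def standard_error_of_difference(sd, r1, r2):
    """sd * sqrt(2 - r1 - r2); assumes both scores share the same sd."""
    return sd * sqrt(2 - r1 - r2)


def is_significant_difference(score_1, score_2, sd, r1, r2, z=1.96):
    """True if the gap between the two scores exceeds z standard errors."""
    threshold = z * standard_error_of_difference(sd, r1, r2)
    return abs(score_1 - score_2) > threshold


# Hypothetical tests: SD of 15 and reliability of .91 for each
sed = standard_error_of_difference(15, 0.91, 0.91)
ten_points = is_significant_difference(110, 100, 15, 0.91, 0.91)
fifteen_points = is_significant_difference(115, 100, 15, 0.91, 0.91)
```

With these numbers a 10-point gap falls inside the threshold and a 15-point gap exceeds it.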
Term
validity |
|
Definition
estimate of how well a test measures what it purports to measure in a particular context |
|
|
Term
validation |
|
Definition
the process of gathering evidence about validity |
|
|
Term
|
Definition
studies of a test's validity |
|
|
Term
local validation studies |
|
Definition
absolutely necessary when the test user plans to alter the test in some way |
|
|
Term
face validity |
|
Definition
relates more to what a test appears to measure to the person being tested than to what it actually measures |
|
|
Term
content validity |
|
Definition
describes a judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample |
|
|
Term
test blueprint |
|
Definition
a plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, the organization of the items, and so forth |
|
|
Term
content validity ratio (CVR) |
|
Definition
method for determining if an item is essential as rated by several raters |
|
|
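Worked example (a minimal sketch, not part of the original cards): Lawshe's content validity ratio is (n_e - N/2) / (N/2), where n_e is the number of raters who mark the item essential and N is the total number of raters. The function name and the panel sizes are hypothetical.

```python
def content_validity_ratio(n_essential, n_raters):
    """CVR = (n_essential - n_raters / 2) / (n_raters / 2)."""
    if n_raters <= 0 or not 0 <= n_essential <= n_raters:
        raise ValueError("need 0 <= n_essential <= n_raters and n_raters > 0")
    half = n_raters / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 raters
cvr_strong = content_validity_ratio(9, 10)   # most raters say "essential"
cvr_split = content_validity_ratio(5, 10)    # exactly half say "essential"
cvr_weak = content_validity_ratio(2, 10)     # few raters say "essential"
```

The ratio is negative when fewer than half the raters call the item essential, zero when exactly half do, and positive when more than half do.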
Term
criterion-related validity |
|
Definition
a judgment of how adequately a test can be used to infer an individual's most probable standing on some measure of interest |
|
|
Term
concurrent validity |
|
Definition
an index of the degree to which a test score is related to some criterion measure obtained at the same time |
|
|
Term
predictive validity |
|
Definition
an index of the degree to which a test score predicts some measure |
|
|
Term
criterion contamination |
|
Definition
the term applied to a criterion measure that has been based on predictor measures |
|
|
Term
criterion |
|
Definition
the standard against which a test or test score is evaluated |
|
|
Term
validity coefficient |
|
Definition
a correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure |
|
|
Term
incremental validity |
|
Definition
the degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use |
|
|
Term
expectancy data |
|
Definition
provide information that can be used to evaluate the criterion-related validity of a test |
|
|
Term
expectancy table |
|
Definition
shows the percentage of people within specified test-score intervals who subsequently were placed in various categories of the criterion (passed, failed, etc.) |
|
|
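Worked example (a minimal sketch, not part of the original cards): an expectancy table can be built by grouping people into test-score intervals and computing the percentage in each interval who later fell into a criterion category. The function name, the intervals, and the data are hypothetical, and only a single pass/fail criterion is tabulated.

```python
def expectancy_table(scores, passed, intervals):
    """Percentage who passed the criterion within each (low, high) score interval."""
    table = {}
    for low, high in intervals:
        # Collect criterion outcomes for everyone whose score is in the interval
        outcomes = [p for s, p in zip(scores, passed) if low <= s <= high]
        # None marks an interval with no testtakers
        table[(low, high)] = 100.0 * sum(outcomes) / len(outcomes) if outcomes else None
    return table


# Hypothetical test scores and later pass/fail status on the criterion
scores = [55, 62, 68, 71, 78, 85, 91, 95]
passed = [False, False, True, False, True, True, True, True]
table = expectancy_table(scores, passed, [(50, 69), (70, 89), (90, 100)])
```

In this made-up data the pass percentage rises across the three score intervals, which is the pattern a valid predictor would be expected to show.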
Term
expectancy chart |
|
Definition
graphic representation of the expectancy table |
|
|
Term
Taylor-Russell tables |
|
Definition
provide an estimate of the extent to which inclusion of a particular test in the selection system will actually improve selection. |
|
|
Term
Naylor-Shine tables |
|
Definition
used for obtaining the difference between the means of the selected and unselected groups to derive an index of what the test is adding to already established procedures |
|
|
Term
|
Definition
theory of exploring the utility of a test |
|
|
Term
|
Definition
statistical rules for developing a sequential analysis of a problem that would lead to an optimal decision |
|
|
Term
base rate |
|
Definition
extent to which a particular trait, behavior, characteristic, or attribute exists in the population |
|
|
Term
hit rate |
|
Definition
proportion of people a test accurately identifies as possessing or exhibiting a particular trait |
|
|
Term
miss rate |
|
Definition
proportion of people the test fails to identify as having a particular characteristic |
|
|
Term
false positive |
|
Definition
miss wherein the test predicted that the testtaker did possess the trait when in fact the testtaker did not |
|
|
Term
false negative |
|
Definition
miss wherein the test predicted that the testtaker did not possess the trait when in fact the testtaker did |
|
|
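Worked example (a minimal sketch, not part of the original cards): tallying test decisions against true status gives the base rate and the two kinds of misses. The function name and data are hypothetical. Treating the hit rate as the proportion of all testtakers classified correctly, and the miss rate as the proportion classified incorrectly, is one reading of the cards above and is an assumption here.

```python
def classification_summary(predicted, actual):
    """Compare test predictions with true status (lists of booleans)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    tp = fp = fn = tn = 0
    for pred, act in zip(predicted, actual):
        if pred and act:
            tp += 1   # test says trait present, and it is
        elif pred and not act:
            fp += 1   # false positive
        elif not pred and act:
            fn += 1   # false negative
        else:
            tn += 1   # test says trait absent, and it is
    n = len(actual)
    return {
        "false_positives": fp,
        "false_negatives": fn,
        "base_rate": (tp + fn) / n,   # proportion who truly have the trait
        "hit_rate": (tp + tn) / n,    # assumption: correct classifications / all
        "miss_rate": (fp + fn) / n,   # assumption: incorrect classifications / all
    }


# Hypothetical predictions and true status for eight testtakers
predicted = [True, True, False, False, True, False, False, False]
actual = [True, False, False, True, True, False, False, False]
summary = classification_summary(predicted, actual)
```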
Term
construct validity |
|
Definition
a judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct |
|
|
Term
construct |
|
Definition
an informed, scientific idea to describe behavior |
|
|
Term
method of contrasted groups |
|
Definition
provide evidence of validity by demonstrating that scores on the test vary in a predictable way as a function of membership in some group |
|
|
Term
convergent evidence |
|
Definition
proof that a test provides scores similar to an older, established, and valid test of the same construct |
|
|
Term
discriminant evidence |
|
Definition
validity coefficient showing little or no relationship between scores on the test being validated and scores on measures with which the test should not, in theory, be correlated |
|
|
Term
factor analysis |
|
Definition
a class of mathematical procedures designed to identify factors or dimensions on which people differ |
|
|
Term
exploratory factor analysis |
|
Definition
estimating, or extracting, factors and deciding how many factors to retain |
|
|
Term
confirmatory factor analysis |
|
Definition
a factor structure is explicitly hypothesized and is tested for its fit with the observed covariance structure of the measured variables |
|
|
Term
factor loading |
|
Definition
the extent to which the factor determines the test scores |
|
|
Term
bias |
|
Definition
inherent factor in a test preventing accurate measurement |
|
|
Term
intercept bias |
|
Definition
when a test systematically under-predicts or over-predicts the performance of a particular group |
|
|
Term
slope bias |
|
Definition
when the slope of one group's regression line differs significantly from another's |
|
|
Term
rating |
|
Definition
numerical or verbal judgment that places a person along a continuum |
|
|
Term
rating scale |
|
Definition
scale of numerical or word descriptors that make up the continuum |
|
|
Term
rating error |
|
Definition
judgment resulting from intentional or unintentional misuse of a rating scale |
|
|
Term
leniency error / generosity error |
|
Definition
error resulting from the tendency of a rater to be generous |
|
|
Term
severity error |
|
Definition
error resulting from overly severe raters |
|
|
Term
central tendency error |
|
Definition
reluctance of a rater to give scores at the positive or negative extremes |
|
|
Term
fairness |
|
Definition
extent to which a test is used in an impartial, just, and equitable way |
|
|
Term
halo effect |
|
Definition
for some raters, some ratees can do no wrong |
|
|