Term
as rxy approaches 1.00, prediction accuracy _____? |
|
Definition
increases |
|
|
Term
as rxy approaches 0.00, prediction accuracy _____? |
|
Definition
decreases (at rxy = 0.00, predictions are no better than chance) |
|
|
Term
what is validity of measurement? |
|
Definition
whether a test measures what it is supposed to measure |
|
|
Term
what is validity for decisions? |
|
Definition
whether a test is useful in making accurate decisions |
|
|
Term
what is content oriented validity? |
|
Definition
a judgment of how adequately a test samples behaviour that is representative of the universe or domain of behaviour |
|
|
Term
how is content validity assessed? |
|
Definition
- Describe the content domain - Determine the areas of the content domain that are measured by each test item - Compare the structure of the test with the structure of the content domain |
|
|
Term
Besides the content of test items, it is also important to consider: |
|
Definition
• How stimuli are presented to subjects • How responses are recorded and evaluated • What is going through the respondent’s mind |
|
|
Term
Is it possible to achieve a high level of reliability with little or no content validity? |
|
Definition
Yes. A test can provide a reliable measure of some domain yet fail to measure the particular domain that is of interest. |
|
|
Term
what are constructs? |
|
Definition
abstract summaries of some regularity in nature |
|
|
Term
what is construct explication? |
|
Definition
The process of providing a detailed description of the relationship between specific behaviours and abstract constructs |
|
|
Term
What is the term for the definition of a construct in terms of concrete behaviours? |
|
Definition
operational definition |
|
|
Term
how is construct validity demonstrated? |
|
Definition
the pattern of relationships between test scores and behaviour measures should match the expected pattern of relationships |
|
|
Term
What are the 5 possible pieces of evidence for construct validity? |
|
Definition
- evidence of changes with age - evidence of pretest/posttest changes - evidence from distinct groups - convergent evidence - discriminant evidence |
|
|
Term
what is "evidence of changes with age" |
|
Definition
evidence for construct validity. If a test purports to measure a construct that is expected to change over time, test scores should show the same progressive changes with age to be considered a valid measure of the construct |
|
|
Term
what is "evidence of pretest/posttest changes" |
|
Definition
evidence for construct validity. Changes in test scores following some experience between a pretest and a posttest (e.g., formal education, therapy, medication, or on-the-job experience) can be evidence of construct validity |
|
|
Term
what is "evidence from distinct groups" |
|
Definition
evidence for construct validity. A demonstration that scores on the test vary in a predictable way as a function of membership in some group (e.g. people hospitalized for depression should score higher on a test of depression) |
|
|
Term
what is "convergent evidence" |
|
Definition
evidence for construct validity. Convergent evidence comes from correlations with tests purporting to measure the same or related constructs |
|
|
Term
what is "discriminant evidence" |
|
Definition
evidence for construct validity. Evidence that the test shows a small or zero correlation with other variables with which it should not theoretically be correlated |
|
|
Term
What is the purpose of item analysis? |
|
Definition
- show why a test is reliable (or not) - suggest ways of improving the measurement characteristics of a test |
|
|
Term
how is item difficulty defined? |
|
Definition
difficulty is defined in terms of the number of people who answer each test item correctly |
|
|
Term
How is item difficulty measured? |
|
Definition
Item-difficulty Index: percentage of examinees who answer the item correctly |
|
|
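A minimal Python sketch (hypothetical data and an illustrative function name, not part of the original cards) of how the item-difficulty index could be computed from scored responses:

# Item-difficulty index: proportion of examinees answering the item correctly.
def item_difficulty(responses):
    # responses: list of 1 (correct) / 0 (incorrect) for a single item
    return sum(responses) / len(responses)

# 60 of 100 examinees answer correctly -> p = 0.60
item1 = [1] * 60 + [0] * 40
print(item_difficulty(item1))  # 0.6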
Term
If 60 out of 100 examinees get item #1 correct, what is the item-difficulty level for item #1? |
|
Definition
p = 60/100 = .60 |
|
|
Term
If 40 out of 100 examinees get item #2 correct, what is the item-difficulty level for item #2? |
|
Definition
p = 40/100 = .40 |
|
|
Term
Which test item is more difficult? 1) p = 0.4 2) p = 0.6 |
|
Definition
item 1 (p = 0.4), because fewer examinees answered it correctly |
|
|
Term
what is the item-endorsement index? |
|
Definition
For items with no correct answer (e.g., attitude or personality items), the p value refers to the proportion of people who endorsed the item (e.g., checked “true”) rather than the proportion who got the item “correct” |
|
|
Term
when would the item-endorsement be measured as the item mean |
|
Definition
For items with more answer options than right/wrong (e.g., a 1-5 scale) |
|
|
Term
what are the Implications of Measuring Difficulty as p value? |
|
Definition
- Rather than defining difficulty in terms of some intrinsic characteristic of the item, the p value is a behavioural measure - Difficulty is a characteristic of both the item and the population taking the test - It provides a common measure of the difficulty of test items that measure completely different domains |
|
|
Term
What does it mean when the p value = 0 (item difficulty)? |
|
Definition
Nobody chose the correct answer, so there are no individual differences in the “score” on that item. |
|
|
Term
will dropping all the test items with p values of 0 or 1 from a test affect the rank order or the size of the differences between different people's scores? |
|
Definition
No. Items with p = 1 add a constant to every score and items with p = 0 add nothing, so dropping them changes neither the rank order nor the size of the differences between people's scores. |
|
|
Term
Do extreme p values directly restrict the variability of test scores? |
|
Definition
Yes. The variance of a 0/1 item is p(1 - p), so items with p values near 0 or 1 contribute little or no variance to test scores. |
|
|
Term
when is the variability of test scores maximized? |
|
Definition
when p values are near .5 (about half of the examinees answer each item correctly) |
|
|
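A small illustrative Python sketch (values chosen for illustration only, not from the cards) of why p values near .5 maximize item variance for 0/1-scored items:

# For an item scored 0/1, the item variance is p * (1 - p):
# it is zero at p = 0 or p = 1 and largest at p = 0.5.
for p in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p, p * (1 - p))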
Term
If the objective is to screen out the very top applicants what kind of items should comprise the test? |
|
Definition
very difficult items (items with low p values) |
|
|
Term
test that consists of items with p values all near .2 could be used to select people _________? |
|
Definition
in roughly the top 20% on the construct being measured |
|
|
Term
If the test and a single item both measure the same thing, one would expect people who do well on the test to answer that item _____ and those who do poorly to answer that item ____? |
|
Definition
correctly and incorrectly |
|
|
Term
A good item discriminates between |
|
Definition
those who do well on the test and those who do poorly |
|
|
Term
what is The item-discrimination index (d)? |
|
Definition
a measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly |
|
|
Term
The higher the d (item-discrimination index) value, the _____ the number of high scorers answering the item correctly |
|
Definition
greater (relative to the number of low scorers who answer the item correctly) |
|
|
Term
what is the formula for item-discrimination index |
|
Definition
d = (U / Nu) - (L / Nl)
U = number of people in the upper group who passed the item
Nu = number of people in the upper group
L = number of people in the lower group who passed the item
Nl = number of people in the lower group |
|
|
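A short Python sketch of the d formula above, using hypothetical group counts (the function name is illustrative):

# d = (U / Nu) - (L / Nl): proportion passing in the upper group minus
# proportion passing in the lower group.
def discrimination_index(u_passed, n_upper, l_passed, n_lower):
    return u_passed / n_upper - l_passed / n_lower

# e.g., 20 of 25 high scorers and 10 of 25 low scorers pass the item:
# d = 0.80 - 0.40 = 0.40
print(round(discrimination_index(20, 25, 10, 25), 2))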
Term
what is Item-Total Correlation? |
|
Definition
Represents the simple correlation between the score on an item and the total test score |
|
|
Term
what does A positive item-total correlation indicate? |
|
Definition
that the item measures the same thing that is being measured by the test |
|
|
Term
why are item-total correlations spuriously high? |
|
Definition
the total contains the item being correlated |
|
|
Term
how do you correct for spuriously high item-total correlations? |
|
Definition
the corrected item-total correlation is computed by correlating the item with a total that excludes the item being correlated |
|
|
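A rough Python sketch (made-up 0/1 scores, illustrative function name) of the corrected item-total correlation, where the item is removed from the total before correlating:

import statistics

# Pearson correlation between item scores and totals computed without the item.
def corrected_item_total(scores, item):
    item_scores = [row[item] for row in scores]
    rest_totals = [sum(row) - row[item] for row in scores]
    return statistics.correlation(item_scores, rest_totals)  # Python 3.10+

scores = [            # rows = examinees, columns = items (1 = correct)
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(corrected_item_total(scores, 0))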
Term
what helps us to understand why some items fail to discriminate between those who do well on the test and those who do poorly |
|
Definition
examining the distractors: comparing how many people choose each incorrect option with the number expected if all distractors were equally popular |
|
|
Term
how do you calculate the expected number of people to choose each distractor |
|
Definition
# answering the item incorrectly / number of distractors |
|
|
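A tiny Python sketch of the expected-count calculation above, with made-up numbers:

# If 100 examinees take a 4-option item and 60 answer correctly, the 40 errors
# would be spread over 3 distractors if all distractors were equally plausible.
n_incorrect = 40
n_distractors = 3
print(n_incorrect / n_distractors)  # expected per distractor, about 13.3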
Term
When a distractor is extremely unpopular, it _____ the difficulty of the item |
|
Definition
lowers (an implausible distractor effectively reduces the number of choices, making the item easier) |
|
|
Term
what is Item Reliability Index |
|
Definition
A measure of an item’s contribution to the internal consistency of the test |
|
|
Term
What are the 4 criteria for selecting test items? |
|
Definition
1) desired length of test 2) desired content of test 3) item-total correlations (should be above 0.2) 4) Item mean (difficulty) should be between .2 and .8 |
|
|
Term
Item mean (difficulty) should be between |
|
Definition
.2 and .8 |
|
|
Term
when selecting test items, the reliability should be greater than ____? |
|
Definition
|
|
Term
what is The simplest method of determining whether a test can be used validly in making decisions? |
|
Definition
correlate test scores with measures of success or outcomes |
|
|
Term
what is criterion validity? |
|
Definition
does the test correlate with a criterion measure of the outcome of interest? |
|
|
Term
What is concurrent validity? |
|
Definition
assess the predictor and the criterion at the same time |
|
|
Term
What is predictive validity? |
|
Definition
assess the predictor and then assess the criterion at a later time |
|
|
Term
why is predictive validity sometimes impractical? |
|
Definition
The population in the validity study should be similar to the general population of applicants, but the criterion can usually be measured only for those who were selected (the high scorers), so the range of scores is restricted |
|
|
Term
what is a disadvantage of concurrent validity? |
|
Definition
The sample (e.g., current employees) may be systematically different from the population in general |
|
|
Term
What are the 3 things to look at to know if you are making good decisions about test utility? (decision theory) |
|
Definition
- validity - base rate - selection ratio |
|
|
Term
What should you do if you have a restriction of range? |
|
Definition
correction for range restriction |
|
|
Term
how do you know if a predictor is invalid? |
|
Definition
when there is no difference in success on the criterion between people with high predictor scores and people with low predictor scores |
|
|
Term
what is the selection ratio? |
|
Definition
a measure of how picky you can be: the lower the ratio, the pickier you can be |
|
|
Term
how is selection ratio calculated? |
|
Definition
# of openings / # of applicants |
|
|
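A minimal Python sketch of the selection-ratio calculation (hypothetical numbers of openings and applicants):

# Selection ratio = number of openings / number of applicants.
def selection_ratio(openings, applicants):
    return openings / applicants

print(selection_ratio(20, 20))  # 1.0  -> everyone must be hired, no selectivity
print(selection_ratio(5, 20))   # 0.25 -> only 1 in 4 hired, much more selective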
Term
the smaller the Selection ratio, the ___ the potential utility of a selection battery |
|
Definition
greater |
|
|
Term
If 20 openings and 20 applicants, then Selection Ratio = |
|
Definition
20/20 = 1.00 (everyone must be accepted, so the test has no potential utility) |
|
|
Term
If 5 openings for 20 applicants, then Selection Ratio = |
|
Definition
5/20 = .25 |
|
|
Term
what is the base rate? |
|
Definition
percentage of the population that can currently be thought of as successes |
|
|
Term
for utility analysis- how can you estimate improvement in the work force from a new selection battery |
|
Definition
use a Taylor-Russell table |
|
|
Term
what information do you need to use a use a Taylor-Russell table? |
|
Definition
- validity - base rate - selection ratio |
|
|
Term
what is factor analysis? |
|
Definition
a statistical method of identifying the basic underlying variables that account for the correlations between test scores |
|
|
Term
what are factors (in factor analysis) |
|
Definition
an attribute that causes the variables to correlate with each other, or explains the relations among the defining variables |
|
|
Term
What is the item characteristic curve? |
|
Definition
a graphic representation of the probability of choosing the correct answer to an item as a function of the level of the attribute being measured by the test |
|
|