Term
| as rxy approaches 1.00, prediction accuracy _____? |
|
Definition
| increases (approaches perfectly accurate prediction) |
|
Term
| as rxy approaches 0.00, prediction accuracy _____? |
|
Definition
| decreases (at rxy = 0, prediction is no better than chance) |
|
Term
| what is validity of measurement? |
|
Definition
| whether a test measures what it is supposed to measure |
|
|
Term
| what is validity for decisions? |
|
Definition
| whether a test is useful in making accurate decisions |
|
|
Term
| what is content oriented validity? |
|
Definition
| a judgment of how adequately a test samples behaviour that is representative of the universe or domain of behaviour |
|
|
Term
| how is content validity assessed? |
|
Definition
– Describe the content domain – Determine the areas of the content domain that are measured by each test item – Compare the structure of the test with the structure of the content domain |
|
|
Term
| Besides the content of test items, it is also important to consider: |
|
Definition
• How stimuli are presented to subjects • How responses are recorded and evaluated • What is going through the respondent’s mind |
|
|
Term
| Is it possible to achieve a high level of reliability with little or no content validity? |
|
Definition
Yes, if a test provides a reliable measure of some domain but fails to measure the particular domain that is of interest. |
|
|
Term
| what are constructs? |
Definition
abstract summaries of some regularity in nature |
|
|
Term
| what is construct explication? |
|
Definition
| The process of providing a detailed description of the relationship between specific behaviours and abstract constructs |
|
|
Term
| What is the term for the definition of the construct in terms of concrete behaviours |
|
Definition
| an operational definition |
|
Term
| how is construct validity demonstrated? |
|
Definition
the pattern of relationships between test scores and behaviour measures should match the expected pattern of relationships |
|
|
Term
| What are the 5 possible pieces of evidence for construct validity? |
|
Definition
-evidence of changes with age -evidence of pretest/posttest changes -evidence from distinct groups -convergent evidence -discriminant evidence |
|
|
Term
| what is "evidence of changes with age" |
|
Definition
evidence for construct validity. If a test score purports to be a measure of a construct that could be expected to change over time, the test score too should show the same progressive changes with age to be considered a valid measure of the construct |
|
|
Term
| what is "evidence of pretest/posttest changes" |
|
Definition
evidence for construct validity. Test scores that change as a result of some experience between a pretest and a posttest (e.g., formal education, therapy, medication, or on-the-job experience) can be evidence of construct validity |
|
|
Term
| what is "evidence from distinct groups" |
|
Definition
evidence for construct validity. A demonstration that scores on the test vary in a predictable way as a function of membership in some group (e.g. people hospitalized for depression should score higher on a test of depression) |
|
|
Term
| what is "convergent evidence" |
|
Definition
evidence for construct validity. Convergent evidence comes from correlations with tests purporting to measure the same or related constructs |
|
|
Term
| what is "discriminant evidence" |
|
Definition
evidence for construct validity. Evidence that the test shows a small or zero correlation with other variables with which it should not theoretically be correlated |
|
|
Term
| What is the purpose of item analysis? |
|
Definition
-show why a test is reliable (or not) -suggest ways of improving the measurement characteristics of a test |
|
|
Term
| how is item difficulty defined? |
Definition
| difficulty is defined in terms of the number of people who answer each test item correctly |
|
|
Term
| How is item difficulty measured? |
|
Definition
Item-difficulty Index: percentage of examinees who answer the item correctly |
|
|
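Below is a minimal Python sketch of how the item-difficulty index could be computed from a scored response matrix; the array contents and variable names are invented purely for illustration.

```python
import numpy as np

# Rows = examinees, columns = items; 1 = correct, 0 = incorrect.
# These responses are invented for illustration only.
scores = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
])

# Item-difficulty index (p): proportion of examinees answering each item correctly.
p = scores.mean(axis=0)
print(p)  # e.g., item 1: 4 of 5 examinees correct -> p = 0.8
```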
Term
If 60 out of 100 examinees get item #1 correct, what is the item-difficulty level for item #1? |
|
Definition
| p = 60/100 = 0.60 |
|
Term
If 40 out of 100 examinees get item #2 correct, what is the item-difficulty level for item #2? |
|
Definition
| p = 40/100 = 0.40 |
|
Term
Which test item is more difficult? 1) p = 0.4 2) p = 0.6 |
|
Definition
| item 1 (p = 0.4); the lower the p value, the fewer examinees answered correctly, so the more difficult the item |
|
Term
| what is the item-endorsement index? |
|
Definition
| for items with no correct answer, the value refers to the proportion of people who endorsed the item (e.g., checked “true”) rather than the proportion who got the item “correct” |
|
|
Term
| when would item endorsement be measured as the item mean? |
|
Definition
For items with more answer options than right/wrong (e.g., a 1-5 scale) |
|
|
Term
what are the implications of measuring difficulty as a p value? |
|
Definition
-Rather than defining difficulty in terms of some intrinsic characteristic of the item, the p value is a behavioural measure -Difficulty is a characteristic of both the item and the population taking the test -It provides a common measure of the difficulty of test items that measure completely different domains |
|
|
Term
| What does it mean when the p value = 0 (item difficulty)? |
|
Definition
| nobody chose the correct answer, so there are no individual differences in the “score” on that item |
|
|
Term
| will dropping all the test items with p values of 0 or 1 from a test affect the rank order or the size of the differences between different people's scores? |
|
Definition
| No. Items with p values of 0 or 1 add the same constant to everyone's score, so dropping them leaves both the rank order and the size of the differences between people unchanged |
|
Term
| Do extreme p values directly restrict the variability of test scores? |
|
Definition
| Yes. An item's variance is greatest at p = .5 and falls to zero as p approaches 0 or 1, so extreme p values limit how much test scores can vary |
|
Term
| when is the variability of test scores maximized? |
|
Definition
| when item p values are near .5 (about half of examinees answer each item correctly) |
|
Term
| If the objective is to screen out the very top applicants what kind of items should comprise the test? |
|
Definition
| very difficult items (low p values), so that only the strongest applicants answer them correctly |
|
Term
| A test that consists of items with p values all near .2 could be used to select people _________? |
|
Definition
| in roughly the top 20% on the construct being measured |
|
|
Term
| If the test and a single item both measure the same thing, one would expect people who do well on the test to answer that item _____ and those who do poorly to answer that item ____? |
|
Definition
| correctly and incorrectly, respectively |
|
|
Term
| A good item discriminates between |
|
Definition
| those who do well on the test and those who do poorly |
|
|
Term
| what is The item-discrimination index (d)? |
|
Definition
a measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly |
|
|
Term
The higher the d (item-discrimination index) value, the _____ the number of high scorers answering the item correctly |
|
Definition
| greater (relative to the number of low scorers answering it correctly) |
|
Term
| what is the formula for item-discrimination index |
|
Definition
d = (U / N_U) - (L / N_L)
U = number of people in the upper group who passed the item; N_U = number of people in the upper group; L = number of people in the lower group who passed the item; N_L = number of people in the lower group (see the sketch below) |
|
|
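A small Python sketch of the formula above, applied to invented data. The choice of the top and bottom 27% of total scores for the upper and lower groups is a common convention but is an assumption here, not something the card specifies.

```python
import numpy as np

def item_discrimination(item_scores, total_scores, fraction=0.27):
    """d = U/N_U - L/N_L, comparing top and bottom scoring groups.

    item_scores: 0/1 correctness on one item for every examinee
    total_scores: total test score for the same examinees
    fraction: proportion of examinees placed in each extreme group
              (0.27 is a common convention, assumed here)
    """
    item_scores = np.asarray(item_scores)
    order = np.argsort(total_scores)            # indices from lowest to highest total
    n = max(1, int(round(fraction * len(order))))
    lower, upper = order[:n], order[-n:]
    return item_scores[upper].mean() - item_scores[lower].mean()

# Invented example: high scorers pass the item more often than low scorers.
item = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
total = [48, 45, 44, 40, 39, 30, 28, 27, 26, 20]
print(round(item_discrimination(item, total), 2))  # ~0.67
```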
Term
| what is Item-Total Correlation? |
|
Definition
Represents the simple correlation between the score on an item and the total test score |
|
|
Term
| what does A positive item-total correlation indicate? |
|
Definition
| that the item measures the same thing that is being measured by the test |
|
|
Term
why are item-total correlations spuriously high? |
|
Definition
the total contains the item being correlated |
|
|
Term
| how do you correct for spuriously high item-total correlations? |
|
Definition
| the corrected item-total correlation computes the total excluding the item being correlated (see the sketch below) |
|
|
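A minimal Python sketch of the corrected item-total correlation described above; the 0/1 response matrix is invented for illustration.

```python
import numpy as np

def corrected_item_total(scores):
    """Correlate each item with the total of the REMAINING items,
    removing the spurious inflation from counting the item in its own total."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    corrs = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]              # total score excluding item j
        corrs.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(corrs)

# Invented 0/1 response matrix (rows = examinees, columns = items).
scores = [[1, 1, 0, 1],
          [1, 0, 0, 1],
          [0, 1, 1, 0],
          [1, 1, 1, 1],
          [0, 0, 0, 0],
          [1, 1, 0, 1]]
print(corrected_item_total(scores))
```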
Term
| what helps us to understand why some items fail to discriminate between those who do well on the test and those who do poorly |
|
Definition
| examining the distractors (the incorrect response options) |
|
Term
| how do you calculate the expected number of people to choose each distractor |
|
Definition
| # answering the item incorrectly / number of distractors |
|
|
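A worked example of this rule with invented numbers: a 4-option multiple-choice item (so 3 distractors) and 40 incorrect answers out of 100 examinees.

```python
# Invented example: a 4-option multiple-choice item (1 correct answer,
# 3 distractors) taken by 100 examinees, 40 of whom answer incorrectly.
num_incorrect = 40
num_distractors = 3

# Expected count per distractor if incorrect answers spread evenly across them.
expected_per_distractor = num_incorrect / num_distractors
print(round(expected_per_distractor, 1))  # 13.3 -- a distractor chosen far less
                                          # often than this is doing little work
```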
Term
| When a distractor is extremely unpopular, it _____ the difficulty of the item |
|
Definition
| lowers (the item becomes easier, because examinees are effectively choosing among fewer plausible options) |
|
Term
| what is Item Reliability Index |
|
Definition
| A measure of an item’s contribution to the internal consistency of the test |
|
|
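The card does not give a formula. One common formulation multiplies the item standard deviation by the item-total correlation; the sketch below assumes that formulation and uses invented data, so treat it as illustrative rather than definitive.

```python
import numpy as np

def item_reliability_index(scores):
    """Assumed formulation: item standard deviation times item-total correlation.
    (The card above does not specify a formula.)"""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    s = scores.std(axis=0)                       # item standard deviations
    r = np.array([np.corrcoef(scores[:, j], total)[0, 1]
                  for j in range(scores.shape[1])])
    return s * r

# Invented 0/1 response matrix (rows = examinees, columns = items).
scores = [[1, 1, 0, 1],
          [1, 0, 0, 1],
          [0, 1, 1, 0],
          [1, 1, 1, 1],
          [0, 0, 0, 0]]
print(item_reliability_index(scores))
```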
Term
| What are the 4 criteria for selecting test items? |
|
Definition
1) desired length of the test 2) desired content of the test 3) item-total correlations (should be above 0.2) 4) item mean (difficulty) should be between .2 and .8 (criteria 3 and 4 are applied in the sketch below) |
|
|
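A small Python sketch applying criteria 3 and 4 from the card above as a filter. The per-item statistics are invented; the thresholds (item-total correlation above 0.2, item mean between .2 and .8) come from the card.

```python
# Invented per-item statistics for four candidate items.
items = [
    {"id": 1, "item_total_r": 0.35, "mean": 0.55},
    {"id": 2, "item_total_r": 0.10, "mean": 0.50},   # fails criterion 3
    {"id": 3, "item_total_r": 0.28, "mean": 0.95},   # fails criterion 4 (too easy)
    {"id": 4, "item_total_r": 0.22, "mean": 0.30},
]

# Keep items meeting both the item-total correlation and difficulty criteria.
keep = [it["id"] for it in items
        if it["item_total_r"] > 0.2 and 0.2 <= it["mean"] <= 0.8]
print(keep)  # [1, 4]
```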
Term
| Item mean (difficulty) should be between |
|
Definition
| .2 and .8 |
|
Term
| when selecting test items, the reliability should be greater than ____? |
|
Definition
|
|
Term
| what is The simplest method of determining whether a test can be used validly in making decisions? |
|
Definition
| correlate test scores with measures of success or outcomes |
|
|
Term
| what is criterion validity? |
|
Definition
| whether the test correlates with a criterion measure (an outcome of interest) |
|
|
Term
| What is concurrent validity? |
|
Definition
| assess a predictor and a criterion at the same time |
|
|
Term
| What is predictive validity? |
|
Definition
| assess a predictor, then assess a criterion at a later time |
|
|
Term
| why is predictive validity sometimes impractical? |
|
Definition
| the population in the validity study should be similar to the general population of applicants, not just high scorers; e.g., the range of scores will be restricted if you only follow up on high scorers |
|
|
Term
| what is a disadvantage of concurrent validity? |
|
Definition
| The sample (e.g., current employees) may be systematically different from the population in general |
|
|
Term
| What are the 3 things to look at to know if you are making good decisions about test utility? (decision theory) |
|
Definition
-validity -base rate -selection ratio |
|
|
Term
| What should you do if you have a restriction of range? |
|
Definition
| correction for range restriction |
|
|
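The card names the correction but not a formula. One standard form (often presented as Thorndike's Case 2) is sketched below under the assumption that the predictor's restricted and unrestricted standard deviations are known; the numbers are invented.

```python
import math

def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the validity coefficient in the unrestricted applicant pool
    from the correlation observed in a restricted (e.g., hired-only) sample.
    This is one standard correction formula, assumed here for illustration."""
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1 - r**2 + (r**2) * (k**2))

# Invented numbers: observed r = .20 in a restricted sample whose predictor SD
# is half of the applicant pool's SD.
print(round(correct_for_range_restriction(0.20, 10.0, 5.0), 3))  # ~0.378
```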
Term
| how do you know if a predictor is invalid? |
|
Definition
| there is no difference in outcomes between high scorers and low scorers |
|
|
Term
| what is the selection ratio? |
Definition
| a measure of how picky you can be; the lower the ratio, the pickier you can be |
|
|
Term
| how is selection ratio calculated? |
|
Definition
| # of openings / # of applicants |
|
|
Term
| the smaller the Selection ratio, the ___ the potential utility of a selection battery |
|
Definition
| greater |
|
Term
If 20 openings and 20 applicants, then Selection Ratio = |
|
Definition
| 20/20 = 1.0 (no selectivity; every applicant must be accepted) |
|
Term
If 5 openings for 20 applicants, then Selection Ratio = |
|
Definition
| 5/20 = 0.25 |
|
Term
| what is the base rate? |
Definition
| percentage of the population that can currently be thought of as successes |
|
|
Term
| for utility analysis- how can you estimate improvement in the work force from a new selection battery |
|
Definition
| use a Taylor-Russell table |
|
|
Term
| what information do you need to use a use a Taylor-Russell table? |
|
Definition
-validity -base rate -selection ratio |
|
|
Term
| what is factor analysis? |
Definition
| a statistical method of identifying the basic underlying variables that account for the correlations between test scores |
|
|
Term
| what are factors (in factor analysis) |
|
Definition
| an attribute that causes the variables to correlate with each other, or explains the relations among the defining variables |
|
|
Term
| What is the item characteristic curve? |
|
Definition
| a graphic representation of the probability of choosing the correct answer to an item as a function of the level of the attribute being measured by the test |
|
|
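A minimal Python sketch of what such a curve can look like, assuming a two-parameter logistic (2PL) item response model; the logistic form and the parameter values are assumptions for illustration, since the card describes the curve only in general terms.

```python
import numpy as np

def icc_2pl(theta, a=1.2, b=0.0):
    """Probability of a correct response as a function of the latent attribute
    (theta), under an assumed two-parameter logistic model:
    a = discrimination, b = difficulty. Parameter values are illustrative."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
for t, p in zip(theta, icc_2pl(theta)):
    print(f"theta = {t:+.1f}  P(correct) = {p:.2f}")
# P(correct) rises monotonically with the attribute level, tracing out the
# item characteristic curve.
```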