Term
| A sample of examinees who are representative of the population for whom the test is intended is called .... group. |
|
Definition
| norm |
|
|
Term
| The essential objective of .... is to determine the distribution of raw scores in the norm group so that the test developer can publish derived scores known as norms. |
|
Definition
| norming |
|
|
Term
| For criterion-referenced tests, norms are... |
|
Definition
| uncommon and not essential |
|
|
Term
| In a frequency distribution, the sums of the frequencies for all intervals will.... the total number of scores in the sample. |
|
Definition
| equal |
|
|
Term
| Which kind of distribution would have the highest number of persons in the superior range? |
|
Definition
| a negatively skewed distribution |
|
|
Term
| If test scores are piled up at the low end of the scale, the distribution is said to be... |
|
Definition
| positively skewed |
|
|
Term
| Suppose that a subject scored at the 94th percentile on a psychological test. What does that mean? |
|
Definition
| The subject's score exceeded 94% of the standardization sample |
|
|
Term
Suppose that a college freshman earned 125 raw points on a vocab test where the normative sample averaged 100 points (SD of 15 points).
Suppose he earned 110 raw points on a spatial thinking test, where the normative sample averaged 90 points (SD of 20 points).
In which skill area does he show greater aptitude? |
|
Definition
vocab
vocab = (125-100)/15 or +1.67
spatial = (110-90)/20 or +1.00 |
|
|
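The z-score comparison on this card can be checked with a short script (a minimal sketch; the `z_score` helper is my own illustration, and only the means, SDs, and raw scores come from the card):

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a z score: z = (raw - mean) / SD."""
    return (raw - mean) / sd

# Vocab: normative mean 100, SD 15; spatial: mean 90, SD 20 (from the card)
vocab = z_score(125, 100, 15)
spatial = z_score(110, 90, 20)
print(round(vocab, 2), round(spatial, 2))  # 1.67 1.0
```

Because the two tests use different scales, only the standardized scores are comparable: +1.67 beats +1.00, so vocab is the stronger area.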
Term
Which is NOT true of a norm group?
A. homogeneous
B. representative of popu.
C. large, hundreds of subjects
D. tested according to standard procedures |
|
Definition
| A. homogeneous |
|
|
Term
| When test scores are expressed as a percentage, with the passing level predetermined, the examiner is probably using... |
|
Definition
| a criterion-referenced test |
|
|
Term
| What are the factors that comprise the "classical theory of measurement"? |
|
Definition
1. factors that contribute to consistency
-stable attributes we are trying to measure
-X = T + e
2. factors that contribute to inconsistency
-characteristics of individual, test, or situation that do not deal with attribute measured but affect test scores (error)
- e = X - T |
|
|
Term
| Why is "true score never known"? |
|
Definition
| because errors in measurement create a discrepancy between true and obtained scores; an obtained score always contains error |
|
|
Term
| What are the main sources of measurement error? |
|
Definition
1. item selection: wording, quality, bias
2. test administration: environment, how test taker feels
3. test scoring: subjectivity (minimized on multiple-choice tests)
4. systematic measurement error: something observed that's not what we're looking for |
|
|
Term
| How do systematic and unsystematic error differ in measurement? |
|
Definition
Systematic error: the test consistently measures something other than the trait for which it was intended
Unsystematic error: effects are unpredictable and inconsistent |
|
|
Term
| What does reliability mean in testing? What is the relationship of reliability and measurement error? |
|
Definition
| Reliability is the consistency or replicability of results. The greater the reliability, the less measurement error there is. |
|
|
Term
What is temporal reliability and its difference from internal consistency reliability?
How do alternate forms reliability and split half reliability differ? |
|
Definition
temporal: whether data remains consistent over time, i.e. test-retest & alternate forms
internal consistency: whether the items measure the same thing within a single administration, i.e. split-half & Spearman-Brown formula
alternate forms (part of temporal): developers make two forms of the same test, give it, then correlate the results
split-half (part of internal consistency): results from both halves of a test are correlated |
|
|
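The Spearman-Brown formula named on this card projects full-test reliability from the split-half correlation; a minimal sketch (the function name and the .70 example value are hypothetical):

```python
def spearman_brown(r_half, factor=2):
    """Spearman-Brown prophecy: reliability of a test lengthened by
    `factor` from the correlation of its parts (factor=2 for split-half)."""
    return (factor * r_half) / (1 + (factor - 1) * r_half)

# If the two halves correlate at .70, the full-length estimate is ~.82
print(round(spearman_brown(0.70), 2))  # 0.82
```

This is why the split-half correlation alone understates reliability: it describes a test only half as long as the one actually given.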
Term
Which type of reliability test would be most appropriate for...
1. tests designed to be given more than once to the same people
2. tests that require factorial purity
3. the same tests scored by different scorers
4. tests that have items ordered by difficulty level |
|
Definition
1. test-retest
2. coefficient alpha
3. interscorer
4. split half methods |
|
|
Term
| How does a psychologist use the standard error of measurement to determine how close the obtained score is to the true score? |
|
Definition
Reliability is inversely related to the SEM.
SEM = SD × √(1 − r), where r is the reliability coefficient
X = T + e |
|
|
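The SEM formula on this card, SEM = SD × √(1 − r), can be sketched numerically (the SD of 15 and reliability of .91 are hypothetical IQ-style values, not from the card):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

# Higher reliability shrinks the SEM toward zero
print(round(sem(15, 0.91), 2))  # 4.5
```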
Term
| What does a confidence interval tell us about a true score? |
|
Definition
| A range around the obtained score that, with a stated level of confidence, is expected to contain the true score |
|
|
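Concretely, such an interval is built as obtained score ± z × SEM; a hedged sketch (the 1.96 multiplier gives roughly a 95% interval under normal-theory assumptions, and the example scores are hypothetical):

```python
import math

def true_score_interval(obtained, sd, reliability, z=1.96):
    """Interval around an obtained score likely to contain the true score:
    obtained +/- z * SEM, with SEM = SD * sqrt(1 - r)."""
    sem = sd * math.sqrt(1 - reliability)
    return (obtained - z * sem, obtained + z * sem)

low, high = true_score_interval(110, 15, 0.91)
print(round(low, 1), round(high, 1))  # 101.2 118.8
```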
Term
| What is item response theory (IRT)? |
|
Definition
aka latent trait theory; a framework for analyzing items and scales, developing homogeneous psychological measures, measuring individuals on psych. constructs, and administering psych. tests on computers
the item response function (IRF), aka item characteristic curve (ICC), describes the relationship between the amount of a latent trait an indi. has and the probability that he or she will give a designated response to a test item designed to measure that construct |
|
|
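The IRF/ICC described on this card is often modeled as a logistic curve; a minimal sketch of the common two-parameter logistic (2PL) form (one common model, not the only one, and the parameter values are hypothetical):

```python
import math

def irf_2pl(theta, a, b):
    """Two-parameter logistic IRF: probability of the keyed response
    given trait level theta, item discrimination a, and difficulty b."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# When the examinee's trait level equals the item's difficulty,
# the probability of the keyed response is .50:
print(irf_2pl(theta=0.0, a=1.2, b=0.0))  # 0.5
```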
Term
| What are the basic assumptions of classical measurement theory? |
|
Definition
1. meas. errors are random
2. the mean error of meas. is 0
3. true scores and error scores are uncorrelated
4. errors on different tests are uncorrelated
*the variance of obtained scores is simply the variance of true scores + the variance of errors of meas. |
|
|
Term
Calculations for
1. SD
2. T
3. z
4. Standard error of difference
5. Standard error of measurement scores |
|
Definition
1. SD = √[ Σ(x − X̄)² / (N − 1) ], where x is each indi. score, X̄ is the mean, and N is the total # of scores
2. T = 10 × (X − M)/SD + 50, where X is the indi. score and M is the mean (equivalently, T = 10z + 50)
3. z (standard score) = (x − μ)/σ, where x is the raw score, μ is the popu. mean, and σ is the popu. SD
4. Standard error of difference = √[ (SEM₁)² + (SEM₂)² ]
5. Standard error of measurement = SD × √(1 − r), where r is the reliability coefficient |
|
|
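The five calculations on this card can be written out in one place; a hedged sketch with made-up example numbers (the function names are my own):

```python
import math

def sample_sd(scores):
    """1. SD = sqrt( sum((x - mean)^2) / (N - 1) )"""
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / (len(scores) - 1))

def t_score(x, mean, sd):
    """2. T = 10 * (X - M) / SD + 50, i.e. T = 10z + 50"""
    return 10 * (x - mean) / sd + 50

def z_score(x, mu, sigma):
    """3. z = (x - mu) / sigma"""
    return (x - mu) / sigma

def se_difference(sem1, sem2):
    """4. SEdiff = sqrt(SEM1^2 + SEM2^2)"""
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

def sem(sd, r):
    """5. SEM = SD * sqrt(1 - r)"""
    return sd * math.sqrt(1 - r)

print(round(sample_sd([1, 2, 3, 4, 5]), 2))  # 1.58
print(z_score(115, 100, 15))                 # 1.0
print(t_score(115, 100, 15))                 # 60.0
print(se_difference(3.0, 4.0))               # 5.0
```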