Term
|
Definition
| used to describe qualitative variables that can be categorized based on one or more distinguishing characteristics. variables are discrete; categories are mutually exclusive and exhaustive. e.g., "single, married, widowed, divorced, separated" |
|
|
Term
|
Definition
| Classification of discrete variables; ranking "1st, 2nd, 3rd". intervals can be unequal. numbers used for rankings do not reflect anything quantitative about the variable |
|
|
Term
|
Definition
| ordered categories that form a series of equal intervals across the whole range of the scale; encompass the properties of both magnitude and equal intervals. no absolute zero point (zero does not mean absence of the variable being measured), such as temperature or intelligence. the meaning of an interval can change along the scale even when the math stays the same. |
|
|
Term
|
Definition
| in addition to all qualities of nominal, ordinal, and interval scales, a ratio scale has a true or absolute zero point; permits all types of meaningful math calculations. e.g., age, height, weight, scores on a 100-point test |
|
|
Term
|
Definition
| takes a disorganized set of scores and places them in order in a table or graph, showing how many people obtained each of the scores |
|
|
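A frequency distribution like the one defined above can be sketched in a few lines of Python; the scores here are hypothetical:

```python
from collections import Counter

# hypothetical quiz scores for a small class
scores = [7, 9, 8, 7, 10, 8, 7, 9, 8, 8]

# tally how many people obtained each score, ordered from lowest to highest
freq = dict(sorted(Counter(scores).items()))
print(freq)  # {7: 3, 8: 4, 9: 2, 10: 1}
```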
Term
|
Definition
| group or series of scores within certain range to make data more manageable |
|
|
Term
|
Definition
| bell-shaped or normal curve |
|
|
Term
|
Definition
| majority of scores fall near the lower end of the scale; "tail" extends out to the right |
|
|
Term
|
Definition
| majority of scores fall near the higher end of the scale; tail extends to the left |
|
|
Term
|
Definition
| reflects the peakedness or flatness of a distribution. zero value = mesokurtic, similar in height to the normal distribution. positive values = leptokurtic, more peaked. negative values = platykurtic, flatter than normal (scores spread out rather evenly from lowest to highest) |
|
|
Term
|
Definition
|
|
Term
|
Definition
| middle score. not sensitive to outliers |
|
|
Term
|
Definition
| score that appears most frequently; commonly used with variables that are measured at the nominal level, also quick and easy measurement for other scale types |
|
|
Term
|
Definition
| quick measure of spread of scores; subtract lowest score from the highest score |
|
|
Term
|
Definition
| average amount of variability in a group of scores |
|
|
Term
|
Definition
| most frequently used measure of variability. average distance of test scores from the mean. important stat for interpreting the relative position of an individual within a distribution of test scores |
|
|
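As a rough sketch (with made-up scores), the standard deviation defined above can be computed with Python's statistics module; pstdev treats the scores as the whole population:

```python
import statistics

# hypothetical test scores
scores = [85, 90, 75, 80, 95, 70, 88, 92, 78, 82]

mean = statistics.mean(scores)   # average score
sd = statistics.pstdev(scores)   # average (root-mean-square) distance from the mean

print(mean, round(sd, 2))  # 83.5 7.54
```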
Term
|
Definition
| bilaterally symmetrical; mean, median, and mode equal to one another; tails are asymptotic; 68% of scores fall between -1 and +1 SD from the mean; 95% between -2 and +2 SD; 99.7% between -3 and +3 SD |
|
|
Term
|
Definition
| Direction: positive or negative. Strength: magnitude of relationship. Between -1.00 and +1.00. |
|
|
Term
| Criterion-referenced scores |
|
Definition
| emphasizes the use of some criterion or standard of performance to interpret an examinee's test results; most tests and quizzes written by school teachers. determined in absolute terms, such as percentages, scale scores, and performance categories |
|
|
Term
|
Definition
| compare an individual's test scores to the test scores of a group of people |
|
|
Term
|
Definition
| the large group of individuals who took the test and on whom the test was standardized |
|
|
Term
|
Definition
| represents the percentage of a distribution of scores that falls (equal to or) below a particular test score |
|
|
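Counting the scores at or below a given score gives the percentile rank; a small sketch with invented data:

```python
def percentile_rank(scores, score):
    """Percentage of the distribution falling at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# hypothetical distribution of ten scores
scores = [60, 65, 70, 75, 80, 85, 90, 95, 100, 100]
print(percentile_rank(scores, 85))  # 60.0 -> at or above 60% of the group
```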
Term
|
Definition
| similar to percentiles, but divides the data set into four equal parts. first quartile = 25th percentile; second quartile = 50th percentile; third quartile = 75th percentile. |
|
|
Term
|
Definition
| distance between the first quartile and the third quartile. contains 50% of all values in the distribution |
|
|
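The quartile cut points and the interquartile range can be sketched with statistics.quantiles; note that different quantile methods can give slightly different cut points:

```python
import statistics

scores = list(range(1, 12))  # hypothetical scores 1 through 11
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
print(q1, q2, q3)  # 3.0 6.0 9.0
print(q3 - q1)     # 6.0 -- the IQR, spanning the middle 50% of values
```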
Term
|
Definition
| linear transformations of raw scores. e.g., z scores, T scores, deviation IQs, stanines, sten scores, and other standard score scales |
|
|
Term
|
Definition
| conveys the value of a score in terms of how many standard deviations it is above or below the mean of the distribution. the mean for a distribution of z scores is 0, the SD is 1.0. the range of z scores is approx. -3.0 to +3.0 |
|
|
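The z-score transformation described above is a one-liner; the raw score, mean, and SD here are made up:

```python
def z_score(raw, mean, sd):
    """Standard deviations the raw score sits above (+) or below (-) the mean."""
    return (raw - mean) / sd

# e.g., a raw score of 115 on a scale with mean 100 and SD 15
print(z_score(115, 100, 15))  # 1.0
print(z_score(85, 100, 15))   # -1.0
```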
Term
|
Definition
| fixed mean of 50. fixed SD of 10. T scores will always be positive whole numbers. very popular in psychology, often used to report personality test results |
|
|
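Since T scores are a linear transformation of z (fixed mean 50, fixed SD 10), the conversion is simple; rounding keeps them positive whole numbers across the usual z range:

```python
def t_score(z):
    """T = 50 + 10z, reported as a whole number."""
    return round(50 + 10 * z)

print(t_score(1.5))   # 65
print(t_score(-0.5))  # 45
```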
Term
|
Definition
| counteracted problems with original conceptualization of IQ as mental age/chronological age. have a mean of 100 and a SD of 15 (usually) |
|
|
Term
|
Definition
| standard scores developed by the College Entrance Examination Board and used by ETS for the SAT and GRE. range from 200 to 800, have a mean of 500, SD of 100 |
|
|
Term
|
Definition
| converts raw scores into values 1 to 9. mean of 5 and SD of 2. constant relationship to percentiles--represent a specific range of percentile scores in the normal curve (a given percentile always falls within the same stanine) |
|
|
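One way to sketch the stanine conversion from a z score: mean 5, SD 2, rounded and clamped to the 1-9 range. Actual scoring tables assign stanines by fixed percentile bands, so treat this as an approximation:

```python
def stanine(z):
    """Approximate stanine from a z score: mean 5, SD 2, clamped to 1-9."""
    return max(1, min(9, round(5 + 2 * z)))

print(stanine(0))     # 5 (average)
print(stanine(1.2))   # 7
print(stanine(-3.0))  # 1 (floor of the scale)
```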
Term
|
Definition
| similar to stanine, but stens range from 1 to 10. mean of 5.5, SD of 2. |
|
|
Term
|
Definition
| (NCE) normalized standard score, range 1 to 99, mean 50, standard deviation of 21.06 |
|
|
Term
|
Definition
| norm-referenced, developmental scores that represent the average score for children at various grade levels. divided into 10 units. intended to represent a student's test performance in terms of the grade level at which the "average" student's performance matches that of the examinee |
|
|
Term
|
Definition
| represent an examinee's test performance in terms of the age at which the "average" individual's performance matches that of the examinee |
|
|
Term
|
Definition
| the degree to which test scores are dependable, consistent, and stable across items of a test, different forms of the test, or across repeat administrations. Considerations: Reliability refers to the results obtained with an assessment instrument, not the instrument itself. An estimate of reliability always refers to a specific type of reliability. Scores on assessment instruments are rarely totally consistent or error free. All instruments are subject to some degree of error and fluctuation. |
|
|
Term
|
Definition
| a fluctuation in scores that results from factors related to the measurement process that are irrelevant to what is being measured |
|
|
Term
|
Definition
| associated with the fluctuation in test scores obtained from repeated testing of the same individual. carryover effect: the first testing session influences scores on the second session. practice effect: during the second administration, test-takers' scores may increase because they sharpened their skills by taking the test the first time. if too long between administrations, scores may be confounded with learning, maturation, or other intervening experiences |
|
|
Term
|
Definition
| the error that results from selecting test items that inadequately cover the content area that the test is supposed to evaluate |
|
|
Term
|
Definition
| same test given twice with time interval between testing. coefficient: stability. sources of error: time sampling |
|
|
Term
| Alternate Forms, Simultaneous Administration |
|
Definition
| Equivalent tests given at the same time. Coefficient: equivalence. Sources of error: content sampling |
|
|
Term
| Alternate forms, delayed administration |
|
Definition
| Equivalent tests given with a time interval between testings. Coefficient: stability and equivalence. Sources of error: time sampling and content sampling |
|
|
Term
|
Definition
| One test is divided into two comparable halves, and both halves are given during one testing session. Coefficient: equivalence and internal consistency. Sources of error: content sampling |
|
|
Term
| KR Formulas and Coefficient Alpha |
|
Definition
| One test given at one time (items compared to other items or to the whole test). Coefficient: internal consistency. Sources of error: content sampling |
|
|
Term
|
Definition
| One test given and two individuals independently score the test. Coefficient: interrater agreement. Sources of error: interrater differences |
|
|
Term
| Internal consistency reliability |
|
Definition
| concerned with the interrelatedness of items within an instrument. evaluates the extent to which the items on a test measure the same ability or trait |
|
|
Term
| Standard error of measurement |
|
Definition
| (SEM) a simple measure of an individual's test score fluctuations (due to error) if he or she took the same test repeatedly. an estimation of the accuracy of an individual's observed score relative to the true score had the individual been tested an infinite number of times |
|
|
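The standard formula behind the SEM uses the test's SD and its reliability coefficient; a sketch with hypothetical values:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# e.g., an IQ-style scale with SD 15 and reliability .91
print(round(sem(15, 0.91), 2))  # 4.5
```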
Term
|
Definition
| indicate the upper and lower limits within which a person's true score will fall. CI 68% = X +/- 1 SEM; CI 95% = X +/- 2 SEM; CI 99% = X +/- 3 SEM |
|
|
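Combining an observed score with the SEM gives the confidence band; a sketch with assumed values (observed score 100, SEM 4.5):

```python
def confidence_interval(observed, sem, n_sems):
    """Band around an observed score: +/-1 SEM ~ 68%, +/-2 ~ 95%, +/-3 ~ 99%."""
    return (observed - n_sems * sem, observed + n_sems * sem)

print(confidence_interval(100, 4.5, 1))  # (95.5, 104.5) -> ~68% CI
print(confidence_interval(100, 4.5, 2))  # (91.0, 109.0) -> ~95% CI
```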
Term
|
Definition
| 1. Identify the problem. 2. Select assessment instrument(s). 3. Score and interpret results. 4. Report and recommendations. |
|
|
Term
| Not until ___ did assessment become commonplace in the U.S. |
|
Definition
|
|
Term
|
Definition
|
|
Term
|
Definition
| standardized vs nonstandardized. individual vs. group. maximum-performance vs. typical performance. verbal vs. nonverbal. objective vs. subjective |
|
|
Term
|
Definition
| formal vs. informal. direct vs. indirect. natural vs. contrived. self-monitoring vs. collateral sources/records |
|
|
Term
| Important points in ethical standards |
|
Definition
| competence: selection, administration, scoring, and interpretation. informed consent. only release data to qualified professionals. Confidentiality is an ethical guideline, not a legal right. Sensitivity to diagnoses. Multicultural issues. |
|
|
Term
| Level A Qualification for assessments |
|
Definition
| bachelor's degree and has read the test manual and is familiar with the purpose of testing (ie. teachers) |
|
|
Term
|
Definition
| master's degree and advanced coursework and training in testing/certification (school counselors) |
|
|
Term
|
Definition
| usually a doctoral degree with advanced training and supervised experience in administering particular tests (LDTC/psychologists) |
|
|
Term
|
Definition
| provide testing accommodations for those who are disabled |
|
|
Term
|
Definition
| school records should be kept private including assessment records |
|
|
Term
|
Definition
| students must be tested at the school's expense |
|
|
Term
|
Definition
| privacy of clients' healthcare records |
|
|
Term
|
Definition
| establishes standards measured through standardized testing |
|
|
Term
|
Definition
| equal access to vocational assessment |
|
|
Term
|
Definition
| provide little meaning to test performance. need to be manipulated to understand test performance. |
|
|
Term
|
Definition
| the percentage of people falling at or below an obtained score; the percentage of people you scored the same as or better than on a test |
|
|
Term
|
Definition
| scores indicating how many standard deviation units the raw score is above or below the mean |
|
|
Term
|
Definition
| a score with a mean of 50 and a standard deviation of 10. always positive; avoids decimals |
|
|
Term
|
Definition
| a score with a mean of 5 and a SD of 2. represents a range of z scores and percentiles (whole number) |
|
|
Term
| normal curve equivalent scores |
|
Definition
| scores that range in equal units along the normal curve. range from 1 to 99. mean = 50. SD = 21.06 |
|
|
Term
|
Definition
| unpredictable factors that affect measurement. not consistent across the sample. called "noise" |
|
|
Term
|
Definition
| factors that affect everyone the same. the error is consistent across sample. called "bias" |
|
|
Term
|
Definition
| proportion of variability explained by true scores. a correlation; ranges from 0 to 1. the higher the reliability coefficient: the closer observed scores are to true scores, the lower the percentage of error in scores, the more test scores are correlated with one another |
|
|
Term
|
Definition
| correlate SCORES from two administrations of the same test. good for a construct that is stable over time. the problem: as time increases, correlations decrease |
|
|
Term
| alternate/parallel/equivalent forms reliability |
|
Definition
| correlate SCORES of similar tests, doesn't have the problem with time (like test-retest), the problem: making "equivalent" tests |
|
|
Term
|
Definition
| the correlation among all the ITEMS in the test. only one form and only one administration: split-half/odd-even reliability, Cronbach's coefficient alpha, Kuder-Richardson formulas |
|
|
Term
| split-half/odd-even reliability |
|
Definition
| equal halves of a single test are correlated. The problem: splitting the test makes it shorter. Reliability increases as test length increases. |
|
|
Term
| Spearman-Brown prophecy formula |
|
Definition
| estimates the reliability of the full-length test from the correlation between its halves, correcting for the fact that each half is only half as long as the full test |
|
|
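For doubling a split half back to full length, the Spearman-Brown correction reduces to 2r / (1 + r); a minimal sketch:

```python
def spearman_brown(r_half):
    """Stepped-up full-test reliability from a half-test correlation: 2r/(1+r)."""
    return 2 * r_half / (1 + r_half)

# a half-test correlation of .60 implies a full-length reliability of .75
print(round(spearman_brown(0.6), 2))  # 0.75
```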
Term
| Cronbach's coefficient Alpha |
|
Definition
| The mean of all split half combinations of the test. Used for tests with rating scale type items. |
|
|
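Alpha can also be computed directly from item variances: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A sketch with a tiny made-up data set (rows = items, columns = examinees):

```python
import statistics

def cronbach_alpha(item_scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(item_scores)                               # number of items
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(statistics.pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / statistics.pvariance(totals))

# hypothetical 4-item rating scale answered by 5 examinees
items = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
]
print(round(cronbach_alpha(items), 2))  # 1.0 -- items rank every examinee identically
```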
Term
| Kuder-Richardson formulas (KR-20/KR-21) |
|
Definition
| The mean of all split half combinations of the test. Used for tests with right/wrong items. |
|
|
Term
| Standard Error of Measurement |
|
Definition
| An estimation of the range in which a true score exists. Reliability of a single test score. |
|
|
Term
|
Definition
| a range of scores that we are "confident" that a person's true score falls within |
|
|
Term
|
Definition
| 68% CI = +/- 1 SEM; 95% CI = +/- 2 SEM; 99% CI = +/- 3 SEM |
|
|
Term
|
Definition
| Does the test measure what it claims to measure? Are the scores meaningful in terms of what the test claims to measure? Can we make decisions about a person based upon the test scores? Validity is based upon the test PURPOSE, not on the test itself. (Look at the situation the test was used in.) |
|
|
Term
| Relationship between validity and reliability |
|
Definition
| A test can be RELIABLE but not valid. (Scores can be consistent but not measure the construct appropriately.) A test can NEVER be VALID but NOT RELIABLE. A test that is not reliable cannot be measuring the construct appropriately. Reliability is necessary for validity. Validity is NOT necessary for reliability. |
|
|
Term
|
Definition
| The adequacy of items in a test. Do the items measure the construct appropriately? How test takers respond to questions (response process) might reveal inadequacies in test items. |
|
|
Term
|
Definition
| The test superficially appears to measure the construct. |
|
|
Term
| Criterion-Related Validity |
|
Definition
| The adequacy of a test score to infer a person's standing on a construct. Examines the relationship between the test score and external variables (or criterion). |
|
|
Term
|
Definition
| the relationship between test scores and the criterion. |
|
|
Term
|
Definition
| (criterion-related validity) the relationship between a test and a criterion measured at the same time |
|
|
Term
|
Definition
| How well test scores predict future performance (criterion-related validity) |
|
|
Term
|
Definition
| The extent to which a test is actually measuring a construct. |
|
|
Term
|
Definition
| (construct validity) a strong positive correlation exists between tests that measure the same construct. Compare a new test with a well-established test. |
|
|
Term
|
Definition
| A low correlation exists between tests that measure different constructs. Compare two unrelated tests. |
|
|
Term
|
Definition
| (construct validity) scores correctly differentiate groups of people |
|
|
Term
|
Definition
| (construct validity) scores increase or decrease with age |
|
|
Term
|
Definition
| (Construct validity) determines if a test is unidimensional or has several dimensions (subscales) |
|
|