Term
Four Pillars of Psychological Assessment |
|
Definition
- Formal Tests - Interviews - Observations - Informal Assessment |
|
|
Term
- Formal Tests |
|
Definition
o Published, standardized, and researched; you know how good a formal test is because of the research behind it |
|
|
Term
Interviews |
|
Definition
Can be very formal or free-floating and open-ended |
|
|
Term
Observations |
|
Definition
o There is an art to knowing what is relevant and what to observe; done in both restricted and free (naturalistic) environments |
|
|
Term
Informal Assessment |
|
Definition
o Things you assess while searching for something else; one may lead to another |
|
|
Term
|
Definition
- Anything we use to help us gain information; the earliest testing was done in China, but it has been around as long as we have studied behavior |
|
|
Term
World War I |
|
Definition
- Where assessments took hold in this country; referred to as the modern era of assessment - Psychological tests were used to classify recruits and place people into appropriate jobs - A device was developed for this: o Army Alpha, a group-administered intelligence test - Interest in psychological tests peaked at this time; psychologists used them for everyone, everywhere - This led to rights being abused when testing was done rampantly without a goal or plan in mind - Legislation had to be passed to prevent abuses, but that didn't take place until the 1970's |
|
|
Term
Army Alpha |
|
Definition
o Group-administered intelligence test o Grandparent of the current classification tests (verbal) o Found a huge illiteracy rate; entrants couldn't read or write well enough to take the test o Led to development of the Army Beta (nonverbal), still group administered |
|
|
Term
|
Definition
- Encompassing for everyone, everywhere - Used as the foundation for further legislation on psychological testing |
|
|
Term
Diana versus Board of Education (1971) |
|
Definition
- The plaintiffs were all Hispanic children with English as a second language; tested in English, they scored badly, were placed into special education, and were thought to be retarded - Their rights were violated by giving the test without the parents' permission, giving it in English, and then segregating the children - The case changed things: schools had to have parents' permission for testing, tests had to be in the child's first language, and if a deficit was found the parents had to permit placement in a special class - Applied to everyone in California of any nationality; other states then addressed similar issues and the movement reached a critical mass |
|
|
Term
Education of Handicapped Act (1974) |
|
Definition
- Federal law requiring parents' permission, in their first language, for testing, and parents' permission for placement - Gave a lot of jobs to psychologists - Created comparable legislation for jobs, institutions, etc. |
|
|
Term
|
Definition
- Anything that accompanies a job must be relevant to the job - Examples: interviews, tests, applications, etc. - The EEOC (Equal Employment Opportunity Commission) is the watchdog of the industry |
|
|
Term
|
Definition
- Happened because individual cases repeated across different states - Laid the foundation |
|
|
Term
Case of Larry P. (1970’s) |
|
Definition
- A class action suit in California brought by parents of African American children - The intelligence tests were biased and put the children at a disadvantage because they were inappropriately normed - A law was passed that conventional intelligence tests couldn't be used for these children in California - A similar case in Illinois did not result in such a law |
|
|
Term
Reliability and Validity |
|
Definition
- Reliability o How consistent a test is; if not consistent, it is worthless - Validity o Does it measure what we want it to; the accuracy of the test - They are interrelated, but different - Must have reliability to have validity |
|
|
Term
- Discrete Data (Dichotomous) |
|
Definition
o Fits into a specific category - Naturally occurring traits or characteristics can be either
|
|
|
Term
- Continuous Data |
|
Definition
o Exists in some degree o Example: height, weight, speed, intelligence, etc. o More complex data is continuous - Naturally occurring traits or characteristics can be either |
|
|
Term
|
Definition
- Used for continuous data - Some degree |
|
|
Term
|
Definition
- Used for discrete data - Specific Category |
|
|
Term
Scales of Measurement |
|
Definition
- Nominal - Ordinal - Interval - Ratio |
|
|
Term
- Nominal |
|
Definition
o Nonparametric/Discrete Data o Examples: name, social security number, etc. o Not very descriptive, but a good start; can be used for anything with a name |
|
|
Term
- Ordinal |
|
Definition
o Nonparametric/Discrete Data o Names and rank order o Tells a little more than nominal |
|
|
Term
- Interval |
|
Definition
o Parametric/Continuous Data o Name, rank, and numbers o A qualitative jump: can tell amounts and intervals o Can't give absolute ratios (no true zero) |
|
|
Term
- Ratio |
|
Definition
o Parametric/Continuous Data o Has an absolute zero o If you can't conceive of a zero for what you're measuring, it isn't ratio data |
|
|
Term
Nonparametric Data |
|
Definition
- Discrete, nominal, and ordinal |
|
|
Term
Parametric Data |
|
Definition
- Continuous, ratio, interval |
|
|
Term
Population |
|
Definition
- Entails everyone/everything that has the certain attribute we are measuring - Usually very large, but can be small as well - All members that have a certain trait |
|
|
Term
Sample |
|
Definition
- A small section of the same population that is representative of it - Must reflect the population |
|
|
Term
Two qualities the sample must have to reflect the population |
|
Definition
- Variety/Heterogeneous/Diverse - Large (largeness alone will not ensure diversity) |
|
|
Term
Descriptive Statistics |
|
Definition
- Allows us to collapse our data - Take a lot of data and make it more manageable |
|
|
Term
Measures of Central Tendency |
|
Definition
- One number that best represents the data/sample - Mean/average, median, and mode |
|
|
Term
Mean |
|
Definition
- The average - Used often - Very susceptible to extremely high or low scores - Example: 1, 2, 2, 1, 3; the average includes and reflects the scores - Example: 1, 2, 2, 1, 30; the average would not reflect central tendency |
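A quick sketch of this outlier effect, using Python's statistics module and the example scores from the card:

```python
from statistics import mean, median

scores = [1, 2, 2, 1, 3]
skewed = [1, 2, 2, 1, 30]

print(mean(scores), median(scores))  # 1.8 2: the mean reflects the data
print(mean(skewed), median(skewed))  # 7.2 2: one extreme score drags the mean
```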
|
|
Term
Median |
|
Definition
- Fifty percent of scores above and fifty percent of scores below - Not really used in statistics |
|
|
Term
Mode |
|
Definition
- The most frequently occurring score |
|
|
Term
Measure of Variability/Dispersion |
|
Definition
- Shows the diversity in the sample - Standard deviation, range, variance |
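A minimal sketch of the three dispersion measures in Python; the scores are hypothetical, and pvariance/pstdev treat the list as a whole population:

```python
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(max(scores) - min(scores))  # 7   <- range: easy to get
print(pvariance(scores))          # 4.0 <- variance
print(pstdev(scores))             # 2.0 <- standard deviation
```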
|
|
Term
Standard Deviation |
|
Definition
- Shows, along with central tendency, how much dispersion there is in the scores - A measure of error |
|
|
Term
Range |
|
Definition
- Easy to get, but you can't do much with it |
|
|
Term
Normal Distribution |
|
Definition
- Bell shaped - Evenly distributed - If the sample is large and diverse and the SD is low, you should have a normal distribution - Mesokurtic |
|
|
Term
Distribution Shapes |
|
Definition
- Refers to the shape - Positively Skewed - Negatively Skewed - Bimodal Distribution |
|
|
Term
- Positively Skewed |
|
Definition
o Skewed to the right o The mean is more positive than it should be, which throws it off |
|
|
Term
- Negatively Skewed |
|
Definition
o Skewed to the left o The mean is more negative than it should be |
|
|
Term
- Bimodal Distribution |
|
Definition
- The shape you get if you put a positively and a negatively skewed distribution together |
|
|
Term
Kurtosis |
|
Definition
- Steepness of the curve - Platykurtic - Leptokurtic - Mesokurtic |
|
|
Term
- Platykurtic |
|
Definition
o A sample with good diversity but a low number of subjects o Still symmetrical, but not bell shaped o Add subjects to fix it |
|
|
Term
- Leptokurtic |
|
Definition
o Extreme population o Large sample size, but low diversity o Symmetrical but not normal |
|
|
Term
- Mesokurtic |
|
Definition
o Normal distribution o High sample size and high diversity |
|
|
Term
Standard Scores |
|
Definition
- Can convert between T, Z, and A scores because of the normal distribution - All have a mean and standard deviation - Based on interval data - Z Distribution, Z score
- T Distribution, T score
- A Distribution, A score
- IQ Scores |
|
|
Term
- Z Distribution, Z score |
|
Definition
o X = raw score o Z score = raw score expressed in standard deviation units o Z = (raw score minus the mean) divided by the standard deviation o In the Z distribution the mean is always zero (and the SD is 1) |
|
|
Term
- T Distribution, T score |
|
Definition
o Mean equals 50 o SD equals 10 |
|
|
Term
- A Distribution, A score |
|
Definition
o Mean equals 500 o SD equals 100 o Scale runs from 200 to 800 |
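A small sketch tying the three standard scores together; the raw-score mean of 100 and SD of 15 are hypothetical, while the T and A constants are the ones given in the cards above:

```python
def z_score(raw, mean, sd):
    return (raw - mean) / sd      # raw score expressed in SD units

def to_t(z):
    return 50 + 10 * z            # T distribution: mean 50, SD 10

def to_a(z):
    return 500 + 100 * z          # A distribution: mean 500, SD 100

z = z_score(115, mean=100, sd=15) # one SD above the mean on a hypothetical test
print(z, to_t(z), to_a(z))        # 1.0 60.0 600.0
```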
|
|
Term
Norms and Standardization |
|
Definition
|
|
Term
- Standardization |
|
Definition
o Process of getting norms |
|
|
Term
Types of Norms |
|
Definition
o Age o Grade o Percentile (%ile) o Stanine |
|
|
Term
o Age |
|
Definition
§ Not based upon interval data § The difference between 2 and 3 year olds is not the same as between 17 and 18 year olds § They collapse after a certain age |
|
|
Term
o Grade |
|
Definition
§ Not equivalent: 1st grade to 2nd is not the same as 11th to 12th § Also not interval data |
|
|
Term
o Percentile (%ile) |
|
Definition
§ Example: 60th %ile means 60% of people scored below you and 40% scored above you § Median = 50th %ile |
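A minimal sketch of computing a percentile rank; the sample scores are hypothetical, and it uses the "percent strictly below" definition from the card:

```python
def percentile_rank(score, scores):
    below = sum(1 for s in scores if s < score)  # people who scored below you
    return 100 * below / len(scores)

sample = [55, 60, 62, 70, 71, 74, 78, 80, 85, 90]
print(percentile_rank(78, sample))  # 60.0 -> 60th %ile: 60% scored below
```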
|
|
Term
o Stanine (Standard Nine) |
|
Definition
§ Divides the distribution into nine sections |
|
|
Term
How we standardize a test |
|
Definition
- Get a sample; it has to be large and heterogeneous |
|
|
Term
Random Sampling |
|
Definition
- A tool that helps get diversity: the random sample - Everyone in your population has the same chance of being chosen for your sample - Without randomization you have a quasi-experiment, not a true experiment - Very difficult; usually variations are used |
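A toy sketch of simple random sampling with Python's random module; the population here is just a stand-in, and the point is that random.sample gives every member the same chance of selection:

```python
import random

population = list(range(100_000))          # stand-in for all members with the trait
sample = random.sample(population, 1_000)  # every member has an equal chance
```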
|
|
Term
Stratified Random Sampling |
|
Definition
- Divide the members of the population into regions, then randomly select individuals from the naturally occurring regions - Difficult to do as well; very expensive |
|
|
Term
Area Sampling |
|
Definition
- A section of the country that does a good job of representing the entire population - Can work if the area is a microcosm of our population |
|
|
Term
Incidental (Convenience) Random Sampling |
|
Definition
- Can make this work - Example: if you live in Fairmont, take your sample from Fairmont State |
|
|
Term
|
Definition
- Tells us how two variables are related |
|
|
Term
Regression |
|
Definition
- Logical outgrowth of a correlation - When we use one variable to predict another - R = coefficient, R stands for regression |
|
|
Term
Two major types of correlation |
|
Definition
- Positive/Direct Correlation
- Negative/Indirect Correlation
- A number less than one - Gives us an idea of how right or wrong our prediction is - Tells us the strength of the correlation - Must square the correlation to interpret it - The sign (+ or -) does NOT tell us the strength - The number DOES tell us the strength |
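A worked example of squaring the coefficient before interpreting it:

$$r = .70 \quad\Rightarrow\quad r^{2} = .49$$

so the two variables share only 49% of their variance, even though .70 looks strong on its face.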
|
|
Term
- Positive/Direct Correlation |
|
Definition
- As one variable increases, the other increases; they move in the same direction |
|
|
Term
- Negative/Indirect Correlation |
|
Definition
- As one variable increases, the other decreases; they move in opposite directions |
|
|
Term
|
Definition
- Which one you use depends on what kind of data you have - Whether your variables are discrete or continuous |
|
|
Term
Pearson R |
|
Definition
Variable A: Job seniority (Continuous) | Variable B: Money (Continuous) | Symbol: R |
|
|
Term
Point-Biserial (Rpbi) |
|
Definition
Variable A: Gender (Discrete) | Variable B: Money (Continuous) | Symbol: Rpbi |
|
|
Term
Biserial (Rbi) |
|
Definition
Variable A: Seniority (Forced Discrete) | Variable B: Money (Continuous) | Symbol: Rbi |
|
|
Term
Tetrachoric (Rt) |
|
Definition
Variable A: Seniority (Forced Discrete) | Variable B: Money (Forced Discrete) | Symbol: Rt |
|
|
Term
Phi (φ) |
|
Definition
Variable A: Gender (Discrete) | Variable B: Response Pattern (Discrete) | Symbol: φ (phi) |
|
|
Term
Eta (η) |
|
Definition
Variable A: Anxiety Level (as it goes up) | Variable B: Test Performance (goes down, curvilinear) | Symbol: eta (η) |
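A sketch of the two coefficients from these tables that have direct SciPy equivalents, with made-up seniority/salary/gender data; pearsonr handles continuous x continuous and pointbiserialr handles true-dichotomy x continuous:

```python
from scipy.stats import pearsonr, pointbiserialr

seniority = [1, 3, 5, 8, 12, 15]         # continuous
salary    = [30, 34, 41, 47, 55, 62]     # continuous
r, p = pearsonr(seniority, salary)       # Pearson R: continuous x continuous

gender = [0, 1, 0, 1, 1, 0]              # true dichotomy
rpb, p = pointbiserialr(gender, salary)  # point-biserial: dichotomous x continuous
print(r, rpb)
```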
|
|
Term
Linear Relationship |
|
Definition
- Between the 2 variables the results don't change direction: positive is always positive and negative is always negative |
|
|
Term
Curvilinear Relationship |
|
Definition
- Some degree of continuous data, but somewhere the direction changes - What was positive is now negative and so on; then you must use eta |
|
|
Term
Sources of Error |
|
Definition
- Could be the subject itself - The actual test presenters/examiners - The actual test itself - You can never eliminate error, just keep it at a minimum - Acknowledge its existence and deal with it - Reliability and validity help us tell how much error there is |
|
|
Term
- Could be subject itself |
|
Definition
o Hawthorne Effect o Example: sick, unprepared, on purpose, etc. |
|
|
Term
Actual test presenters/examiners |
|
Definition
o Rosenthal Effect o Example: persuasion, bias, unprepared, etc. |
|
|
Term
- Actual test itself |
|
Definition
o Example: unclear, bad directions, not normed, etc. |
|
|
Term
Reliability |
|
Definition
- Any assessment device should have reliability and validity - Validity = accuracy - Reliability o Consistency of any predictor o How consistently do we measure ____________? o X = true score plus error |
|
|
Term
Predictor |
|
Definition
· Anything we use to forecast or predict something · Examples: observations, interviews, research, tests |
|
|
Term
Methods of measuring reliability (types) |
|
Definition
§ Test-Retest Reliability § Equivalent/Alternate Form Reliability § Internal Consistency |
|
|
Term
§ Test-Retest Reliability |
|
Definition
· Example: give the 1st test, then time passes and you give the 2nd test; correlate the results · Interpret the coefficient directly; with all other correlations you must square it · Advantages: o Straightforward · Disadvantages: o Difficult to do again o Fewer individuals return o They may have learned; they are not the same people (carryover effect) o Impractical o Takes time and resources |
|
|
Term
§ Equivalent/Alternate Form Reliability |
|
Definition
· Same material, same format · Different questions · Example: give subjects Form A at one point in time, then wait, give Form B, and correlate those results · Advantages: o Straightforward · Disadvantages: o The forms may not be equivalent o Difficult to do again o Fewer individuals return o They may have learned (carryover effect) o Impractical o Takes time and resources |
|
|
Term
§ Internal Consistency Reliability |
|
Definition
· Only one administration of the test · Comparing the test to itself · Three components of Internal Consistency: o Split-Half (Odd-Even) o Cronbach's Coefficient Alpha o Kuder-Richardson 20 (KR20) |
|
|
Term
o Split-Half (Odd-Even) |
|
Definition
§ Treat the one test as two tests by splitting it in half; compare the even items to the odd items (recommended) § Comparing the first half (items 1-25) to the last half (26-50) is not recommended, because a person can get tired or run out of time § The correction formula is called the Spearman-Brown Formula |
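A minimal sketch of odd-even split-half reliability with the Spearman-Brown correction; it assumes Python 3.10+ for statistics.correlation, and item_scores is a hypothetical people-by-items score table:

```python
from statistics import correlation  # Python 3.10+

def split_half_reliability(item_scores):
    # item_scores: one list of item scores per person
    odd  = [sum(person[0::2]) for person in item_scores]  # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    r_half = correlation(odd, even)      # correlation between the two half-tests
    return (2 * r_half) / (1 + r_half)   # Spearman-Brown full-length estimate
```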
|
|
Term
o Cronbach’s Coefficient Alpha |
|
Definition
§ Compares every item to every other item (e.g., item 1 to 2, then 2 to 3, etc.) with continuous data: interval or ratio (inter-item comparisons) |
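A sketch of the standard coefficient-alpha formula, alpha = k/(k-1) * (1 - sum of item variances / total variance); the people-by-items data layout is hypothetical:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    # item_scores: one list of interval/ratio item scores per person
    k = len(item_scores[0])                                   # number of items
    item_vars = sum(pvariance(col) for col in zip(*item_scores))
    total_var = pvariance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)
```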
|
|
Term
o Kuder-Richardson 20 (KR20) |
|
Definition
§ Compares every item to every other item (e.g., item 1 to 2, then 2 to 3, etc.) with discrete (dichotomous) data (inter-item comparisons) |
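The KR20 analogue for right/wrong items, using the standard KR20 formula in which the sum of pq (item difficulty times its complement) replaces the item variances:

```python
from statistics import pvariance

def kr20(item_scores):
    # item_scores: one list of 0/1 (right/wrong) item scores per person
    k = len(item_scores[0])
    p_values = [sum(col) / len(col) for col in zip(*item_scores)]
    pq = sum(p * (1 - p) for p in p_values)   # variance of each 0/1 item
    total_var = pvariance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - pq / total_var)
```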
|
|
Term
How consistent our predictor is… |
|
Definition
- What is the likelihood of getting the same result at a later point in time; comparing one administration to another - Rare to get the exact same result, but there is an acceptable range, based on the correlation and the standard deviation of the test |
|
|
Term
Standard Deviation |
|
Definition
- A measure of error; we want it to be the lowest number possible - A measure of diversity; the more extreme the scores, the more variability |
|
|
Term
Standard Error of Measurements (SEmeas): |
|
Definition
- The standard error is directly related/correlated to the standard deviation of the test, whereas it is negatively/inversely related to the reliability - You don't have to calculate it, but you must be able to explain and understand how it works - Used as a measure of reliability; based on a hypothetical construct, the normal distribution; it is a type of standard deviation - Formula: SEmeas = SD × √(1 − reliability) (the more error, the more dispersion, and vice versa; we want high reliability) |
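A worked sketch of the formula as written, with a hypothetical SD of 15 and two reliability values:

```python
import math

def se_meas(sd, reliability):
    return sd * math.sqrt(1 - reliability)  # SEmeas = SD * sqrt(1 - r)

print(se_meas(15, 0.89))  # ~4.97: high reliability keeps the error band narrow
print(se_meas(15, 0.50))  # ~10.6: low reliability means much more error
```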
|
|
Term
Standard Error of Difference (SEdiff): |
|
Definition
- Used when we measure very different attributes or characteristics (modalities) of the same individual (apples and oranges) - Example: Verbal skills versus Performance skills; different, but both contribute to mental abilities - Formula: SEdiff = √(SEmeas of the first test squared + SEmeas of the second test squared) |
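A worked sketch using the standard form of the formula, in which each SEmeas is squared under the radical; the input values are hypothetical:

```python
import math

def se_diff(se_meas_a, se_meas_b):
    return math.sqrt(se_meas_a**2 + se_meas_b**2)

# e.g., comparing Verbal vs Performance subtests with SEmeas of ~4.97 each
print(se_diff(4.97, 4.97))  # ~7.03: a difference score carries more error
```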
|
|
Term
Factors that affect reliability: |
|
Definition
- Configuration of your sample o Large o Heterogeneous (diverse) - Items that comprise our test o Large numbers (the larger, the more reliable the test will be) o Content homogeneous in nature |
|
|
Term
Inter-Rater Reliability |
|
Definition
- Can two different individuals use the same instrument and get the same results? - Kappa statistic o A type of correlation that measures inter-rater reliability |
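A sketch of the kappa statistic using scikit-learn's implementation; the two rating vectors are made up:

```python
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 0, 1, 1, 0, 1, 0, 0]  # two raters scoring the same 8 subjects
rater_b = [1, 0, 1, 0, 0, 1, 0, 1]
print(cohen_kappa_score(rater_a, rater_b))  # 0.5: agreement corrected for chance
```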
|
|
Term
Validity |
|
Definition
- Does the predictor measure what you want it to measure? (accuracy) - The most important quality: it has to be accurate - Independent from reliability, but related - For any predictor to be valid it must be reliable (to be accurate it must be consistent), but just because a predictor is reliable does not mean it is valid - Three main types of validity: content, criterion-related, and construct |
|
|
Term
- Content Validity |
|
Definition
o Very important in any test that measures achievement or accomplishment o Non-statistical in nature, no real coefficient (number); experts in the field must agree the content is appropriate |
|
|
Term
- Face Validity |
|
Definition
o Just looking at the test: does it look okay? Not a true measure of validity |
|
|
Term
- Criterion-Reference Validity (Criterion Related Validity) |
|
Definition
o Very statistical in nature o Little v coefficient ??? o Relationship between the score and ??? |
|
|
Term
o Criterion |
|
Definition
§ some reference point o Two methods of Measuring Criterion: § Predictive Method/Validity § Concurrent Method/Validity |
|
|
Term
§ Predictive Method/Validity |
|
Definition
· Most statistically ideal · Doesn't have a restricted range, but not practical · Restricted range: o Very bad for reliability and validity; we want diversity · Example: o Develop a test and give it to all applicants right up front; have the employer put all applicants on the job, then get on-the-job ratings from the employers; the applicants who scored highest on the test should have been rated the highest as well, and then you can use the test to predict the best employees; v = .70 must be squared, so it is actually only 49% good prediction |
|
|
Term
§ Concurrent Method/Validity |
|
Definition
· Not as statistically sound · Example: give the test to secretaries who are already on the job, as many as you can; the highest-rated ones should score the highest on the test, and the correlation is used as the coefficient · Has a restricted range, but there is diversity in the job |
|
|
Term
|
Definition
- .80 is good; even .70 isn't bad, though once squared it is only 49% - You can use validity coefficients even lower than that, because we use multiple predictors combined; we never depend on ONE predictor alone |
|
|
Term
Multiple Predictors |
|
Definition
- When we combine multiple predictors, we calculate the total package (battery) - The total won't be lower than the validity of the single highest predictor - Most likely the combined package will be higher - The high scores bring up the low ones - Two methods of using Multiple Predictors: - Multiple Regression
- Multiple Cut-offs |
|
|
Term
- Multiple Regression |
|
Definition
o Seems to be more fair o High scores compensate for low ones o Example: if you had two bad semesters early in your college career but your last two years were great, it would be overlooked because of good recommendations and GRE scores; you might get into grad school anyway |
|
|
Term
- Multiple Cut-offs |
|
Definition
o More exclusionary o Must have a minimum score on each predictor to make the cut o Example: you want to be in the Air Force and scored great on the written test, but didn't pass the vision test; you won't get into the Air Force to fly planes |
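A toy sketch contrasting the two methods; all weights, cutoffs, and scores below are hypothetical:

```python
def regression_decision(scores, weights, threshold):
    # Multiple regression: a weighted composite, so high scores
    # can compensate for low ones
    composite = sum(w * s for w, s in zip(weights, scores))
    return composite >= threshold

def cutoff_decision(scores, cutoffs):
    # Multiple cut-offs: must clear the minimum on EVERY predictor
    return all(s >= c for s, c in zip(scores, cutoffs))

applicant = [90, 55]  # great written test, weak vision test
print(regression_decision(applicant, [0.5, 0.5], 70))  # True: composite 72.5
print(cutoff_decision(applicant, [70, 60]))            # False: fails the vision cutoff
```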
|
|
Term
Predictors and Criterions |
|
Definition
- Can be anything - Predictor examples: tests, observations, etc. - Criterion examples: getting into grad school, good grades, etc. - We can set the predictor wherever we want, although we have no control over the criterion; instead we just try to foreshadow it - Example: you could make it easier to get into medical school, but a doctor cannot function with a low level of knowledge; it would be dangerous to patients - We want to find the point that indicates the minimum score at which the criterion still occurs |
|
|
Term
Decision Quadrants |
|
Definition
- Four quadrants o False Negative/False Rejection (incorrect) § Example: scored badly, we didn't hire, but they would have been good o True Positive/Valid Acceptance (correct) § Example: scored well, we hired, they were good o True Negative/Valid Rejection (correct) § Example: scored badly, we didn't hire, they would have been bad o False Positive/False Acceptance (incorrect) § Example: scored well, we hired, but they were bad - Correct and incorrect decisions based upon the testing - If you decrease the valid acceptances and the false acceptances, you will increase the false rejections and valid rejections, and vice versa - You can only move the predictor line left and right; you CANNOT move the criterion line at all - To make it harder to get in, move the predictor to the right, especially if errors are costly (Example: the astronaut program) - To make it easier to get in, move the predictor to the left |
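A small sketch of the four quadrants as a function of a predictor cutoff; the scores and cutoff are hypothetical:

```python
def decision_quadrant(predictor_score, cutoff, succeeded):
    hired = predictor_score >= cutoff
    if hired and succeeded:      return "valid acceptance (true positive)"
    if hired and not succeeded:  return "false acceptance (false positive)"
    if not hired and succeeded:  return "false rejection (false negative)"
    return "valid rejection (true negative)"

# Moving the cutoff right (higher) trades false acceptances for false rejections
print(decision_quadrant(80, 70, succeeded=True))  # valid acceptance
print(decision_quadrant(65, 70, succeeded=True))  # false rejection
```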
|
|
Term
Utility |
|
Definition
- How effective the predictor is in helping us make a prediction/decision - Utility is dependent upon three things: o Validity
o Base rate
o Selection Ratio
o In order to have good utility we want high validity, a high base rate, and a low selection ratio |
|
|
Term
o Validity |
|
Definition
§ more valid the better of a predictor § want it as high as possible |
|
|
Term
o Base Rate |
|
Definition
§ The percentage of correct decisions you can make without using that particular predictor § Want it to be as high as possible, even 50% § Example: to get into grad school, other things besides the GRE are considered, such as GPA, interview, and reference letters |
|
|
Term
o Selection Ratio |
|
Definition
§ The ratio of the number of openings to the number of applicants § The actual number of individuals you can accept § Want this to be low: have a lot of applicants, which makes the other factors (GRE) more important § Example: when everything else is equivalent (GPA, interview, reference letters), then you will go with the GRE |
|
|
Term
Taylor-Russell Tables |
|
Definition
· These have taken every combination of base rate, selection ratio, and validity (see handout) · Enhance predictive value · A test with zero validity is useless; the selection ratio can't help |
|
|
Term
- Construct Validity |
|
Definition
- Involved in pure research - Statistical in nature; you do statistical research to get the coefficient - Accumulated over time - Very difficult to come up with a single test or number to show it - Must show that the test is measuring the construct we think it is - Very difficult to use |
|
|
Term
o Construct |
|
Definition
o Abstract psychological concept o Way of explaining something, an idea, theory, or a definition o Example: intelligence, personality, motivation, etc. |
|
|
Term
- Methods of measuring Construct Validity |
|
Definition
o Convergent Validity o Discriminant Validity |
|
|
Term
o Convergent Validity |
|
Definition
§ The test must correlate well/highly with measures of the same construct § How do we do it? Example: create an intelligence test for college students, then study whether existing intelligence tests correlate with mine § Must be consistent with other tests |
|
|
Term
o Discriminant Validity |
|
Definition
§ Must not correlate with measures of a different construct; therefore it must discriminate § Example: your intelligence test should not correlate with personality tests; if it did, your intelligence test would really be more of a personality test |
|
|
Term
Two most fundamental traits of a test |
|
Definition
- Power Test - Speed Test |
|
|
Term
- Power Test |
|
Definition
o The individual is given enough time to perform the test, but the content is hard o Mastery of the content produces the high scores |
|
|
Term
- Speed Test |
|
Definition
o Relatively easy items, but you have a limited time to get them all done o Strictly timed period o Mastery is not accuracy, but how fast you can get them done o Don't want to use a speed test with internal-consistency (split-half) reliability, because you would cut your items down |
|
|
Term
|
Definition
- Don’t require something in the selection process that you don’t have to perform in the job. - Example: Commissioned by Sears, to create a selection process for automotive employees, both written test and hands on, but you didn’t have to read to do the job, but you required them to read for the test. |
|
|
Term
Differential Validity |
|
Definition
- Identifies subgroups - Example: you get a large and representative sample, give a test, publish it, and it works well, but for a certain group/minority it doesn't work, so don't use it for that subgroup because it puts them at a disadvantage |
|
|