Term
If a teacher stated that all students performed above average on a test this year, why is this false? |
|
Definition
the average is calculated from the students' own scores, so some students must score at or below it -statistically not possible for everyone to be above average |
|
|
Term
On a criterion-referenced test, will items be more general or more specific? |
|
Definition
More specific |
|
Term
On a criterion reference test, items will be of the same or varying difficulty? |
|
Definition
The same difficulty |
|
Term
On criterion reference test, why are items more specific and of the same difficulty? |
|
Definition
because items are written to match objectives at a specific level, so they vary by one level at most |
|
|
Term
What is the idea behind criterion reference testing? |
|
Definition
to determine whether each student has mastered specific objectives, compared to a predetermined standard rather than to other students |
|
Term
Give an example of a condition in an instructional objective. |
|
Definition
Given a map, calculator, globe, etc... |
|
|
Term
What is the main function of a test blueprint? |
|
Definition
determines if items match instructional objectives |
|
|
Term
What are the components of a well-written objective? |
|
Definition
conditions, criteria, behavior/performance |
|
|
Term
What can we do with correlations? what can we NOT do? |
|
Definition
make predictions, cannot tell cause and effect |
|
|
Term
Which correlation is more useful for making predictions: large negative correlation or large positive correlation? |
|
Definition
Neither -the strength of the correlation (how close it is to -1.0 or +1.0), not its sign, determines how useful it is for prediction |
|
Term
what is the correlation between test scores and criterion called? |
|
Definition
a validity coefficient (evidence of criterion-related validity) |
|
Term
If a teacher-made test has items that match her objectives, then the test should have high... |
|
Definition
content validity |
|
Term
If you compare a newly constructed rating scale that measured honesty to an older scale that also measured honesty, what validity are we looking for? |
|
Definition
concurrent validity -not construct, even though honesty is an unobservable trait |
|
|
Term
Why should you be more confident in reliability of a test if it is given to the whole school, not just one grade level? |
|
Definition
the whole school is a more heterogeneous (mixed) group, which ensures more score variability, and more variability means higher reliability |
|
|
Term
What validity is it when SAT scores are correlated with GPA? |
|
Definition
predictive validity |
|
Term
In test-taker groups, the more similar the population, the less the... |
|
Definition
reliability -a homogeneous group produces less score variability, and less variability lowers reliability |
|
|
Term
what is a criterion referenced test? |
|
Definition
Compares to predetermined standards rather than to other students -measures mastery -standardized or teacher-made -shorter and narrower -reflects extent to which goals are being met |
|
|
Term
How many options are available when making a checklist? |
|
Definition
Two -yes/no, observed/not observed, etc |
|
|
Term
If you test the same people with the same test at a later time, it is what reliability? |
|
Definition
test-retest reliability |
|
Term
If you only have 1 test, what reliability can you calculate? |
|
Definition
Internal consistency ONLY -item-to-item comparisons |
|
|
Term
What kind of validity is most important for regular teachers? |
|
Definition
content validity -because what we test must match up to our standards |
|
|
Term
What is construct validity? |
|
Definition
how well test scores represent behavior predicted by idea/theory/etc -unobservable traits (such as honesty, depression,etc) |
|
|
Term
If a company selects a new hire based off a test, what kind of validity? |
|
Definition
predictive validity |
|
Term
What is a norm-referenced test? |
|
Definition
compare current test takers to past test takers (the norm group) -standardized test -results are "relative" to the norm group |
|
|
Term
"Marie is above average" is norm or crit ref test? |
|
Definition
Norm reference compares her to the norm group |
|
|
Term
"Marie is in top 25% of her class" is norm or crit ref test? |
|
Definition
Norm-referenced -it ranks her relative to her classmates |
|
Term
"Marie got 97 out of 100 questions correct" is norm or crit ref test? |
|
Definition
criterion referenced test -measures mastery |
|
|
Term
Does norm or criterion reference usually have more variability of scores? |
|
Definition
Norm-referenced -it is designed to spread scores out so test takers can be ranked |
|
Term
How many items per objective are usually on Norm-ref tests? |
|
Definition
1 or 2 per objective -covers many objs -purpose is to rank child |
|
|
Term
what does a NON-overlapping band mean in band interpretation? |
|
Definition
-possibly a REAL difference in scores or ability -overlap would mean it is most likely from chance |
|
|
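Study note: a minimal Python sketch of the band-interpretation idea above. The subtest names and scores are made up; each band is simply the obtained score plus or minus one SEM, and overlap suggests the difference could be due to chance.

```python
# Hypothetical example: build score bands (obtained score +/- 1 SEM) and
# check whether two bands overlap. Non-overlapping bands suggest a possibly
# REAL difference; overlapping bands suggest the difference may be chance.

def band(obtained, sem):
    """Return the (low, high) band around an obtained score."""
    return (obtained - sem, obtained + sem)

def bands_overlap(a, b):
    """True if two (low, high) bands share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

reading = band(52, 3)   # (49, 55)
math = band(60, 3)      # (57, 63)
print(bands_overlap(reading, math))  # False -> possibly a real difference
```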
Term
What is the difference between a true score and an obtained score? |
|
Definition
Error -obtained does not take error into account (true does) |
|
|
Term
What is "Alternate Form" reliability? |
|
Definition
reliability b/w 2 diff forms of a test given to the same person at 2 different times -indicates equivalency of forms (such as test forms A & B or retake exams) |
|
|
Term
What are some sources of error? |
|
Definition
-test itself (errors) -test conditions (setting, etc) -test taker (sleepy, not studied) -scoring errors |
|
|
Term
What is a true score? |
Definition
-value representing a score that is free of error -impossible to find because all tests have errors -score that takes error into account |
|
|
Term
What is content validity? |
|
Definition
how much a test matches/measures the teacher's objectives -can also apply to state standards |
|
|
Term
what is the highest level of Bloom's taxonomy? |
|
Definition
Evaluation (called Creating in the revised taxonomy) |
|
Term
What percent of scores falls within ±1, ±2, and ±3 SD of the mean? |
|
Definition
±1 SD ≈ 68%, ±2 SD ≈ 95%, ±3 SD ≈ 99% |
|
|
Term
What is test reliability? |
|
Definition
-measures dependability and consistency over time -should yield same measurement of same variable -3 kinds: test-retest, alternate forms, internal consistency |
|
|
Term
Which measures level of mastery: Norm or Crit? |
|
Definition
Criterion-referenced |
|
Term
Can we observe someone "defend"? |
|
Definition
|
|
Term
How do you make a performance assessment more of a direct measure? |
|
Definition
include a variety of mediums |
|
|
Term
Is it possible to produce a test that is completely without error? |
|
Definition
No -there will always be SOME error |
|
|
Term
Are norm- or criterion-referenced tests more useful in classroom decision making? |
|
Definition
Criterion, because you look at mastery rather than comparison to other students |
|
|
Term
|
Definition
-Standard error of measurement -variability of error scores -NOT the difference between true score and obtained score -as reliability goes down, SEM goes up |
|
|
Term
Jared has obtained score of 86 (SEM 3), so we can be 68% sure his TRUE score lies between __ and ___ |
|
Definition
83 and 89
(1 x SEM = 1 x 3 = 3; 86 - 3 = 83; 86 + 3 = 89) |
|
|
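Study note: a small Python sketch of the 68/95/99 bands used in the Jared cards (obtained score 86, with the SEM of 3 assumed throughout, as in the later cards).

```python
# Compute the true-score range as obtained score +/- (number of SEMs).
def true_score_range(obtained, sem, num_sems):
    margin = num_sems * sem
    return (obtained - margin, obtained + margin)

for num_sems, confidence in [(1, "68%"), (2, "95%"), (3, "99%")]:
    low, high = true_score_range(86, 3, num_sems)
    print(confidence, low, high)
# 68% 83 89   /   95% 80 92   /   99% 77 95
```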
Term
What kind of test will any test in the content area be? |
|
Definition
Achievement test -content validity |
|
|
Term
"Reliability goes up, SEM goes down" is pos. or neg. correlation? |
|
Definition
negative -they go different directions |
|
|
Term
What are the advantages of a performance based assessment? |
|
Definition
-Take different forms -Assess different kinds of skills -Easily worked in to lessons -Higher cog effect needed -active learning |
|
|
Term
Jared has obtained score of 86, we can be 95% sure his true score lies between __ and ___ (SEM 3) |
|
Definition
80 and 92 (2 x SEM = 2 x 3 = 6; 86 - 6 = 80; 86 + 6 = 92) |
|
|
Term
What are the 3 types of validity? |
|
Definition
Content validity Criterion-related validity Construct validity |
|
|
Term
What is the most important validity for classroom teachers? |
|
Definition
content validity |
|
Term
What can you NOT tell from correlation between two variables? |
|
Definition
cause and effect (can only predict) |
|
|
Term
What are you looking for when interpreting a band chart? |
|
Definition
seeing which bars overlap and which do not |
|
|
Term
What is test-retest reliability |
|
Definition
2 scores from same person on same test at diff times -indicates stability |
|
|
Term
What are the 3 characteristics of a good obj? |
|
Definition
specific, observable , measurable |
|
|
Term
What is standard deviation? |
|
Definition
most accurate measure of variability -includes all scores in a distribution -estimate of variability |
|
|
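Study note: a quick Python illustration of standard deviation as a measure of spread, using made-up scores.

```python
import statistics

scores = [40, 45, 50, 55, 60]       # hypothetical test scores
print(statistics.mean(scores))       # 50
print(statistics.pstdev(scores))     # about 7.07; a larger SD means more spread
```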
Term
How many items per obj does crit. ref usually have? |
|
Definition
3+ per obj -emphasis on figuring out where response is lacking -shows mastery |
|
|
Term
What is a curvilinear relationship? |
Definition
two variables move in the same direction at first, up to a point, then show a negative relationship
Ex: drinking and feeling good, up to a point, then you feel sick, so (+)alcohol = (+)feelings then (+)alcohol = (-)feelings |
|
|
Term
What validity is it when you compare test items to objectives? |
|
Definition
content validity |
|
Term
What are the 3 different ways to grade a performance assessment? |
|
Definition
Holistic (don't need to know) -Checklist -Rating scales |
|
|
Term
What is "obtained score"? |
|
Definition
obtained score = true score (+) or (-) error score
the difference between the obtained score and the true score is the error |
|
|
Term
The less variability in a test, the lower the... |
|
Definition
reliability -shorter tests have lower reliability -not enough items to measure more accurately |
|
|
Term
How do you get away from multiple choice tests? |
|
Definition
performance assessments (informal) |
|
|
Term
What is criterion-related validity? |
|
Definition
how well test scores correlate with other measures of the same thing -requires a correlation coefficient to be computed |
|
|
Term
What is negative correlation? |
|
Definition
2 variables move in opposite directions
-as A increases, B decreases, or vice versa -think of negative as a BAD relationship leading to a breakup |
|
|
Term
What can you tell if you calculate less error on one test than another? |
|
Definition
less error = more accurate measure |
|
|
Term
Is there a "one size fits all" test in regards to validity and reliability? |
|
Definition
No
-validity and reliability may only be appropriate for specific population, when administered by competent user |
|
|
Term
What is positive correlation? |
|
Definition
both variables involved move in same direction
-as A increases, B increases
think of positive as a good relationship since they move together |
|
|
Term
In band interpretation, how do you find if a difference in scores is related to chance? |
|
Definition
if the score bands overlap, the difference is most likely due to chance |
|
Term
If my watch reads 2 o'clock but is broken, then the watch is...? |
|
Definition
Reliable but not valid (you always know it will read 2 o'clock, but it is only correct twice a day) |
|
|
Term
Jared has obtained score of 86, we can be 99% sure his true score lies between ___ and ___ (SEM 3) |
|
Definition
77 and 95
(3 x SEM = 3 x 3 = 9; 86 - 9 = 77; 86 + 9 = 95) |
|
|
Term
Which is narrower? Norm or crit referencing |
|
Definition
Criterion ref
covers fewer objectives because you want to know how well each one was mastered |
|
|
Term
If you give a condition in an objective, where else must you give that condition? |
|
Definition
in the test -same condition, same materials, same accuracy |
|
|
Term
What are some disadvantages of performance assessments? |
|
Definition
-less reliable -subjective due to outside influences -time consuming |
|
|
Term
What is the rationale behind performance assessments? |
|
Definition
It provides a direct measure of abilities. |
|
|
Term
Is "with 100% accuracy" a condition? |
|
Definition
No -it is a criterion (a level of performance), not a condition; a condition provides context, such as "given a map" |
|
|
Term
What type of referencing is most appropriate for broad objectives? |
|
Definition
Norm-referenced -criterion-referenced tests work best with narrow, specific objectives |
|
Term
If a student has a raw score of 80 and the SEM is 3, and we want to be 99% confident of the TRUE score range, what would that range be? |
|
Definition
71 and 89 (3 x SEM = 3 x 3 = 9; 80 - 9 = 71; 80 + 9 = 89) |
|
|
Term
When is it okay to put opinion in a True/False question? |
|
Definition
If you attribute it to a source. "according to so and so..." |
|
|
Term
What should you avoid when writing a T/F question? |
|
Definition
-using always/never -opinion |
|
|
Term
What is the best way to test organizational thinking skills? |
|
Definition
Essay items -they require students to select, organize, and integrate ideas |
|
Term
What kind of test item is best for assessing high level thinking skills? |
|
Definition
Essay items (extended response) |
|
Term
What kind of test item are students most likely to guess on? |
|
Definition
True/False and Multiple choice |
|
|
Term
Which test item is the easiest to score? |
|
Definition
|
|
Term
What level is appropriate for restricted essay items? |
|
Definition
Anything lower than application -because they do not have free range on the topic, they are restricted |
|
|
Term
How can you more objectively grade essays? |
|
Definition
Grade one criterion at a time across all essays, and keep the essays anonymous |
|
|
Term
What is the mean of 81, 83, 82? |
|
Definition
82 |
|
Term
What is the median of 10, 12, 8, 9, 7? |
|
Definition
9 (order the scores: 7, 8, 9, 10, 12; the middle score is 9) |
|
Term
What are the measures of central tendency? |
|
Definition
mean, median, and mode |
|
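Study note: the three measures of central tendency computed in Python, using the numbers from the two cards above plus a made-up score set for the mode.

```python
import statistics

print(statistics.mean([81, 83, 82]))         # 82
print(statistics.median([10, 12, 8, 9, 7]))  # 9 (middle of 7, 8, 9, 10, 12)
print(statistics.mode([2, 3, 3, 5, 7]))      # 3, the most frequent score (made-up data)
```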
Term
Which measure of central tendency is the score that is repeated the most? |
|
Definition
Mode |
|
Term
Which measure of CT divides a distribution in half? |
|
Definition
Median |
|
Term
Which measure of CT is the average? |
|
Definition
Mean |
|
Term
Which measure of CT is the 50th percentile? |
|
Definition
Median |
|
Term
Which measure of Central Tendency is the most stable? |
|
Definition
Mean, because it takes every score into account |
|
|
Term
Student scores 49 on vocab test, and the mean for the class is 40, with an SD is 3. What is the z-score and what percentage of the class scored higher? |
|
Definition
z-score = 3
less than 1% scored higher
z = (x - mean) / SD = (49 - 40) / 3 = 3, then look at the bell curve |
|
|
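Study note: the z-score calculation from the card above written out in Python; the percent scoring higher comes from the normal curve (CDF).

```python
import math

def z_score(x, mean, sd):
    return (x - mean) / sd

def percent_above(z):
    """Percent of a normal distribution falling above z."""
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (1 - cdf) * 100

z = z_score(49, 40, 3)                 # 3.0
print(z, round(percent_above(z), 2))   # about 0.13% scored higher (less than 1%)
```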
Term
What is variability? |
Definition
How spread out the scores are in a distribution. |
|
|
Term
Which measure of variability is the most dependable? |
|
Definition
Standard Deviation -Takes all into consideration |
|
|
Term
If a set of scores has a variance of 0, what can you conclude? |
|
Definition
Everyone in the distribution has the same score |
|
|
Term
What happens to SEM when you decrease SD? |
|
Definition
SEM decreases -SEM depends on both SD and reliability (SEM = SD x √(1 - reliability)), so less variability between scores means less error |
|
|
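Study note: a sketch of the standard SEM formula, SEM = SD × √(1 − reliability), with made-up numbers, showing that a smaller SD (or a higher reliability) gives a smaller SEM.

```python
import math

def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

print(round(sem(10, 0.91), 2))  # 3.0
print(round(sem(5, 0.91), 2))   # 1.5 -> smaller SD, smaller SEM
print(round(sem(10, 0.75), 2))  # 5.0 -> lower reliability, larger SEM
```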
Term
How can you change the level of a multiple choice item? |
|
Definition
Change the distractors -make them more plausible -in stem, have them choose "best" answer |
|
|
Term
Should you grade one whole essay at a time (on the same test, by the same person) |
|
Definition
No, grade by criteria for all essays |
|
|
Term
What does the "scoring criteria" for short answer and essay items tell you? |
|
Definition
how many points it is worth, and what you will accept as correct |
|
|
Term
In a normal distribution, approximately what percent of scores lies between T-scores of 40 and 80? |
|
Definition
about 84% -a T-score of 40 is -1 SD and 80 is +3 SD, so roughly 34% + 50% ≈ 84% |
|
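Study note: T-scores have a mean of 50 and an SD of 10, so T = 40 is −1 SD and T = 80 is +3 SD; a short Python check of the percent in between using the normal CDF.

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z_low = (40 - 50) / 10    # -1 SD
z_high = (80 - 50) / 10   # +3 SD
percent = (normal_cdf(z_high) - normal_cdf(z_low)) * 100
print(round(percent, 1))  # about 84.0
```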
Term
NCLB requires assessment at what grades? |
|
Definition
annually in reading and math in grades 3-8, plus once in grades 10-12 |
|
Term
What does NCLB require about how often students are tested? |
Definition
that students be tested annually |
|
|
Term
According to the text, what is the real argument against High Stakes Testing? |
|
Definition
Using one score to make high stakes decisions -that score is only one snapshot in time -it also is biased, produces narrow scores, and causes teachers to "teach to the test" |
|
|
Term
What kind of discussions should teachers have with students about High Stakes Testing? |
|
Definition
|
|
Term
What are the 12 conditions for a HST program according to the American Educational Research Association? |
|
Definition
1. don't use a single score for high stakes decisions 2. everyone should have same resources and learning opportunities 3. validation for each intended separate use (don't use same score to tell graduation, promotion, financing, etc) 4. tell users the possible negative consequences of HST programs 5. test and curriculum are aligned 6. validity of passing scores and achievement levels (what the scores mean) 7. remediation available to those who fail the HST 8. attention to language differences 9. attention to disabilities 10. stick to rules about who will and wont take test (don't tell low-performing students not to come to school that day) 11. sufficient reliability researched for each intended use 12. ongoing evaluation of intended and unintended effects of HST |
|
|
Term
Give an example of a high-stakes decision made about teachers based on test results. |
Definition
Moving teachers around or not renewing contracts because of test results |
|
|
Term
What are some test-taking strategies you can teach students for HST? |
|
Definition
-sleep, breakfast, study -follow directions carefully -read each item, passages, information carefully -manage test-taking time -easier items first -eliminate options before answering -check answers after completing test |
|
|
Term
What does a positively skewed distribution tell you about the scores? |
|
Definition
majority of scores fall below the middle of the score distribution -there are many low scores, but few high scores |
|
|
Term
[image]
positively or negatively skewed? |
|
Definition
[image]
positively skewed
-many low scores, few high scores
-most scores fall below the middle
-tail is toward the positive end of curve
|
|
|
Term
[image]
positively or negatively skewed? |
|
Definition
[image]
negatively skewed
-many high scores, few low scores
-scores lump above middle
-tail is toward neg. end of curve
|
|
|
Term
[image]
| | |
1. 2. 3.
what distribution, and label mean, median, mode |
|
Definition
[image]
| | |
1. 2. 3.
positively skewed
1. mode 2. median 3. mean |
|
|
Term
[image]
| | |
1. 2. 3.
what distribution, and label mean, median, mode |
|
Definition
[image]
| | |
1. 2. 3.
negatively skewed
1.mean 2. median 3. mode |
|
|
Term
Which measure of CT is most frequently used? |
|
Definition
Mean |
|
Term
Which is not affected by extreme scores, the median or the mean? |
|
Definition
median -represents the middle better when scores are skewed |
|
|
Term
What are 2 modes in a distribution called? |
|
Definition
bimodal |
|
Term
what are 3 or more modes in a distribution called? |
|
Definition
multimodal |
|
Term
If each score in a dist. occurs with equal frequency, what is the mode? |
|
Definition
There is no mode |
|
Term
What is the least stable measure of CT? |
|
Definition
mode (a few scores can influence significantly) |
|
|
Term
In a normal distribution, which measure of CT has the most value? |
|
Definition
none, they are all the same value (all reach the same highest points on the bell curve) |
|
|
Term
If the mean is 47, the median is 54, an mode is 59, what is the shape of the distribution? |
|
Definition
negatively skewed -from left to right the values fall in the same order they are listed (mean, median, mode) |
|
|
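Study note: a tiny Python helper that mirrors the rule in the card above, inferring skew direction from how the mean compares to the median.

```python
def skew_direction(mean, median):
    if mean < median:
        return "negatively skewed (tail toward the low scores)"
    if mean > median:
        return "positively skewed (tail toward the high scores)"
    return "roughly symmetric"

print(skew_direction(47, 54))  # negatively skewed
```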
Term
What does the semi-interquartile range do? |
|
Definition
prevents extreme scores from distorting the estimate of variability (such as when everyone scores in the 40s but one person scores a 90)
only the middle 50% of scores is used; the top and bottom 25% are left out |
|
|
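Study note: a sketch of the semi-interquartile range in Python; the one extreme score (90) barely changes it because only the middle 50% of scores is used.

```python
import statistics

scores = [41, 42, 43, 44, 45, 46, 47, 48, 90]   # one extreme score
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
siqr = (q3 - q1) / 2
print(round(siqr, 2))  # small, because the middle 50% is tightly clustered
```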
Term
What does each quartile in the SIQR mean? |
|
Definition
Q1 is the point below which 25% of scores lie; Q2 is the median (50%); Q3 is the point below which 75% of scores lie (the top 25% fall above it) |
|
|
Term
What is the most commonly used estimate of variability? |
|
Definition
|
|
Term
What is the most accurate measure of variability? |
|
Definition
Standard Deviation -includes all scores in a distribution |
|
|
Term
|
Definition
how much a single estimated score actually describes all scores within the range |
|
|
Term
What do large and small score values represent in SD? |
|
Definition
large SD means more variability, smaller SD means less |
|
|
Term
How can you tell the strength of a correlation? |
|
Definition
how close the numbers are to -1.0 or +1.0 |
|
|
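Study note: a hand-rolled Pearson correlation in Python, with made-up data, showing that the sign gives the direction and the closeness to -1.0 or +1.0 gives the strength.

```python
def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

hours_studied = [1, 2, 3, 4, 5]          # hypothetical data
test_scores = [55, 60, 68, 72, 80]
print(round(pearson_r(hours_studied, test_scores), 2))  # near +1.0: strong positive
```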
Term
What does the sign (-/+) tell us about a correlation? |
|
Definition
whether it is negative or positive |
|
|
Term
What does a correlation of .00 mean? |
|
Definition
there is no relationship at all -highs and lows on one variable are NOT consistently associated with highs or lows on the other |
|
|
Term
How do you tell pos or neg correlation from scatterplots? |
|
Definition
the direction of the slope -points sloping downward from left to right indicate a negative correlation; points sloping upward from left to right indicate a positive correlation |
|
|
Term
"Does the test measure what it is supposed to test?" is asking about... |
|
Definition
validity |
|
Term
"Does the test yield the same or similar score scores consistently?" is asking about... |
|
Definition
reliability |
|
Term
"Does the test score closely approximate an individuals true level of ability, skill or aptitude?" is asking about... |
|
Definition
accuracy |
|
Term
Why must a test have validity? |
|
Definition
To show that it measures what it says it measures (such as a 3rd grade math test actually testing 5th grade math, or even a math test actually testing reading skills) |
|
|
Term
If someone inspects test questions to see if they correspond to what should be covered by the test, they are looking at what kind of validity? |
|
Definition
content validity |
|
Term
What is the problem with content validity? |
|
Definition
-it only tells whether the test LOOKS valid; the test might actually be measuring something else (such as guessing ability, reading skill, etc) -it is also hard to judge for abstract constructs |
|
|
Term
What are the three main forms of validity? |
|
Definition
content validity criterion-related validity construct validity |
|
|
Term
If matching test to the standards, what validity is it? |
|
Definition
content validity -making sure what is meant to be testing is actually being tested |
|
|
Term
What is concurrent validity? |
|
Definition
The correlation between two tests that measure the same thing when given to the same people |
|
|
Term
What is the difference between Construct validity and other forms of validity? |
|
Definition
Content, concurrent, and predictive validity all compare the test against something else (standards, other tests, future scores), while construct validity measures something abstract that is hard to test or hasn't been tested before |
|
|
Term
What evidence is each of these compared to? Content validity Concurrent validity predictive validity |
|
Definition
Content: test compared to standards or objectives Concurrent: test compared to other tests of same thing Predictive: test scores compared to future scores or abilities |
|
|
Term
Which type of validity is evidence for alignment with curriculum? |
|
Definition
content validity -making sure test matches the standards and curriculum |
|
|
Term
What does reliability tell you? |
|
Definition
consistency with which a test yields the same rank for individuals who take the test more than once (think broken watch example) |
|
|
Term
What are the types of reliability? |
|
Definition
Test-Retest Alternate forms Internal consistency |
|
|