Shared Flashcard Set

Details

7165 Evaluating changes
Evaluating clinically significant change (Final)
45
Psychology
Graduate
11/27/2011

Cards

Term
Evaluating clinically significant change
Definition
There are a lot of problems with relying on statistical significance for evaluating change.

Evaluating clinically significant change has become a large area of research bc of this
Term
Outcome measures should be:
Definition
Reliable

Valid

Responsive

The first two are pretty easy to meet, but you also have to be able to detect change (responsiveness), and this is often at odds with reliability
Term
Reliable
Definition
produces same result when administered on 2 or more occasions - consistent
Term
Valid
Definition
Measures what it is intended to measure

Indexed by sensitivity, specificity, correlational/regression analyses
Term
Responsive
Definition
Ability of an instrument to detect clinically important tx effects

No consensus on how to define or quantify this

No gold standard for summarizing responsiveness

A proliferation of responsiveness statistics is available, but none has been established as best
Term
Types of Responsiveness
Definition
Internal & External
Term
Internal responsiveness
Definition
Ability of a measure to change over prespecified time frame

Change in a measure within the context of tx

usually evaluated in pretest/posttest repeated measures design

Measures reliable change over time - kind of the opposite of test-retest reliability (wouldn't detect change)
Term
External responsiveness
Definition
More similar to criterion-related validity (being able to predict a person's standing based on their score on your measure)

Extent to which changes in a measure over time relate to corresponding changes in reference measure

Reflects relationship b/t changes in a measure & changes in external standard

Changes in measure are not primary interest, changes in external standard are

Examples: changes in blood pressure (measure) and changes in frequency of heart attacks (external standard)

Measures valid change over time of predictor and criterion
Term
What are some indices of change you can use to evaluate the internal responsiveness of a measure?
Definition

Paired t test - tests the hypothesis that there was no change in avg response over time - $t = (\bar{X}_{\text{Time 1}} - \bar{X}_{\text{Time 2}}) / (SD_D / \sqrt{n})$, where $SD_D$ is the SD of the change scores

Effect Size I (Cohen's d) - $(\text{Time 1} - \text{Time 2}) / SD_{\text{Time 1}}$ - Conventional Cohen standards: .20 small, .50 moderate, .80 large

Effect Size II - Responsiveness-Tx Coefficient (aka Efficiency Index) - Ratio of observed change to SD of change scores - $(\text{Pre} - \text{Post}) / SD_{\text{change}}$

Effect Size III (Guyatt Index of Responsiveness) - $(\text{Time 1} - \text{Time 2}) / \sqrt{2 \times MSE}$ - Denominator of ES adjusts for spurious changes arising from measurement error - this makes it unique among these indices (it accounts for msmt error) - more popular in other fields
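
The four indices above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the example scores, and the `mse_stable` value are all hypothetical, and the MSE for the Guyatt index is assumed to be supplied by the analyst rather than estimated here.

```python
import math
import statistics

def internal_responsiveness(time1, time2, mse_stable=None):
    """Return paired t, ES I, ES II and (optionally) the Guyatt index."""
    n = len(time1)
    diffs = [a - b for a, b in zip(time1, time2)]  # Time 1 - Time 2 per person
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)              # SD of the change scores
    result = {
        "paired_t": mean_diff / (sd_diff / math.sqrt(n)),
        "es_1": mean_diff / statistics.stdev(time1),  # change / SD at Time 1
        "es_2": mean_diff / sd_diff,                  # change / SD of change
    }
    if mse_stable is not None:
        # Assumption: the mean squared error is passed in by the caller
        result["guyatt"] = mean_diff / math.sqrt(2 * mse_stable)
    return result

# Hypothetical problem-behavior scores for five students (lower = better)
pre = [20, 24, 18, 22, 26]
post = [14, 20, 15, 16, 20]
stats = internal_responsiveness(pre, post, mse_stable=4.0)
print(stats)
```

Note that ES II comes out larger than ES I here only because the change scores in this made-up sample vary less than the Time 1 scores do.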

Term
Why is Guyatt Index of Responsiveness unique from other measures of internal responsiveness?
Definition
It accounts for measurement error and the spurious changes that arise from it
Term
Interpretation of Effect Size Statistics
Definition
All stats reflect change in a measure over 2 occasions

Observed change in a measure may not reflect important change in an individual's bx (a social validity issue)

One way to validate change is to compare % of changed participants vs % of unchanged participants (Binomial Effect Size Display - BESD - may be a better way to represent this)
Term
Examples of socially valid change
Definition
addresses whether a change in target bx represents a socially important change

- change in academic engaged time and change in ADHD status (does a change in engaged time result in or represent a change in ADHD status? That's the more important thing)

Change in aggressive bx and change in conduct disorder status etc
Term
What are some indices of change you can use to measure the external responsiveness of a measure? 
Definition

ROC (Receiver Operating Characteristic curve) Method

Correlation

Regression models

Term
ROC Method for external responsiveness
Definition
Sensitivity (measure correctly classifies Ss who demonstrate change on external criterion)

Specificity (measure correctly classifies Ss who do not demonstrate change on external criterion)

Provides useful overview of relationship b/t a measure (predictor) and external indicator of change

Major disadvantage is external change criteria must be dichotomized (improved/not improved - sacrifices info on magnitude of change)
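
As a rough sketch of the two quantities on this card, the snippet below computes sensitivity and specificity for a single cut point, assuming both the measure and the external criterion have already been dichotomized into changed/unchanged. The function name and data are hypothetical; a full ROC analysis would repeat this across many cut points on the measure.

```python
def sensitivity_specificity(measure_changed, criterion_changed):
    """Sensitivity and specificity of a measure against a dichotomized criterion."""
    pairs = list(zip(measure_changed, criterion_changed))
    true_pos = sum(1 for m, c in pairs if m and c)
    false_neg = sum(1 for m, c in pairs if not m and c)
    true_neg = sum(1 for m, c in pairs if not m and not c)
    false_pos = sum(1 for m, c in pairs if m and not c)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical data: True = classified as changed / improved
measure = [True, True, False, True, False, False, True, False]
criterion = [True, True, True, False, False, False, True, False]
sens, spec = sensitivity_specificity(measure, criterion)
print(sens, spec)
```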
Term
Correlation index of external responsiveness
Definition
Correlation b/t change scores on predictor and criterion

How well do change scores on a predictor predict change scores on a criterion?


Let X be social skills predictor scores
Let Y be academic achievement scores
X1-X2=Change score on social skills (Dx)
Y1-Y2=Change scores on academic achievement (Dy)
Correlate Dx with Dy
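
The steps on this card can be sketched as follows. The scores are invented for illustration and the helper `pearson_r` is written out by hand so the example has no dependencies; it is one possible way to do the calculation, not a prescribed procedure.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores at Time 1 and Time 2
x1 = [10, 12, 9, 14, 11]   # social skills, Time 1
x2 = [13, 13, 12, 20, 12]  # social skills, Time 2
y1 = [80, 85, 78, 90, 82]  # academic achievement, Time 1
y2 = [86, 86, 83, 99, 85]  # academic achievement, Time 2

dx = [a - b for a, b in zip(x1, x2)]  # X1 - X2
dy = [a - b for a, b in zip(y1, y2)]  # Y1 - Y2
r_change = pearson_r(dx, dy)
print(round(r_change, 2))
```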
Term
Regression model of external responsiveness
Definition
Typical regression model: $D_x = a + b_y X_1 + b_y X_2 + b_y X_3 + \dots + \text{error}$

Allow for multiple predictors of change criterion (changed/unchanged)

Several analyses are possible
- Logistic regression - fewer assumptions, easier to meet
- Discriminant function analysis - many assumptions
Term
Assessment and analysis of clinically significant change
Definition
Traditionally, change is established by NHST (null hypothesis statistical testing)
- comparisons made b/t 2 or more groups before and after tx
- researcher tries to reject the null (p < .05)

Statistical sig does not equal clinical sig
- stat sig can be obtained by increasing sample size
- clinical sig is more difficult to establish
Term
What is clinical significance?
Definition
Can refer to meaningfulness of a symptom in diagnosing a disorder (red spots and measles)

Can refer to reduction of risk factors for disease (reduced blood pressure and fewer heart attacks)

in terms of change, refers to the meaning of observed change in an individual
Term
Determining clinical significance - 2 things needed
Definition
Amount of change large enough it is not due to msmt error (reliable change)

AND a post-tx level of functioning closer to nonclinical population (cutoff point)

Ways of looking at these:
- Reliable Change Index
- Cut off point: 3 possible methods
Term
Reliable Change Index for evaluating first aspect of clinical sig
Definition
$RCI = (\text{Posttest} - \text{Pretest}) / \sqrt{2\,(S_{\text{pretest}}\sqrt{1 - r_{\text{test-retest}}})^2}$

Numerator represents difference score

Denom represents msmt error (based on test/retest reliability)

If |RCI| > 1.96 then a reliable change has occurred (p < .05)
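
A small sketch of the RCI calculation, assuming the standard error of measurement is the pretest SD times the square root of one minus the test-retest reliability, as the denominator on this card indicates. The function name and the example values are hypothetical.

```python
import math

def reliable_change_index(pretest, posttest, sd_pretest, r_test_retest):
    """RCI = difference score divided by the standard error of the difference."""
    se_measurement = sd_pretest * math.sqrt(1 - r_test_retest)
    s_diff = math.sqrt(2 * se_measurement ** 2)
    return (posttest - pretest) / s_diff

# Hypothetical client: symptom score drops from 40 to 30
rci = reliable_change_index(pretest=40, posttest=30, sd_pretest=7.5, r_test_retest=0.80)
is_reliable = abs(rci) > 1.96  # reliable change at p < .05
print(round(rci, 2), is_reliable)
```

The sign of the RCI only shows the direction of change; whether a negative value means improvement depends on how the measure is scored.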
Term
Cutoff points: 3 methods for establishing that post-tx functioning is closer to nonclinical pop
Definition
1. 2 SD from the mean of dysfunctional pop (in direction of functionality)

2. 2 SD from the mean of functional pop (in direction of dysfunctionality)

3. halfway between the means of the functional and dysfunctional populations
Term
Reliable change and C
Definition
c is the recommended cutoff point

Calculated by:
$c = (S_{\text{nonclinical}} M_{\text{clinical}} + S_{\text{clinical}} M_{\text{nonclinical}}) / (S_{\text{nonclinical}} + S_{\text{clinical}})$
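
The cutoff formula can be checked with a short sketch. The function name and the population values below are made up for illustration; each mean is weighted by the other population's SD, so c sits closer to the mean of the population with the smaller SD, and equal SDs put c exactly halfway between the two means.

```python
def cutoff_c(mean_clinical, sd_clinical, mean_nonclinical, sd_nonclinical):
    """Cutoff c: each population mean weighted by the other population's SD."""
    numerator = sd_nonclinical * mean_clinical + sd_clinical * mean_nonclinical
    return numerator / (sd_nonclinical + sd_clinical)

# Hypothetical populations on a symptom scale (higher = more dysfunctional)
c_unequal = cutoff_c(mean_clinical=40, sd_clinical=8, mean_nonclinical=20, sd_nonclinical=4)
c_equal = cutoff_c(mean_clinical=40, sd_clinical=5, mean_nonclinical=20, sd_nonclinical=5)
print(round(c_unequal, 2), c_equal)
```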
Term
Reliable change categories
Definition
Improved: reliable change w/o crossing cutoff point

Recovered: reliable change & crosses cutoff point

Deteriorated: change in the direction of dysfunctionality
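
The three categories can be expressed as a small decision rule. This sketch makes several assumptions that are not on the card: an "unchanged" label for cases without reliable change, a requirement that deterioration also be reliable, and a `lower_is_better` flag for the scoring direction. All names and values are hypothetical.

```python
def classify_change(rci, posttest, cutoff, lower_is_better=True):
    """Label one client's outcome from the RCI and the cutoff point."""
    if abs(rci) <= 1.96:
        return "unchanged"  # assumption: no reliable change in either direction
    improved_direction = rci < 0 if lower_is_better else rci > 0
    if not improved_direction:
        return "deteriorated"
    crossed = posttest < cutoff if lower_is_better else posttest > cutoff
    return "recovered" if crossed else "improved"

# Hypothetical clients on a symptom scale with cutoff 26.67
labels = [
    classify_change(-2.11, 24, 26.67),
    classify_change(-2.11, 30, 26.67),
    classify_change(2.50, 45, 26.67),
    classify_change(-1.00, 38, 26.67),
]
print(labels)
```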
Term
Things to consider in making sure your clinical sig and outcomes measures are valid
Definition

General measures versus specific measures

 

Monomethod bias (MTMM logic) - sometimes you get consistent change on one method but not on other methods (e.g. child self-report says less anxious but parents and teachers show same rates as before); this doesn't look good for tx

 

Construct irrelevant variance

 

Construct underrepresentation

 

Social validity of outcome measure

Type I measures: Socially valid outcomes (arrest rates, dropout rates, ODRs, retention rates, referral rates)

Type II measures: Correlate with Type I measures but not socially valid (DOs, DBRs, ratings by others)

Type III measures: Not correlated with Type I or Type II measures

Term
Reliability considerations for clinical sig and outcome measures
Definition
Regression effects
= Affected by unreliability
= Affected by distance from the mean (tails of the distribution)


Use of difference scores

- Errors of measurement are additive
- Pretest score has error & posttest score has error

- Error Pretest + Error Posttest

- Could be solved by using residualized difference scores (regressed change scores based on reliability coefficient)
Term
Primary means of evaluating clinically significant change
Definition
[image]
Term
Meanings of clinical significance
Definition
Amount or degree of change

Reduction of most symptoms

Reduction of some symptoms (not in normative range - medium change)

No symptom reduction but better able to cope with symptoms (no change)

Measurement issues
- standardized instruments
- instruments without norms
- actual change vs perceived change (change is in the eye of the beholder)
Term
Change matrix for determining
Definition
[image]
Term
Indices of change

Gresham (2005)
Cheney et al (2008)
Definition
Absolute change in bx (most liberal)

Percent of Non-overlapping data points (PND)

Percent change from baseline - sensitive to change (but we don't know how large a % change we need to have a clinically significant change)

Effect size - sensitive to change

Reliable change index (RCI) - most conservative

Identification of change sensitive bx ratings
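
Two of the indices listed here lend themselves to a quick sketch. The definitions used are assumptions, since the card only names the indices: PND is taken as the percent of treatment-phase points that fall below the lowest baseline point (for a bx expected to decrease), and percent change is taken from phase means. Function names and data are hypothetical.

```python
def pnd_decrease(baseline, treatment):
    """Percent of treatment points below the lowest baseline point."""
    floor = min(baseline)
    non_overlapping = sum(1 for t in treatment if t < floor)
    return 100 * non_overlapping / len(treatment)

def percent_change_from_baseline(baseline, treatment):
    """Percent reduction in the phase mean relative to the baseline mean."""
    mean_bl = sum(baseline) / len(baseline)
    mean_tx = sum(treatment) / len(treatment)
    return 100 * (mean_bl - mean_tx) / mean_bl

# Hypothetical daily counts of disruptive bx
baseline = [8, 9, 7, 10]
treatment = [6, 5, 7, 4, 3]
pnd = pnd_decrease(baseline, treatment)
pct_change = percent_change_from_baseline(baseline, treatment)
print(pnd, round(pct_change, 1))
```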
Term

Ways to identify change sensitive bx ratings scales:

indices of change Gresham (2005) Cheney et al (2008)

Definition

1. Paired sample t tests (p < .01)

2. Effect size estimates: $(\text{Time 1} - \text{Time 2}) / SD_{\text{pooled}}$

3. Calculate internal consistency reliabilities (coefficient alpha)

4. Apply Spearman-Brown to estimate reliabilities for item reduction: $r_{kk} = kr / (1 + (k - 1)r)$

 

Spearman-Brown estimate (10 items): $.5(.90) / (1 + (.5 - 1)(.90)) = .45/.55 = .82$

Spearman-Brown estimate (5 items): $.5(.82) / (1 + (.5 - 1)(.82)) = .41/.59 = .69$
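
The Spearman-Brown formula is easy to verify numerically. In this sketch, `k` is the ratio of new test length to original length and `r` is the original reliability; the function name is hypothetical, and the starting reliability of .90 with k = .5 follows the card's example.

```python
def spearman_brown(r, k):
    """Predicted reliability when test length is multiplied by k."""
    return k * r / (1 + (k - 1) * r)

half_length = spearman_brown(0.90, 0.5)            # scale cut to half its items
quarter_length = spearman_brown(0.90, 0.25)        # scale cut to a quarter
halved_twice = spearman_brown(half_length, 0.5)    # halving the halved scale
print(round(half_length, 2), round(quarter_length, 2), round(halved_twice, 2))
```

Halving twice gives the same estimate as applying k = .25 once, which is a handy consistency check when reducing items in stages.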

Term
Absolute change in bx index of change
Definition
Most liberal

Amount of change from BL to post-intervention levels of performance

An individual no longer meeting established criteria for ED

Total elimination of bx px
Term
Informant discrepancies and change
Definition
Meta-analysis of 119 studies showed diff informants (teacher-parent-child) produce discrepant ratings (r = .20s) (Achenbach et al, 1987)

Informant discrepancies also referred to as level of agreement, informant disagreement, discordance among informant ratings etc

No single measure of social bx is a gold standard

No theoretically relevant rationale provided to explain discrepancies among raters - why do they disagree?

Question: How do we meaningfully compare or combine discrepant ratings?

Impact of discrepant ratings:
- Assmt and classification of psychopathology (diff prevalence rates)
- Tx of childhood/adolescent psychopathology (meta-analytic outcomes)
Term
Correlates of informant discrepancies - child characteristics
Definition
Age (less discrepancies for younger)

Gender (no diff)

Ethnicity/Race (lower agreement among African American samples)

Social desirability (children rate px bx more favorably than other raters)

Px type (externalizing less discrepant)

All correlations in the low-moderate range
Term
Correlates of informant discrepancies - Parent characteristics
Definition
Depression (maternal)

Anxiety (maternal)

stress (few studies)

SES (inconsistent findings)


--- Little attn paid to family characteristics
Term
Theoretical framework

ABC Model (attribution-bias-context)
Definition
Actor-observer phenomenon & perspective taking & recall

Actor-Observer Phenomenon
- Observers of another person’s behavior attribute causes to dispositional/internal qualities
- Teacher may attribute the cause of hitting another child to a trait of aggressiveness (downplay context)
- Child may attribute own hitting behavior to being teased (context emphasized)
- Most heavily weighted information comes from parents/teachers (attribution of traits)
-Most behavior rating scales focus on dispositional qualities:
----Shy/timid
----Gets distracted
----Argues
- Virtually all items are decontextualized

Perspective taking & recall
- individuals recall events to support particular views
- individuals ignore events that do not conform to their views
- studied extensively in cognitive dissonance research
- differences in ratings may result from diff perspectives
----parents who want intervention for aggressive bx will likely rate aggressive bx highly
- Diff informants access & weight info from memory recall
Term
Implications of ABC model
Definition
- Current methods of assmt can be modified to reduce discrepancies
- context in which ratings occur is crucial in explaining discrepancies
--- teachers' ratings of ADHD symptoms are often discrepant from parent ratings
- No single informant's ratings can be used as gold standard
- one could build context into extant bx rating scales
----hits others when teased or provoked by peers
Term

Ways to evaluate informant discrepancy rates

 

Indices of informant discrepancies

Definition
Correlational analyses
- Pearson r (correlations among diff informants)
- q correlations (Pearson r applied to different informants - teacher-parent)

Difference scores
- raw and unstandardized
- standardized (convert to z-scores)
- residual (use one rater to predict another rater's ratings - regression-based approach)
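
The three kinds of difference score can be sketched as below. This is one plausible reading of the card, with hypothetical names and ratings: the standardized version subtracts z-scores computed within each rater, and the residual version regresses one rater on the other with ordinary least squares and keeps what is left over.

```python
import statistics

def z_scores(values):
    """Convert raw ratings to z-scores within one rater."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def discrepancy_scores(rater_a, rater_b):
    """Raw, standardized, and residual difference scores (rater_a minus rater_b)."""
    raw = [a - b for a, b in zip(rater_a, rater_b)]
    standardized = [za - zb for za, zb in zip(z_scores(rater_a), z_scores(rater_b))]
    # Residual: what is left of rater_a after predicting it from rater_b
    mean_a = statistics.mean(rater_a)
    mean_b = statistics.mean(rater_b)
    slope = (sum((b - mean_b) * (a - mean_a) for a, b in zip(rater_a, rater_b))
             / sum((b - mean_b) ** 2 for b in rater_b))
    intercept = mean_a - slope * mean_b
    residual = [a - (intercept + slope * b) for a, b in zip(rater_a, rater_b)]
    return {"raw": raw, "standardized": standardized, "residual": residual}

# Hypothetical ratings of the same five children
teacher = [12, 15, 9, 20, 14]
parent = [10, 18, 11, 16, 15]
scores = discrepancy_scores(teacher, parent)
print(scores["raw"])
```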
Term
Evidence-based interventions - criteria

APA task force on promotion and dissemination of psychological procedures
Definition
Random assignments of Ss to intervention and control conditions

Careful specification of pop undergoing intervention

Use of a manual detailing intervention

Multiple outcomes measures (raters naive to conditions) - reduces monomethod bias and rater bias

Statistically significant differences b/t intervention and comparison groups

Replication of findings supporting intervention by independent investigators
Term
Construct validity and interventions
Definition
The construct that the test measures exists

Variations in measurement outcomes causally produced by variations in construct

Is intervention capable of changing construct

If intervention can change construct, does variation in admin of interventions causally produce changes in measures of outcome?

Examples:
- does construct of depression exist?
- is depression a multidimensional construct?
- do tx for depression exist and can they change depression?
- are there diff ways of measuring the depression construct?
Term
Categories for classifying evidence for change (Kazdin)
Definition

Best evidence for change

Evidence for probable change

Limited evidence for change

No evidence for change

Term
Categories for classifying change (Kazdin)

Best evidence for change
Definition
- At least 80% of findings from multiple informants show significant results

- No pattern of measurement-specific, informant-specific, or method-specific results (change appears on all measures)

- Majority of evidence for change across range of informants, measures, and methods suggest intervention changes the construct
Term
Categories for classifying change (Kazdin)

Evidence for probable change
Definition
- More than 50% of findings from multiple informants, measures, & methods show significant results

- No clear pattern of informant-specific, measure-specific, or method-specific results

- Simple majority of evidence for change across range of informants, measures, & methods
Term
Limited evidence for change
Definition
- 50% or less of findings across informants, measures, or methods show significant results

- Pattern of informant-specific, measure-specific, and/or method-specific results

- Sparse evidence that intervention changed the dimension of the construct
Term
 No evidence for change
Definition

No significant results reported

No evidence for change - Intervention did not change the construct
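
The four categories above can be turned into a rough decision rule. The sketch below is an interpretation, not Kazdin's procedure: it reduces each category to the percent of significant findings plus a yes/no flag for informant-, measure-, or method-specific patterns, and it assumes that any case with more than 50% significant findings but a specific pattern falls into the limited category. All names are hypothetical.

```python
def classify_evidence(n_significant, n_findings, specific_pattern):
    """Assign one of the four evidence categories from summary counts."""
    if n_significant == 0:
        return "No evidence for change"
    pct = 100 * n_significant / n_findings
    if pct >= 80 and not specific_pattern:
        return "Best evidence for change"
    if pct > 50 and not specific_pattern:
        return "Evidence for probable change"
    # Assumption: everything else, including patterned results, is limited
    return "Limited evidence for change"

examples = [
    classify_evidence(9, 10, specific_pattern=False),
    classify_evidence(6, 10, specific_pattern=False),
    classify_evidence(4, 10, specific_pattern=True),
    classify_evidence(0, 10, specific_pattern=False),
]
print(examples)
```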

Term
Meta-analytic results and change
Definition
Average effect sizes in child interventions range from .16 to 1.14

Effect sizes informant specific

Average effect size for agoraphobia ranged from .44 to 2.66

Dependent upon source (clinician ratings versus self-report or performance)

Future meta-analyses should classify change according to categories