Shared Flashcard Set

Details

Biostatistics 623
MLR, PHR and Log Regression
52
Mathematics
Graduate
12/23/2009

Cards

Term
Name three methods to check for model adequacy (fit)?
Definition
Adjusted variable plot (AVP): visual aid. Look for nonlinear patterns, unequal variance, outliers, and unusual patterns or points.
Residual plot: visual aid. Look for the same features as in the AVP.
Assessing sensitivity to outliers: determine the sensitivity (robustness) of the results to unusual observations by comparing the model with and without those observations.
Term
How do you interpret an interaction term in a logistic regression?
Definition
The interaction coefficient is a difference of log ORs; exponentiated, it is a ratio of ORs (OR/OR).
It is the factor by which the OR of being an outcome case (exposed vs. non-exposed to factor 1) is multiplied when comparing those exposed vs. non-exposed to factor 2.
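A small numeric sketch of this interpretation, assuming a hypothetical model logit(p) = b0 + b1*exp1 + b2*exp2 + b3*exp1*exp2. The coefficient values are made up for illustration and do not come from any data set.

```python
import math

# Hypothetical coefficients (illustrative only)
b1 = 0.5   # log OR for exposure 1 when exposure 2 = 0
b3 = 0.4   # interaction coefficient

or_exp1_when_exp2_0 = math.exp(b1)        # OR for exposure 1 among exp2 = 0
or_exp1_when_exp2_1 = math.exp(b1 + b3)   # OR for exposure 1 among exp2 = 1

# The ratio of the two ORs equals exp(b3)
ratio_of_ors = or_exp1_when_exp2_1 / or_exp1_when_exp2_0
```

The same exponentiation step is what separates coefficient output from OR output.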
Term
What is the difference between logit and logistic in Stata?
Definition
With logit, you will get the coefficient.
With logistic, you get the OR.
Term
Can an F test be used for Logistic regression?
Definition
No. For logistic regression you would use a likelihood ratio test (LRT). It plays the same role that the F test plays in ANOVA for linear models: it measures the change in fit between nested models.
Term
What will the LRT allow us to do?
Definition
Compare a null (nested) model with an extended model.
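A minimal sketch of the LRT calculation, assuming the extended model adds exactly one parameter (1 degree of freedom). The function name and the log-likelihood values are hypothetical.

```python
import math

def lrt_one_df(loglik_null, loglik_extended):
    """LRT statistic and p-value when the extended model adds ONE parameter."""
    stat = 2.0 * (loglik_extended - loglik_null)
    # For 1 df, the chi-square upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods of a null and an extended model
stat, p = lrt_one_df(-120.5, -117.0)
```

With more than one added parameter the statistic is the same, but the p-value needs a chi-square distribution with the matching degrees of freedom.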
Term
What will AIC allow us to do?
Definition
Compare across different models, including models that are not nested. The model with the lower AIC is preferred.
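As a sketch, AIC can be computed from a model's log-likelihood and its number of estimated parameters. The values below are hypothetical.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical models fitted to the same data
aic_small = aic(-120.5, 2)
aic_large = aic(-117.0, 3)
preferred = "large" if aic_large < aic_small else "small"
```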
Term
How can you select a model that better fits the data in a logistic regression?
Definition
Visually (lowess smoothing) or with a likelihood ratio test, which allows PAIRWISE comparisons of nested models: keep the extended model when the LRT is significant.
Term
How can we deal with potential confounding?
Definition
Stratify
Adjust (with regression analysis).
Propensity Scores (especially when we have many covariates).
Term
What is Woolf's Method for Pooling OR estimates?
Definition
It is a statistical method for pooling stratum-specific OR estimates that gives more weight to the less variable (more precise) estimates.
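A minimal sketch of inverse-variance pooling of log ORs in the spirit of Woolf's method, assuming each stratum is a 2x2 table (a, b, c, d) with no zero cells. The function name and the example counts are hypothetical.

```python
import math

def woolf_pooled_or(tables):
    """Pool stratum-specific ORs with weights 1 / var(log OR)."""
    num = den = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        var = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf variance of the log OR
        weight = 1.0 / var                    # more precise strata weigh more
        num += weight * log_or
        den += weight
    pooled = num / den
    se = math.sqrt(1.0 / den)
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    return math.exp(pooled), ci

# Two hypothetical strata with the same OR of 4
pooled_or, (lo, hi) = woolf_pooled_or([(10, 20, 5, 40), (10, 20, 5, 40)])
```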
Term
What is a Propensity Score?
Definition
It is a method used to deal with the possibility of having multiple confounders. It is the PROBABILITY of being treated as a function of the potential confounders, usually estimated with a logistic regression. Since it is a probability, it ranges from 0 to 1. PS = Pr(major risk factor = 1 | all the other covariates). Propensity scores do not control for unmeasured confounders.
Term
Assumptions of MLR?
Definition
L: Linear: the relation between Y and the X's is a line.
I: Independent: each observation is independent.
N: Normal: each X has associated Y's that follow a normal distribution.
E: Equal variance: variability is equal or constant.
This is assessed with a histogram
Term
What is the interpretation of B1 in an AVP?
Definition
It is the ASSOCIATION between X1 and Y after adjusting for all the other X's. AVP is a graphical aid that allows us to visualize an MLR coefficient. It is a way to keep examining our assumptions. We could see in this plot a non-linear relation that was missed
Term
ANOVA
Definition
Are the group means the same? ANOVA model explores differences in expected outcomes between categories of categorical variables.
Term
Linear Spline?
Definition
Does the trend change at a fixed point? It is a broken arrow relationship.
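A short sketch of a linear spline with one knot, assuming the usual coding in which an extra term (x - knot)+ is added to the model. The coefficients and the knot are hypothetical.

```python
def spline_term(x, knot):
    """(x - knot)+ : zero before the knot, x - knot after it."""
    return max(x - knot, 0.0)

def predict(x, b0, b1, b2, knot):
    """Slope is b1 before the knot and b1 + b2 after it."""
    return b0 + b1 * x + b2 * spline_term(x, knot)

# Hypothetical coefficients with a knot at x = 6
b0, b1, b2, knot = 1.0, 2.0, -1.5, 6.0
slope_before = predict(5, b0, b1, b2, knot) - predict(4, b0, b1, b2, knot)
slope_after = predict(8, b0, b1, b2, knot) - predict(7, b0, b1, b2, knot)
```

Testing whether b2 = 0 is then a test of whether the trend changes at the knot.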
Term
ANCOVA
Definition
Are the trends the same in 2 or more groups? Used to test for interaction or effect modification (e.g., does the relation between X1 and Y differ by the level of X2?).
Term
What is confounding?
Definition
For variable C to be a confounder of the relation between Y and X, C needs to be related to both Y and X and must not be in the causal pathway. If C is in the causal pathway, it is a mediator. To check for confounding, first fit an SLR of Y on X, then add C and see whether the coefficient of X changes by more than 15% (rule of thumb).
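The rule of thumb can be sketched as a percent change in the coefficient of X. This assumes the change is measured relative to the crude (unadjusted) coefficient, which is one common convention; the coefficient values are hypothetical.

```python
def percent_change(b_crude, b_adjusted):
    """Percent change in the coefficient of X after adding C (relative to crude)."""
    return 100.0 * abs(b_adjusted - b_crude) / abs(b_crude)

# Hypothetical coefficients of X before and after adjusting for C
change = percent_change(2.0, 1.6)
suggests_confounding = change > 15.0   # 15% rule of thumb from the card
```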
Term
Does the rate of change in avg AC per mth of age vary by age?
Definition
SLR
Term
What test would we use for an individual Bj in an MLR?
Definition
A partial t-test or a Wald test, with its 95% CI.
Term
When will we use the lincom command in Stata (Bi+Bj in an MLR)?
Definition
It is used to test the hypothesis H0: Bi+Bj=0 and to obtain its CI.
Term
How would we compare a null model to an extended model?
Definition
We would use a global test for the added covariates in the extended vs. null model. With MLR this would be done with an F test, which looks at the improvement in the fit of the predicted values.
Term
Selecting an MLR depends on?
Definition
1. Question of interest
2. Purpose (e.g. etiology, adjustment, prediction, differing costs for measuring X's).
3. Criteria used: Cross-validation, AIC, R squared (we won't use this one. It is not reliable).
Term
What is the range of odds, p and log odds?
Definition
p: 0 to 1
odds: 0 to inf
log odds: -inf to +inf
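The three scales are linked by simple transformations, sketched here with hypothetical helper names.

```python
import math

def odds(p):
    """Odds from a probability p in (0, 1)."""
    return p / (1.0 - p)

def log_odds(p):
    """Logit: any real number."""
    return math.log(odds(p))

def prob_from_log_odds(z):
    """Inverse logit: maps any real number back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

even_odds = odds(0.5)        # p = 0.5 corresponds to odds of 1
even_logit = log_odds(0.5)   # and to log odds of 0
```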
Term
What test would we use for an individual Bj in a logistic regression?
Definition
We would use a z-test (Wald test), not a t-test as in MLR.
Term
Selecting a Log Regression Model depends on?
Definition
1. Question of interest
2. Purpose (e.g. etiology, adjustment, prediction, differing costs for measuring X's).
3. Criteria used: AIC, cross validated classification error (specificity, sensitivity, ROC curves)
Term
How do you inspect for model fit in a Log Regression?
Definition
1. Look for patterns, influential points and changes in variance.
2. Check influence of influential points (dfits).
3. Hosmer-Lemeshow goodness of fit test.
Term
What is the median survival time?
Definition
It is the time at which the survival function S(t) = 0.5, i.e., the point in time at which 50% of the sample has not yet had the event.
Term
In a survival analysis, what is meant by uncensored data?
Definition
Uncensored data means that we have information for the exact time when the event took place.
Term
What is the Survival function S(t)?
Definition
It is the probability of surviving beyond time t
Term
What are some ways of dealing with censoring?
Definition
1. Ignore incomplete cases. This is a bad idea because it will generate bias. We would never do this.
2. Impute an event time. You fill in the censored data based on assumptions made from a probability model.
3. Use available information of each participant.
Term
In survival analysis, what is meant by ungrouped survival data?
Definition
This means that we have individual information on the exact time at which an event occurs. We would use KM or a Cox regression model.
Term
The Pr distribution of survival times can be described by...?
Definition
1. Cumulative distribution function.
2. Survivor function= 1- CDF
3. Hazard function
4. Density function
Term
What is Hazard?
Definition
The hazard h is Pr(having the event in this interval | no event yet) / length of the interval. In discrete time, h_i = number of events at time i (y_i) / number at risk at time i (n_i).
If the hazard is large, the survival curve decreases rapidly. h(t) = lambda(t)
Term
What is the KM estimate?
Definition
It is the product of (1 - h_i) over all event times up to t, where h_i is the hazard at event time i.
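A minimal sketch of the product, assuming ungrouped data given as follow-up times with an event indicator (1 = event, 0 = censored). The function names and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(event time, S(t))] using S(t) = product of (1 - d_i / n_i)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_t = sum(1 for tt, _ in data if tt == t)              # leaving risk set at t
        d_t = sum(1 for tt, e in data if tt == t and e == 1)   # events at t
        if d_t > 0:
            surv *= 1.0 - d_t / n_at_risk   # multiply by (1 - h_i)
            curve.append((t, surv))
        n_at_risk -= n_t
        i += n_t
    return curve

def median_survival(curve):
    """First event time at which S(t) drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy data: six subjects, two of them censored
curve = kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 1])
median = median_survival(curve)
```

Censored subjects contribute to the number at risk until their censoring time, which is how the available information from each participant is used.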
Term
What is the Weibull survival distribution?
Definition
It is a generalization of the exponential distribution of survival times. It is used because we cannot assume that all survival times follow an exponential decay. Under a Weibull distribution, the complementary log-log (CLL) transformation of the survival function gives a straight-line relation between CLL and log t, which provides a visual check. With Weibull shape p>1 the hazard increases with time; with p=1 the hazard is constant; with p<1 the hazard decreases with time.
Term
What is constant hazard?
Definition
This means that the risk in a small interval is always the same.
Term
What type of probability is the hazard?
Definition
It is a conditional probability. Pr of event|no event yet
Term
What type of probability is the survivor function?
Definition
It is a cumulative probability. S(t)= Pr of event free at Tj
Term
What are two of the main functions of the Complementary Log-Log Function?
Definition
1. It is a visual aid that contrasts the CLL against log time. We need to see a straight line. 2. It is used to estimate the 95% CI.
Term
What is the log-rank test?
Definition
A test used for hypothesis testing in survival analysis to compare 2 different survival curves under the H0 that the curves are equal. The log-rank test gives the same weight to every event time. There are alternative tests that use a different weight at each time, like the generalized Wilcoxon test and the Gehan-Peto-Wilcoxon test.
Term
What is done with grouped survival data?
Definition
We have to bin time into intervals. With ungrouped data we have the exact times. With grouped data we use life tables.
Term
What type of model is used for ungrouped data?
Definition
Cox proportional Hazards regression model
Term
What type of model is used for grouped data?
Definition
The log linear Poisson Regression model
Term
What are the assumptions for the Poisson Regression Model (Log-linear Model)?
Definition
1. Persons in the cohort are independent.
2. For person i, the chance of an event in a period of time is proportional to the length of the interval.
If these assumptions are met, the events observed in the interval will have a Poisson distribution (log-linear model).
Term
What is the parameter of the Poisson (log-linear) model?
Definition
Lambda. Lambda is also referred to as the hazard rate or incidence rate, given that it refers to the expected number of events per person-year.
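A small sketch of lambda as events per person-year, and of the log-linear form in which expected events = person-years * exp(b0 + b1*x). The counts and function names are hypothetical.

```python
import math

def incidence_rate(events, person_years):
    """Estimated lambda: events per person-year."""
    return events / person_years

def expected_events(person_years, b0, b1, x):
    """Log-linear Poisson model with log(person-years) as the offset."""
    return person_years * math.exp(b0 + b1 * x)

# Hypothetical grouped data for unexposed (x = 0) and exposed (x = 1)
rate_unexposed = incidence_rate(10, 1000.0)
rate_exposed = incidence_rate(30, 1500.0)
rate_ratio = rate_exposed / rate_unexposed

# With one binary covariate, the coefficients reproduce the observed rates
b0 = math.log(rate_unexposed)
b1 = math.log(rate_ratio)
```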
Term
How do we make inferences for the Poisson Regression model?
Definition
It is similar to what is used for logistic regression:
1. Maximum Likelihood to calculate B's and SE.
2. Test nested models using LRT for comparing null vs extended models.
3. Use AIC to compare models.
Term
What is the proportional hazards (PH) assumption used in Poisson and Cox PH models?
Definition
The covariate effect is the same at all follow-up times.
Term
What is the difference between constant and Proportional Hazard?
Definition
Constant hazard refers to a hazard that does not change over time. In PH, the hazard might increase or decrease, but the hazard ratio between groups is maintained over time.
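To illustrate the distinction, the sketch below assumes a Weibull baseline hazard h0(t) = lam * p * t^(p - 1) with p > 1, so the hazard is not constant, and a group whose hazard is exp(b1) times the baseline. All parameter values are hypothetical.

```python
import math

def weibull_hazard(t, lam, p):
    """Weibull hazard; it is constant over time only when p == 1."""
    return lam * p * t ** (p - 1)

b1 = 0.7                      # hypothetical log hazard ratio
times = [1.0, 5.0, 10.0]
baseline = [weibull_hazard(t, 0.05, 1.5) for t in times]
exposed = [h0 * math.exp(b1) for h0 in baseline]

# Both hazards rise with time, yet their ratio is the same at every time
ratios = [h1 / h0 for h1, h0 in zip(exposed, baseline)]
```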
Term
What is the meaning of the intercept in a cox proportional hazard model?
Definition
In a CPHM we don't have an intercept; it is replaced by the baseline hazard.
Term
What is the meaning of B1 in a CPHM?
Definition
It is the log of the hazard ratio, or the log relative hazard.
Term
What is the interpretation of b1 in a Cox Proportional Hazard Model?
Definition
B1: difference of log hazard rate comparing the group coded higher vs lower
exp b1: hazard ratio comparing the group coded higher vs lower
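As a sketch, the hazard ratio and a Wald-type 95% CI can be obtained by exponentiating b1 and its confidence limits. The coefficient and standard error below are hypothetical.

```python
import math

def hazard_ratio_ci(b1, se, z=1.96):
    """Return (HR, lower, upper) by exponentiating b1 and b1 +/- z*se."""
    return math.exp(b1), math.exp(b1 - z * se), math.exp(b1 + z * se)

# Hypothetical Cox coefficient and its standard error
hr, lower, upper = hazard_ratio_ci(0.7, 0.25)
excludes_one = lower > 1.0 or upper < 1.0
```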
Term
What is the Weibull distribution?
Definition
It is a general form of the survivor function. It is helpful to describe the distribution of survival times
Term
What is the Complementary log-log transformation?
Definition
It is the log of the negative log of the S(t). It has 3 uses:
1. Plot CLL vs log t. If we get a straight line, then the Weibull distribution fits.
2. We can estimate the CI and SE.
3. Check for proportional hazards. If the assumption is met, we would see parallel lines in a graph.
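The straight-line property in use 1 can be checked numerically. This sketch assumes the Weibull survivor function is parameterized as S(t) = exp(-lam * t^p), so that CLL = log(lam) + p * log(t); the parameter values are hypothetical.

```python
import math

def cll(s):
    """Complementary log-log: log(-log(S(t))) for S(t) in (0, 1)."""
    return math.log(-math.log(s))

def weibull_survival(t, lam, p):
    """Weibull survivor function under the assumed parameterization."""
    return math.exp(-lam * t ** p)

lam, p = 0.1, 1.5
t1, t2 = 2.0, 8.0
y1 = cll(weibull_survival(t1, lam, p))
y2 = cll(weibull_survival(t2, lam, p))

# Slope of CLL against log t recovers p; the intercept recovers log(lam)
slope = (y2 - y1) / (math.log(t2) - math.log(t1))
intercept = y1 - slope * math.log(t1)
```

Under proportional hazards, two groups would share the slope and differ only in the intercept, which is why their CLL curves look parallel.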