Term
| What is univariate statistics? |
|
Definition
| Each individual gives rise to a single measurement |
|
|
Term
| What is multivariate statistics? |
|
Definition
| Each individual gives rise to a vector of measurements |
|
|
Term
| What is Longitudinal Data? |
|
Definition
| Each subject gives rise to a vector of measurements, but this represents the same RESPONSE measured at a sequence of observation times. |
|
|
Term
| What can be studied in a longitudinal study? |
|
Definition
1. Changes over time within individuals (ageing effects) 2. Differences among people in their baseline levels (cohort effects). |
|
|
Term
| Can longitudinal data analysis (LDA) be approached with common statistical techniques? |
|
Definition
| No, LDA requires special techniques because the set of observations on one subject tends to be correlated. |
|
|
Term
| Why are special methods needed for LDA? |
|
Definition
| Because the assumption of independence is violated. If we ignored the correlation that exists, we would get inefficient estimates and wrong inferences. |
|
|
Term
| What is the effect on the inferences if we don't take correlation into account? |
|
Definition
1. Pessimistic estimate of precision 2. Smaller than expected test statistic 3. Larger than expected p-value 4. Wider CI than expected 5. Misleading results. |
|
|
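A minimal simulation sketch of the card above, assuming a two-visit design with a shared subject-level effect. The sample size, effect size, and standard deviations are hypothetical values chosen only for illustration. It compares the standard error of the within-subject change when the pairing is ignored with the standard error that uses the pairing.

```python
import math
import random
import statistics

random.seed(1)
n = 200            # number of subjects (hypothetical)
true_change = 0.5  # hypothetical within-subject change between visits

y1, y2 = [], []
for _ in range(n):
    u = random.gauss(0, 2)  # subject-specific level shared by both visits
    y1.append(u + random.gauss(0, 1))
    y2.append(u + true_change + random.gauss(0, 1))

diffs = [b - a for a, b in zip(y1, y2)]
estimate = statistics.mean(diffs)

# Naive SE: treats the two visits as two independent samples.
se_naive = math.sqrt(statistics.variance(y1) / n + statistics.variance(y2) / n)
# SE that respects the within-subject correlation (paired differences).
se_paired = math.sqrt(statistics.variance(diffs) / n)

print(f"estimate={estimate:.3f} naive SE={se_naive:.3f} paired SE={se_paired:.3f}")
```

In this setup the naive standard error is the larger of the two, which is the "pessimistic estimate of precision" for a within-subject comparison.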
Term
| What is efficiency? |
|
Definition
| Term used in the comparison of various statistical procedures. EFFICIENCY is usually defined using the variance or the mean squared error (MSE) as the measure of desirability. So an inefficient estimator has a larger MSE (or a larger variance, if unbiased) than a more efficient estimator. |
|
|
Term
| What are the most important challenges in LDA? |
|
Definition
1. Observations tend to be autocorrelated 2. Correlation needs to be modeled |
|
|
Term
| What type of response can we have in LDA? |
|
Definition
|
|
Term
| What is Beta L in longitudinal data? |
|
Definition
| The expected change in Y over time per unit change in x relative to its baseline value. |
|
|
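The deck does not write out the model behind Beta L. One common textbook formulation, given here as an assumption, separates a cross-sectional coefficient (written beta_C below) from the longitudinal coefficient beta_L, where x_{i1} is the baseline covariate value for subject i:

```latex
Y_{ij} = \beta_0 + \beta_C\, x_{i1} + \beta_L\,(x_{ij} - x_{i1}) + \varepsilon_{ij},
\qquad i = 1,\dots,m,\quad j = 1,\dots,n_i
```

Under this form, beta_C compares subjects who differ at baseline, while beta_L describes the change in Y as x moves away from its baseline value within a subject.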
Term
| Who is used as a control in a longitudinal study? |
|
Definition
| Each individual is used as his or her own control. In a longitudinal study, we can borrow information from other individuals if the variability among people is small, or rely on the same person's data if the variability among people is large. |
|
|
Term
| What are the 3 approaches used in LDA? |
|
Definition
1. Marginal model 2. Random effects model 3. Transition model |
|
|
Term
| What are the characteristics of the marginal model? |
|
Definition
Coefficients describe/compare subpopulations. exp(Beta 1) = ratio of the odds of Y for two X groups (e.g. ratio of the odds of RTI for two vitamin A groups). With binary responses, models for odds ratios are preferred to correlations. Population-average models. |
|
|
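A small arithmetic sketch of the exp(Beta 1) interpretation in a marginal logistic model with one binary covariate. The two group probabilities are hypothetical numbers, not results from any study.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

# Hypothetical population-average probabilities of the outcome in two X groups.
p_x0 = 0.20  # group with X = 0
p_x1 = 0.10  # group with X = 1

# In logit P(Y = 1) = beta0 + beta1 * X, beta1 is the difference in log-odds.
beta0 = logit(p_x0)
beta1 = logit(p_x1) - logit(p_x0)

odds_ratio = math.exp(beta1)
direct_ratio = (p_x1 / (1 - p_x1)) / (p_x0 / (1 - p_x0))
print(odds_ratio, direct_ratio)
```

Both quantities are the same odds ratio, computed once from the coefficient and once directly from the group odds.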
Term
| What are the characteristics of the random effects model? |
|
Definition
| Correlation among Yij caused by a LATENT variable Ui. Conditional, subject specific models. |
|
|
Term
| What are the characteristics of the transition model? |
|
Definition
| Past response has an effect on current response. Conditional models. |
|
|
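For a binary response, the three approaches can be written side by side. This is a sketch in one common notation, assumed here rather than taken from the deck; the starred coefficients are used to stress that the three models do not share the same parameters or interpretation.

```latex
\begin{align*}
\text{Marginal:} \quad
  & \operatorname{logit} P(Y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} \\
\text{Random effects:} \quad
  & \operatorname{logit} P(Y_{ij} = 1 \mid U_i) = \beta_0^{*} + U_i + \beta_1^{*} x_{ij} \\
\text{Transition:} \quad
  & \operatorname{logit} P(Y_{ij} = 1 \mid Y_{i,j-1}) = \beta_0^{**} + \beta_1^{**} x_{ij} + \alpha\, Y_{i,j-1}
\end{align*}
```

The marginal model conditions on nothing but x, the random effects model conditions on the latent U_i, and the transition model conditions on the previous response.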
Term
| Name the EDA techniques used for the mean in LDA? |
|
Definition
1. Average and distribution plots (boxplots, quantiles). Best for equally spaced observations. 2. Scatterplot of the response vs time. Best with unequally spaced observations. 3. Smoothed plots: used to highlight the response as a function of time or of an explanatory variable. They use all the data. Examples: lowess and kernel smoothers. 4. Line plots (spaghetti plots). They use all the data. 5. Separate the cross-sectional and longitudinal information. |
|
|
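A minimal sketch of the smoothed-plot idea in item 3, assuming a Gaussian kernel smoother (a simpler relative of lowess, swapped in here for brevity). The function name, the data, and the bandwidth are all hypothetical.

```python
import math

def kernel_smooth(times, values, grid, bandwidth):
    """Gaussian kernel (Nadaraya-Watson) estimate of the mean response on `grid`."""
    smoothed = []
    for t0 in grid:
        # Observations close to t0 in time receive larger weights.
        weights = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in times]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, values)) / total)
    return smoothed

# Hypothetical pooled (time, response) pairs from several subjects.
times = [0.0, 0.5, 1.0, 1.2, 2.0, 2.5, 3.0, 3.1, 4.0]
values = [1.0, 1.4, 1.9, 2.3, 3.1, 3.4, 4.2, 3.9, 5.0]
grid = [0.0, 1.0, 2.0, 3.0, 4.0]

fit = kernel_smooth(times, values, grid, bandwidth=0.75)
print([round(f, 2) for f in fit])
```

Because the times are pooled across subjects, the smoother does not need equally spaced observations. A larger bandwidth gives a flatter curve.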
Term
| Name the EDA techniques used for the correlation in LDA? |
|
Definition
1. Empirical covariance 2. Residual "pairs" plots 3. Variograms (for unequally spaced observations) |
|
|
Term
| What is a zero average plot? |
|
Definition
| Zero average plot. Technique where Yij is regressed on tij to get residuals rij. Then a one-dimensional summary of each subject's residuals is chosen, e.g. gi = median(ri1, ..., rini). rij is plotted against tij using points, and lines are added for the subjects at selected quantiles of gi. |
|
|
Term
| What is the ACF? |
|
Definition
| This is the autocorrelation function (ACF). Used to study equally spaced data that are roughly stationary. For non-discrete time points, the ACF can be calculated from the variogram. |
|
|
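A rough sketch of estimating the autocorrelation at a given lag for equally spaced data, assuming the residuals are stationary so that pairs can be pooled over subjects and over time. The function name and the residual values are hypothetical.

```python
def pooled_autocorrelation(residuals_by_subject, lag):
    """Correlation of residual pairs `lag` visits apart, pooled over subjects."""
    first, second = [], []
    for series in residuals_by_subject:
        for j in range(len(series) - lag):
            first.append(series[j])
            second.append(series[j + lag])
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical residuals: one list per subject, equally spaced visits.
residuals = [
    [1.0, 0.8, 0.5, 0.1],
    [-0.9, -0.7, -0.2, -0.3],
    [0.2, 0.1, -0.1, 0.0],
]
acf = [pooled_autocorrelation(residuals, lag) for lag in range(3)]
print([round(r, 3) for r in acf])
```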
Term
| What is the variogram? |
|
Definition
| An alternative function to describe correlation. It describes the association among repeated observations with irregular observation times. It is more useful in continuous-time situations, where it is hard to compute the ACF directly. |
|
|
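A minimal sketch of an empirical variogram, assuming the usual sample definition: for every pair of residuals on the same subject, take half the squared difference and record it against the time lag between the two observations. The function names and the binning rule are hypothetical choices for illustration.

```python
def empirical_variogram(times_by_subject, residuals_by_subject):
    """Return (lag, half squared difference) for every within-subject pair."""
    points = []
    for times, resid in zip(times_by_subject, residuals_by_subject):
        for j in range(len(times)):
            for k in range(j + 1, len(times)):
                lag = abs(times[k] - times[j])
                half_sq_diff = 0.5 * (resid[k] - resid[j]) ** 2
                points.append((lag, half_sq_diff))
    return points

def binned_mean(points, bin_width):
    """Average the variogram cloud within lag bins of width `bin_width`."""
    bins = {}
    for lag, value in points:
        bins.setdefault(int(lag // bin_width), []).append(value)
    return {key * bin_width: sum(vs) / len(vs) for key, vs in sorted(bins.items())}

# Hypothetical subject with irregular observation times.
cloud = empirical_variogram([[0.0, 1.0, 3.0]], [[1.0, 2.0, 0.0]])
print(sorted(cloud))
print(binned_mean(cloud, bin_width=2.0))
```

Nothing here requires the observation times to be shared across subjects, which is why the variogram suits irregularly spaced data.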
Term
| When studying the correlation, why is standardization different from regression? |
|
Definition
| Regression assumes that the SD of all observations is equal, whereas standardization allows a different SD for each category. |
|
|
Term
| What is the basis for the model of the correlation structure? |
|
Definition
| The model will depend on the ACF |
|
|
Term
| What is a stationary process? |
|
Definition
| Stochastic process whose probability distribution at a fixed time or position is the same for all times or positions. |
|
|
Term
| Why are parametric models useful for covariance structure? |
|
Definition
| Parametric modelling is useful when measurements are not made at a common set of times (unequally spaced measurements). |
|
|
Term
| What are the 3 sources of variation that can be observed when interpreting the variogram? |
|
Definition
1. Random intercept: the difference between individuals (tau squared). 2. Serial process: the difference in the measurements of one individual from one time to the next. 3. Measurement error: the difference between the observed and the true value. All these sources of variation are independent. |
|
|
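The three sources above can be written as one decomposition of the residual. This is a sketch under assumed notation: the subscripted variance symbols are labels chosen for this illustration, not the deck's own symbols, and rho(u) denotes the autocorrelation of the serial process at lag u.

```latex
\begin{align*}
\epsilon_{ij} &= U_i + W_i(t_{ij}) + Z_{ij},
  \qquad U_i,\ W_i(\cdot),\ Z_{ij} \text{ mutually independent} \\
\operatorname{Var}(\epsilon_{ij}) &= \sigma_U^2 + \sigma_W^2 + \sigma_Z^2 \\
\gamma(u) &= \sigma_Z^2 + \sigma_W^2 \{ 1 - \rho(u) \}
\end{align*}
```

Read against a plot of the variogram: the intercept at u = 0 reflects measurement error, the rise with u reflects the serial process, and the gap between the plateau and the total variance reflects the random intercept.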
Term
| What is the assumption regarding variance for the 5 parametric models for the covar structure? |
|
Definition
| Stationary (constant) variance over time. |
|
|
Term
| What are the 4 steps for the model fitting process? |
|
Definition
1. Formulation: EDA of the mean and correlation. 2. Estimation: weighted least squares for beta, maximum likelihood for the covariance parameters alpha; iterate WLS and ML to convergence. 3. Diagnostics: assess the assumptions for the mean and covariance. 4. Inference: calculate CIs and hypothesis tests about the parameters of interest. |
|
|
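The weighted least squares part of the estimation step can be sketched as follows, assuming the standard general linear model notation in which X_i is the design matrix, y_i the response vector, and V_i(alpha) the covariance matrix for subject i. These symbols are assumptions; the deck does not define them.

```latex
\hat{\beta}(\alpha)
  = \Bigl( \sum_{i=1}^{m} X_i^{\top} V_i(\alpha)^{-1} X_i \Bigr)^{-1}
    \sum_{i=1}^{m} X_i^{\top} V_i(\alpha)^{-1} y_i
```

Given a current value of alpha, this gives beta; given beta, the likelihood is maximized over alpha; the two steps alternate until the estimates stop changing.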
Term
| What is done in the formulation step of the model fitting process? |
|
Definition
1. Look at the residuals. 2. Create plots, scatterplot matrices and empirical variograms. 3. Check whether the residuals are stationary. If not, the alternatives are to transform the response or to add a random slope. 4. Once stationarity is achieved, use the empirical variogram to estimate the underlying covariance structure. |
|
|
Term
| What is REML? |
|
Definition
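| Restricted maximum likelihood (REML). Method used to obtain less biased estimates of the variance and covariance parameters than ordinary maximum likelihood, which ignores the degrees of freedom used to estimate the mean. |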
|
|
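A small sketch of the idea in the simplest possible case, assuming independent normal observations with a single unknown mean. There the ML variance estimate divides the sum of squares by n, while the REML estimate divides by n - 1. The sample values are hypothetical.

```python
import statistics

sample = [4.1, 5.3, 3.8, 6.0, 5.1]  # hypothetical measurements
n = len(sample)
mean = statistics.mean(sample)
sum_sq = sum((y - mean) ** 2 for y in sample)

# ML treats the estimated mean as if it were known, so it divides by n.
var_ml = sum_sq / n
# REML accounts for the one degree of freedom spent on the mean.
var_reml = sum_sq / (n - 1)

print(f"ML variance={var_ml:.4f} REML variance={var_reml:.4f}")
```

With p mean parameters instead of one, the same logic leads to dividing by n - p, which is why the gap between ML and REML grows as the mean model gets larger relative to the sample.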
Term
| What is done in the ESTIMATION (fit the model) step of the model fitting process? |
|
Definition
| Restricted maximum likelihood estimation (REML). Once alpha has been estimated, beta and the variance can be estimated. |
|
|
Term
| What is done in the DIAGNOSTICS step of the model fitting process? |
|
Definition
We need to compare the data with the fitted model. 1. Superimpose the fitted mean response on a time plot of the average observed responses within each combination of treatment and time. 2. Superimpose the FITTED variogram on a plot of the empirical variogram. |
|
|
Term
| What is done in the TESTING step of the model fitting process? |
|
Definition
1. Compare nested mean models: Wald test or likelihood ratio test (LRT) using ML, not REML. 2. Compare nested models for the random effects with the LRT (conservative approach). |
|
|
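A minimal sketch of a likelihood ratio test for two nested models that differ by one parameter. The log-likelihood values are hypothetical, and the function name is made up for this example. The chi-square tail probability on one degree of freedom is computed from the complementary error function, so no external library is assumed.

```python
import math

def lrt_one_df(loglik_reduced, loglik_full):
    """LRT statistic and chi-square(1) p-value for two nested models."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # P(chi-square with 1 df > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods from two fitted models.
stat, p_naive = lrt_one_df(loglik_reduced=-512.4, loglik_full=-509.9)

# When the extra parameter is a variance tested at zero (a boundary value),
# the usual argument gives a 50:50 mixture of chi-square(0) and chi-square(1),
# so the naive p-value is conservative and is often halved.
p_boundary = p_naive / 2.0

print(f"LRT={stat:.2f} naive p={p_naive:.4f} boundary-adjusted p={p_boundary:.4f}")
```

Using the unadjusted chi-square(1) p-value for a random effect is the conservative approach named on the card: it rejects less often than the boundary-adjusted version.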