Term
| 5 Properties of OLS estimators |
|
Definition
1) β's are unbiased
2) β's are consistent
3) β's are minimum variance of any unbiased estimator
4) MLE=OLS
5) β's are normally distributed |
|
|
Term
|
Definition
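∑et=0 (the residuals sum to 0 when the model includes an intercept)
∑etxt=0 (the residuals times the x's sum to 0)
In matrix form, X'e=0, or equivalently X'Xβ^=X'Y.
This means the residuals are orthogonal to the regressors, and therefore to the predicted Y's.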
|
|
|
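The orthogonality conditions on this card can be checked numerically. The sketch below fits a one-regressor model with an intercept on made-up data (the numbers and variable names are assumptions for illustration) and computes the three sums that should be zero up to rounding error.

```python
# Hypothetical toy data; any x with some variation would do.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for y = b1 + b2*x + e
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# The normal equations imply all three sums are zero (up to rounding).
sum_e = sum(resid)                                         # sum of residuals
sum_ex = sum(e * xi for e, xi in zip(resid, x))            # residuals times x
sum_e_fitted = sum(e * f for e, f in zip(resid, fitted))   # residuals times fitted values

print(sum_e, sum_ex, sum_e_fitted)
```

The third sum follows from the first two, because each fitted value is a linear combination of the constant and x.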
Term
| Properties of OLS estimators if A1 fails |
|
Definition
1) β's are unbiased
2) β's are consistent
3) β's are NOT minimum variance of any unbiased estimator, but ARE minimum variance of linear unbiased estimators (BLUE)
4) OLS≠MLE
5) NOT normally distributed, but approximately are |
|
|
Term
|
Definition
|
|
Term
|
Definition
|
|
Term
|
Definition
Best linear unbiased estimator.
If A5 holds, it is consistent |
|
|
Term
|
Definition
|
|
Term
|
Definition
lim(n→∞) Bias(β^)=0
lim(n→∞) Var(β^)=0 |
|
|
Term
|
Definition
1-[SSE/(n-k)]/[SST/(n-1)]
This is a measure of the overall goodness of fit. Adding a variable increases the adjusted R² only if the absolute value of that variable's t-statistic is greater than 1 |
|
|
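As a small illustration of the formula on this card, the sketch below computes R² and adjusted R² from SSE and SST. The function names and the example numbers are assumptions; k counts all estimated coefficients including the intercept, matching the n-k degrees of freedom used elsewhere in these cards.

```python
def r2(sse, sst):
    """Ordinary R-squared: 1 - SSE/SST."""
    return 1.0 - sse / sst


def adjusted_r2(sse, sst, n, k):
    """Adjusted R-squared: 1 - [SSE/(n-k)] / [SST/(n-1)]."""
    return 1.0 - (sse / (n - k)) / (sst / (n - 1))


# Hypothetical values: 25 observations, 5 coefficients.
plain = r2(40.0, 100.0)
adjusted = adjusted_r2(40.0, 100.0, 25, 5)
print(plain, adjusted)
```

With more than one coefficient the adjusted value is never above the plain R², because the degrees-of-freedom correction penalizes extra regressors.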
Term
| Asymptotic Distribution (asymptotic chi square) |
|
Definition
| f(θ^) is the asymptotic distribution of θ^ if the exact distribution of θ^ approaches f(θ^) as n increases. The exact distribution can equal the asymptotic distribution. For example, if the errors are not normally distributed, the statistic (θ^-θ)/sθ^ does not have an exact t distribution, but it may still be asymptotically normal. |
|
|
Term
|
Definition
| In time-series data, if the error terms between different times are correlated, this violates A4 and the OLS estimator would not be BLUE. |
|
|
Term
|
Definition
|
Source | Sum of squares | Degrees of freedom | Mean square
Model | SSR | k-1 | SSR/(k-1)
Error | SSE | n-k | SSE/(n-k) = s²
Total | SST | n-1 | SST/(n-1)
|
|
Term
|
Definition
| Variables that take the values of 0 or 1. |
|
|
Term
|
Definition
Tests a joint hypothesis.
[(SSE*-SSE)/r]/[SSE/(n-k)]~F(r, n-k)
where r=(n-k)*-(n-k), the number of restrictions (equal signs in the constraint),
and * denotes the restricted model.
Equivalently, [(R²-R²*)/r]/[(1-R²)/(n-k)]~F(r, n-k) when both models have the same dependent variable |
|
|
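A minimal sketch of the joint-hypothesis F statistic, computed both from sums of squared errors and from R² values. The function names and inputs are assumptions for illustration. The R² version puts the unrestricted R² first so the statistic is non-negative, and it assumes both models share the same dependent variable (and hence the same SST).

```python
def f_stat_from_sse(sse_restricted, sse_unrestricted, r, n, k):
    """[(SSE* - SSE)/r] / [SSE/(n-k)], where * marks the restricted model."""
    return ((sse_restricted - sse_unrestricted) / r) / (sse_unrestricted / (n - k))


def f_stat_from_r2(r2_restricted, r2_unrestricted, r, n, k):
    """[(R2 - R2*)/r] / [(1 - R2)/(n-k)], valid when both models share the same SST."""
    return ((r2_unrestricted - r2_restricted) / r) / ((1.0 - r2_unrestricted) / (n - k))


# Hypothetical numbers: SST = 100, 2 restrictions, 25 observations, 5 coefficients.
f_sse = f_stat_from_sse(50.0, 40.0, 2, 25, 5)
f_r2 = f_stat_from_r2(0.5, 0.6, 2, 25, 5)
print(f_sse, f_r2)
```

The two forms agree because dividing every sum of squares by the common SST turns SSE into 1-R². The result would then be compared with an F(r, n-k) critical value.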
Term
|
Definition
| A rule used to construct a random interval so that a certain percentage of all data sets, determined by the confidence level, yields an interval that contains the population value. CI for θ=(θ^-t·sθ^, θ^+t·sθ^), where sθ^ is the standard error of θ^ and t is the critical value of the t distribution for the chosen confidence level. |
|
|
Term
|
Definition
| a data set corresponding to sampling a population at a given point in time |
|
|
Term
|
Definition
| With binary variables, you can only include an intercept if you leave out one of the binary categories. (If a variable can be 0, 1, or 2, you can either include two of those categories with an intercept or all three with no intercept.) |
|
|
Term
|
Definition
| Minimum variance estimator. |
|
|
Term
|
Definition
| When an explanatory variable is correlated with the error term. This causes OLS to be biased and inconsistent. |
|
|
Term
|
Definition
| Can test multiple hypotheses at once. The most common use is to test that all variables together have no explanatory power. The test is [SSR/(k-1)]/[SSE/(n-k)] or [R²/(k-1)]/[(1-R²)/(n-k)]~F(k-1, n-k) |
|
|
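The overall-significance statistic on this card can be computed from either the ANOVA quantities or R². The sketch below uses hypothetical function names and numbers and is only meant to show that the two expressions coincide.

```python
def overall_f_from_anova(ssr, sse, n, k):
    """[SSR/(k-1)] / [SSE/(n-k)]: model mean square over error mean square."""
    return (ssr / (k - 1)) / (sse / (n - k))


def overall_f_from_r2(r2, n, k):
    """[R2/(k-1)] / [(1-R2)/(n-k)]."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


# Hypothetical values: SSR = 60, SSE = 40 (so R2 = 0.6), n = 25, k = 5.
f_anova = overall_f_from_anova(60.0, 40.0, 25, 5)
f_r2 = overall_f_from_r2(0.6, 25, 5)
print(f_anova, f_r2)
```

The statistic would be compared with an F(k-1, n-k) critical value; looking that value up is outside this sketch.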
Term
|
Definition
| Expresses the price or ln(price) of something in terms of its attributes, e.g. price of a home=f(square footage, bedrooms, bathrooms, age, size,...) |
|
|
Term
|
Definition
| Violation of A3, meaning that Var(εt)=σt²≠σ². OLS is not BLUE. |
|
|
Term
| 3 Characteristics of Instrumental Variables |
|
Definition
1) variable does not appear in equation
2) uncorrelated with error
3) correlated with endogenous regressor |
|
|
Term
| Instrumental Variables Explanation |
|
Definition
Used to solve the endogenous regressor problem. Use an instrumental variable Z that satisfies the 3 characteristics; the resulting IV estimator is then consistent.
IV=OLS if Z=X |
|
|
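For a single regressor and a single instrument, the IV slope can be written as Cov(Z, Y)/Cov(Z, X). The sketch below uses that simple case (an assumption; the cards do not state which form is intended) with made-up data and hypothetical helper names, and shows that choosing Z=X reproduces the OLS slope.

```python
def cov(a, b):
    """Sample covariance (1/n version; the divisor cancels in the ratios below)."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n


def ols_slope(x, y):
    """OLS slope: Cov(X, Y) / Var(X)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple IV slope: Cov(Z, Y) / Cov(Z, X); needs Cov(Z, X) != 0."""
    return cov(z, y) / cov(z, x)


# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
z = [0.0, 1.0, 0.0, 1.0, 1.0]

print(ols_slope(x, y), iv_slope(z, x, y), iv_slope(x, x, y))
```

The division by Cov(Z, X) is where the third characteristic matters: an instrument that is uncorrelated with the endogenous regressor makes the ratio undefined or very unstable.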
Term
|
Definition
| A regressor where two explanatory variables are multiplied together. The marginal impact of one independent variable depends on another explanatory variable. |
|
|
Term
|
Definition
E[(Y-μ)⁴]/σ⁴
This is a measure of peakedness and tail thickness. The kurtosis of a normal distribution is 3. |
|
|
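A sample analogue of the kurtosis formula replaces the expectations with averages. The sketch below is one simple version using 1/n moments (an assumption; other conventions apply small-sample corrections), and the function name is hypothetical.

```python
def kurtosis(y):
    """Sample analogue of E[(Y-mu)^4] / sigma^4 using 1/n moments."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n
    fourth = sum((v - mu) ** 4 for v in y) / n
    return fourth / var ** 2


# A two-point distribution at -1 and +1 has the smallest possible kurtosis, 1.
print(kurtosis([-1.0, 1.0, -1.0, 1.0]))
```

Values above 3 indicate thicker tails than the normal distribution and values below 3 indicate thinner tails.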
Term
|
Definition
| Tests the validity of a joint hypothesis. LR=2(l-l*)~χ²(r) asymptotically, where l and l* are the unrestricted and restricted log-likelihoods and r is the number of equal signs in the hypothesis. |
|
|
Term
|
Definition
| When Y is a binary dependent variable and the model is estimated with OLS. There are 3 problems with this: 1) it violates A1, 2) it violates A3, 3) predictions can fall outside [0, 1]. |
|
|
Term
|
Definition
Model for binary dependent variables. Pr(Y=1|X)=the integral from -∞ to Xβ of f(s)ds=F(Xβ), where f(s) is the logistic pdf, e^s/(1+e^s)².
The marginal effect of Xi is βi·f(Xβ), the derivative of F(Xβ) with respect to Xi.
|
|
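The logit probability and its marginal effect can be evaluated directly. The sketch below uses hypothetical function names; note that the marginal effect uses the density f rather than the CDF F, because it is the derivative of F(Xβ) with respect to one regressor.

```python
import math


def logistic_cdf(s):
    """F(s) = 1 / (1 + e^(-s)), the logit probability Pr(Y=1|X) at s = X*beta."""
    return 1.0 / (1.0 + math.exp(-s))


def logistic_pdf(s):
    """f(s) = e^s / (1 + e^s)^2, the derivative of F(s)."""
    e = math.exp(s)
    return e / (1.0 + e) ** 2


def logit_marginal_effect(beta_i, xb):
    """Change in Pr(Y=1|X) per unit change in X_i, evaluated at index xb."""
    return beta_i * logistic_pdf(xb)


# Hypothetical coefficient of 2 evaluated at an index of 0, where f is at its maximum of 0.25.
print(logistic_cdf(0.0), logit_marginal_effect(2.0, 0.0))
```

Because f depends on Xβ, the marginal effect differs across observations; it is largest where the predicted probability is 0.5.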
Term
|
Definition
| E(Y)=μ meaning the measure of central tendency. |
|
|
Term
|
Definition
| Non-zero correlations between the explanatory variables in a model. Greater collinearity makes the estimators less precise, which often shows up as a significant F-test alongside insignificant t-tests. |
|
|
Term
|
Definition
| Time-series data. Repeated cross-sectional data over time. Balanced if the same number of observations appears in each time period. |
|
|
Term
|
Definition
| Probability Density Function. Satisfies the conditions that f(s)≥0 and that the integral of f(s) from -∞ to ∞ is 1. Also, Pr(a<Y<b)=integral from a to b of f(s)ds=F(b)-F(a), where F is the CDF. |
|
|
Term
|
Definition
| Binary dependent variable model. Pr(Y=1|X)=integral from -∞ to Xβ of f(s)ds=F(Xβ), where f(s)=e^(-s²/2)/√(2π) is the standard normal pdf. The marginal effect of Xi is βi·f(Xβ). |
|
|
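The same calculation for the standard normal case can be done with the standard library's error function. This is a minimal sketch with hypothetical function names; as in any index model, the marginal effect is the coefficient times the density evaluated at Xβ.

```python
import math


def normal_pdf(s):
    """Standard normal density: e^(-s^2/2) / sqrt(2*pi)."""
    return math.exp(-s * s / 2.0) / math.sqrt(2.0 * math.pi)


def normal_cdf(s):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))


def probit_marginal_effect(beta_i, xb):
    """Change in Pr(Y=1|X) per unit change in X_i, evaluated at index xb."""
    return beta_i * normal_pdf(xb)


# Hypothetical coefficient of 1 evaluated at an index of 0.
print(normal_cdf(0.0), probit_marginal_effect(1.0, 0.0))
```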
Term
|
Definition
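| Probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one observed. Smaller p-values are stronger evidence against the null; reject it when the p-value is below the chosen significance level. |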
|
|
Term
| Qualitative Response Model |
|
Definition
|
|
Term
|
Definition
| SSR/SST or 1-SSE/SST. It is the fraction of the variation in the dependent variable that is explained by the independent variables. |
|
|
Term
|
Definition
| Representation of a model in which each endogenous (dependent) variable is expressed in terms of the exogenous explanatory variables only. |
|
|
Term
|
Definition
E[(Y-μ)³]/σ³
Measures symmetry or asymmetry. Positive skewness means a long, thick tail to the right. |
|
|
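A sample analogue of the skewness formula, using 1/n moments (an assumption; some software applies small-sample corrections). The function name and the data are hypothetical.

```python
def skewness(y):
    """Sample analogue of E[(Y-mu)^3] / sigma^3 using 1/n moments."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n
    third = sum((v - mu) ** 3 for v in y) / n
    return third / var ** 1.5


symmetric = skewness([-2.0, -1.0, 0.0, 1.0, 2.0])   # symmetric around its mean
right_tail = skewness([0.0, 0.0, 0.0, 4.0])         # one large value on the right
print(symmetric, right_tail)
```

The symmetric sample gives zero, and the sample with a single large value on the right gives a positive number, matching the sign convention on the card.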
Term
|
Definition
| data mining. No theory behind it so it is a bad word in economics. |
|
|
Term
| Structural Representation of a Model |
|
Definition
| This is a mathematical representation of the relationship between economic variables implied by economic theory and may include endogenous regressors (endogenous variables) along with exogenous variables on the right hand side of the equations. |
|
|
Term
|
Definition
| Tests the hypothesis that a coefficient is equal to a number. The test is (θ^-θ)/sθ^~t(n-k). Also used to test that a linear combination of coefficients is equal to a number. |
|
|
Term
|
Definition
If Y~N[μy, Σy] then Z=AY~N[Aμy, AΣyA']
We use this to determine the OLS distribution, predictions, and linear combinations of estimators. |
|
|
Term
| Variance Inflation Factor |
|
Definition
1/(1-ρi.)
Measures collinearity. ρi. is defined as the R² obtained from regressing the ith X on the other X's. |
|
|
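With only two regressors, the auxiliary R² is just the squared correlation between them, so the variance inflation factor is easy to compute by hand. The sketch below covers that two-regressor case only (an assumption made to keep it dependency-free); the helper names and data are hypothetical.

```python
def r_squared(y, x):
    """R-squared from regressing y on x with an intercept (the squared correlation)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def vif(r2_aux):
    """Variance inflation factor: 1 / (1 - auxiliary R-squared)."""
    return 1.0 / (1.0 - r2_aux)


# Hypothetical regressors that move almost together.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1]
vif_x1 = vif(r_squared(x1, x2))
print(vif_x1)
```

A factor of 1 means the regressor is uncorrelated with the others; the factor grows without bound as the auxiliary R² approaches 1.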
Term
| estimator and distribution of β1 |
|
Definition
Ybar-β2^Xbar
~N[β1, σ²/n + Xbar²σ²/∑(Xt-Xbar)²] |
|
|
Term
| Estimator and Distribution of β2^ |
|
Definition
Cov(X, Y)/Var(X)
~N[β2, σ²/∑(Xt-Xbar)²]
|
|
|
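The intercept and slope formulas on the two cards above can be put together in a few lines. In the sketch below, σ² is replaced by the estimate s²=SSE/(n-2), so the returned variances are estimates rather than the true variances; the function name and the dictionary keys are assumptions for illustration.

```python
def simple_ols(x, y):
    """Two-coefficient OLS fit with estimated variances of the intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b2 = sxy / sxx                # slope: Cov(X, Y) / Var(X)
    b1 = y_bar - b2 * x_bar       # intercept: Ybar - b2 * Xbar

    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)            # estimate of sigma^2

    return {
        "b1": b1,
        "b2": b2,
        "s2": s2,
        "var_b1": s2 * (1.0 / n + x_bar ** 2 / sxx),
        "var_b2": s2 / sxx,
    }


# Hypothetical data for illustration only.
fit = simple_ols([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
print(fit)
```

Both variances shrink as ∑(Xt-Xbar)² grows, which is why more spread in X gives more precise estimates.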
Term
| Estimator and Distribution of β (Matrix Form) |
|
Definition
(X'X)⁻¹X'Y
~N[β, σ²(X'X)⁻¹] |
|
|
Term
|
Definition
σ²/n + (Xt-Xbar)²σ²/∑(Xt-Xbar)²
|
|
|
Term
|
Definition
|
|
Term
|
Definition
| (n-k)s²β/σ²β~χ²(n-k), where s²=SSE/(n-k), which is SSE/(n-2) in the two-coefficient model |
|
|
Term
|
Definition
| Includes lagged dependent variables. Can use the Koyck method for estimating. |
|
|
Term
|
Definition
| Regress Y on lagged Y's and lagged X's. If the coefficients on the lagged X's are jointly nonzero, then X Granger-causes Y. The test can be done using an F or Chow test. |
|
|
Term
|
Definition
| A plot of the correlation between a variable and its own lagged values at different lag lengths. |
|
|
Term
|
Definition
| Tests whether ut in ΔY=μ+ut is homoskedastic and uncorrelated. The null hypothesis is that you should use differences. |
|
|
Term
| Difference in Differences Model |
|
Definition
| Use when examining a policy with two time periods (before and after) and two groups (affected and unaffected). The coefficient on the interaction between the time indicator and the group indicator gives the difference in differences. |
|
|
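The difference-in-differences estimate can be computed from four group means. The sketch below assumes a treated group and a control group, each observed before and after the policy (the group labels and numbers are hypothetical), and also recovers the coefficients of the regression y = a + b·post + c·treated + d·post·treated from those means.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change for the treated group minus change for the control group."""
    return (treat_after - treat_before) - (control_after - control_before)


def saturated_coefficients(treat_before, treat_after, control_before, control_after):
    """Coefficients of y = a + b*post + c*treated + d*post*treated implied by the four means."""
    a = control_before                       # control group, before
    b = control_after - control_before       # time effect
    c = treat_before - control_before        # group effect
    d = diff_in_diff(treat_before, treat_after, control_before, control_after)
    return a, b, c, d


# Hypothetical group means.
a, b, c, d = saturated_coefficients(10.0, 12.0, 8.0, 9.0)
print(a, b, c, d)
```

The interaction coefficient d is the estimate of interest, and a + b + c + d reproduces the treated group's mean after the policy.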
Term
|
Definition
| Time series model that includes lagged explanatory variables. Can use Koyck and PDL to get rid of multicollinearity issues. |
|
|
Term
| Koyck Distributed Lag Model |
|
Definition
| Time series regression with lagged explanatory variables. Additional restriction that βᵢ=λⁱβ₀. |
|
|
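Reading the Koyck restriction as geometrically declining lag weights (the coefficient on lag i equals λ raised to the power i, times the lag-0 coefficient), the weights and their infinite sum can be sketched as follows. The function names and numbers are assumptions, and the closed-form sum relies on |λ|<1.

```python
def koyck_weights(beta0, lam, n_lags):
    """Lag coefficients beta_i = lam**i * beta0 for i = 0..n_lags."""
    return [beta0 * lam ** i for i in range(n_lags + 1)]


def long_run_multiplier(beta0, lam):
    """Sum of all lag coefficients, beta0 / (1 - lam), valid for |lam| < 1."""
    return beta0 / (1.0 - lam)


# Hypothetical values: impact effect of 2, decay rate of 0.5.
weights = koyck_weights(2.0, 0.5, 3)
total = long_run_multiplier(2.0, 0.5)
print(weights, total)
```

Only two parameters describe the whole lag structure, which is how the restriction sidesteps estimating many collinear lag coefficients.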
Term
| Polynomial Distributed Lag Model |
|
Definition
| Time series regression model that includes lagged explanatory variables. Additional restriction that βᵢ=a₀+a₁i+a₂i²+... |
|
|
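Under the polynomial restriction, each lag coefficient is a polynomial in the lag index i. The sketch below evaluates those coefficients for a given list of polynomial parameters; the function name and the example parameters are hypothetical.

```python
def pdl_weights(coeffs, n_lags):
    """Lag coefficients beta_i = a0 + a1*i + a2*i**2 + ... for i = 0..n_lags.

    coeffs holds the polynomial parameters [a0, a1, a2, ...].
    """
    return [
        sum(a * i ** p for p, a in enumerate(coeffs))
        for i in range(n_lags + 1)
    ]


# Hypothetical quadratic: the weights rise at first and then fall.
weights = pdl_weights([1.0, 0.5, -0.25], 3)
print(weights)
```

With a quadratic, three parameters determine every lag coefficient regardless of how many lags are included.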
Term
|
Definition
| A regression model with an integer-valued dependent variable that is distributed as a Poisson variable, with a mean that depends on the explanatory variables (commonly specified so that the log of the mean is a linear function of them). |
|
|
Term
|
Definition
| When variables appear correlated but one has no real explanatory power for the other; the regression is only picking up a common trend. This can be fixed by adding a time variable. |
|
|