New York University/Econometrics I

Econometrics I: Applied Econometrics

Stern School of Business

Professor W. Greene
Department of Economics
Office: MEC 7-90, Ph. 998-0876
e-mail: wgreene@stern.nyu.edu
WWW: http://people.stern.nyu.edu/wgreene

Abstract: This is an intermediate-level Ph.D. course in Applied Econometrics. Topics to be studied include specification, estimation, and inference in the context of models that include and then extend beyond the standard linear multiple regression framework. After a review of the linear model, we will develop the asymptotic distribution theory necessary for robust estimation, inference, and analysis of linear and nonlinear models. We will then turn to instrumental variables, maximum likelihood, generalized method of moments (GMM), and two step estimation methods. Inference techniques used in the linear regression framework, such as t and F tests, will be extended to include Wald, Lagrange multiplier, and likelihood ratio tests, and tests for nonnested hypotheses such as the Hausman specification test. Specific modelling frameworks will include the linear regression model and extensions to models for panel data, multiple equation models, time series models, and models for discrete choice and sample selection.

Prerequisites: Multivariate calculus, matrix algebra, probability and distribution theory, statistical inference, and an introduction to the multiple linear regression model. Appendices A and B in Greene (2017) are assumed. We will survey the parts of Appendix C that would have appeared in prerequisite courses. A significant part of this course will focus on the advanced parts of Appendices C and D. We will also make use of a few of the results in Appendix E (optimization).

Course Requirements: Grades for the course will be based on:

Midterm examination (30%). The midterm examination will be given in class.
Take home final exam (40%)
Several problem sets and small projects (total 30%).

Course Materials:

Text: The required text for the course is Greene, W., Econometric Analysis, 8th Edition, Prentice Hall, 2017. Other texts that might be useful are: Wooldridge, J., Econometric Analysis of Cross Section and Panel Data, 2nd Ed., MIT Press, 2010, which is more advanced than Greene; Wooldridge, J., Introductory Econometrics: A Modern Approach, 5th Edition (or later), Southwestern, 2012 (or later), or Gujarati, D., Basic Econometrics, 4th Edition, McGraw-Hill, 2004, both of which are less advanced. Note: A useful list of errata and comments submitted by readers of Greene is available at the website for the text, http://people.stern.nyu.edu/wgreene/Text/econometricanalysis.htm where there is a button for the errata/discussion list. The appendices on matrix algebra, distribution theory and optimization that were provided at the end of the text in editions 1-7 are now provided online on the website for the text at the URL noted above. There is a button for these as well.

Software: Some of the outside work for this course will involve using a computer. Students may use any computer software that they are familiar with for this purpose. I will provide a copy of NLOGIT to anyone who wishes to use it. Data sets needed for the exercises will be distributed to the class via the course website. The data sets used for the examples in the text are all available in portable format at the text website.

Readings: A few relevant articles from the literature will be suggested (not required). The papers listed are useful pedagogical literature, and students intending to do empirical research for their dissertations will probably find them worthwhile reading. The others are a selection from a huge literature that should be both interesting and accessible to students in this course.

Course Outline: Reading assignments refer to sections in Greene (2017).

I. The Paradigm of Econometrics: [Chapter 1 (pp. 1-8)], Class Notes 1

A. Modeling in economics
B. Econometrics: statistics, economics, mathematics
C. Econometric modeling: understanding, prediction, control
D. Modeling frameworks:

1. Bayesian and Classical (frequentist) approaches [(Optional) Chapter 16, pp. 694-703]
2. Estimation and inference: Nonparametric, semiparametric, parametric [(Optional) Chapter 12, pp. 465-481]

E. Estimation and inference in econometrics, methodological issues (Angrist and Pischke, 2017)

II. The Linear Regression Model. Specification and Computation

A. The conditional mean function [(Optional) Appendix B.1-B.3, B.7-B.8]; Class Notes 2; regression (Waugh)
B. The classical linear regression model and its functional form

1. The linear regression model [Sections 2.1-2.3]
2. Linear models and intrinsic linearity [Sections 2.3, 6.5]
3. Logs and levels, estimating elasticities [Section 6.5]
4. Functional form and linearity. Transformations and dummy variables [Sections 2.3, 6.1-6.2]
5. Linearized regression and Taylor series, linearity in economic modelling

C. Least squares regression [Chapter 3, Sections 3.1 - 3.3]; Class Notes 3; Class Notes 4 (Frisch and Waugh)

1. Least squares regression [Sections 3.1-3.2]
2. Partitioned regression and the Frisch-Waugh theorem [Sections 3.3, 3.4]
3. Application of partitioned regression: a fixed effects model [Section 11.4]

D. Evaluating the fit of the regression, analysis of variance, adjusted R² [Section 3.5]; Class Notes 5
E. Transformed variables. Principal components [Section 4.9.2]
F. Least squares with restrictions [5.3.2 and (5.3.2a,b)]
G. Functional form, dummy variables, difference in differences, regression discontinuity [Sections 6.1-6.4]; Class Notes 6

III. The Linear Regression Model. Statistical Inference in Finite Samples

A. Statistical properties of the least squares estimator in finite samples [Sections 4.1-4.3]; Class Notes 7

1. Why least squares?
2. Sampling distributions [Example 4.1]
3. Expectation and unbiased estimation [Section 4.3]
4. The effects of omitted and superfluous variables - The Omitted Variable Formula (A VIR) [Sections 4.3.2, 4.3.3]
5. Variance of the least squares estimator [Section 4.3.4]
6. The Gauss-Markov theorem [Section 4.3.5]

B. Estimating the variance of the least squares estimator

1. Conventional estimation [Section 4.3.4]
2. Multicollinearity [Section 4.9.1]

C. The sampling distribution of the least squares coefficient vector in finite samples [Section 4.3, Appendix C.4]

1. Generalities about sampling distributions [Appendix C.2-C.4]
2. Sampling distributions and properties of estimators [Appendix C.5, Section 4.3]
3. Linear estimation and normality [Section 4.3.6]
4. Efficient estimation, precision, mean squared error [Section 4.3.5]

D. Statistical Inference in the linear model [Appendix C, Sections 5.2, 5.3]

1. Standard results for testing
2. Structural change [Section 6.6], Model selection [Section 5.8]

IV. Asymptotic Theory

A. Large sample distributions, asymptotic and limiting distributions [Appendix D], Class Notes 8
B. Robust estimation of the covariance matrix [Section 4.5], Class Notes 9

1. Unknown heteroscedasticity
2. Clustering

C. Basic large sample results for the classical model [Section 4.4]
D. Introduction to bootstrapping [Sections 4.5.4, 15.4]; least absolute deviations, quantile regression [Sections 7.3, 12.3.3, 15.4]
E. Large sample results for a function of a statistic - the delta method [Section 4.6]

V. Inference; interval estimation and hypothesis testing

A. Interval estimation [Section 4.7], Class Notes 10
B. Prediction with the regression model [Section 4.8], Class Notes 10; the Oaxaca decomposition [Section 4.7.2]
C. Test procedures [Sections 5.1-5.4], Class Notes 11

1. Finite samples; the t and F statistics [Section 5.3.2]
2. Large sample tests; Wald and Lagrange multiplier [Sections 5.3.1, 5.3.3]
3. Specification test: RESET [Section 5.8.1]
4. Robust inference [Section 5.4]

VI. Endogeneity, Instrumental Variables and Treatment Effects [Sections 8.1 - 8.5], Class Notes 12, Class Notes 13

A. Instrumental variables estimation and measurement error [Sections 8.1 - 8.4]
B. Two stage least squares [Section 8.4]
C. Using control functions [Section 8.4.2]
D. Treatment effects, matching [Section 8.5]
E. The Hausman and Wu specification tests [Section 8.6.3]
F. Weak Instruments [Section 8.7]
G. Natural Experiments and Causal Effects [Section 8.10]

MIDTERM

VII. The Generalized Regression Model, Class Notes 14

A. Nonspherical disturbances [Section 9.1]

1. General formulation [Sections 9.1-9.3]
2. Heteroscedasticity [Sections 9.5-9.7] (Harvey)

B. Implications for least squares [Sections 9.2, 9.3]

1. Robust covariance matrix estimation [Sections 4.5, 9.2] (White, Newey/West)
2. Bootstrapping
3. Clustering

C. Testing for nonspherical disturbances [Section 9.6]
D. Generalized least squares and weighted least squares [Section 9.5]
E. Two step feasible GLS estimation, familiar applications [Sections 9.4, 9.7.1]
F. Applications of two step, feasible GLS estimation

1. Seemingly Unrelated Regressions [Sections 10.1-10.3]
2. Autocorrelation [(Optional) Sections 20.1-20.9]
3. Demand System [Section 10.3]

VIII. Techniques for Analyzing Panel Data, Class Notes 15, Class Notes 16

A. Traditional Models: Fixed and Random Effects [Sections 11.1-11.6]
B. Robust inference [Sections 11.3-11.5]
C. Random Parameters and Latent Class Models [Sections 11.10, 14.15, 15.7-15.8]
D. Treatment Effects and Difference in Differences [Section 6.3]
E. Parameter Heterogeneity [Section 11.10]
F. Endogeneity and treatment effects [Section 11.8]

IX. Two Step Estimation, Class Notes 17

Two step estimation [Sections 19.4, 14.7] (Heckman, Murphy/Topel, Terza et al.)

X. Nonlinear Regression Models [Sections 7.1 - 7.2], Class Notes 18

A. Nonlinear regression and nonlinear least squares [Sections 7.1-7.2]
B. Partial Effects, the delta method [Sections 7.2, 4.6]

XI. Methods of Estimation

A. Maximum likelihood estimation [Chapter 14] (Harvey), Class Notes 19

1. Computation [Appendices E.2, E.3]
2. Covariance matrix estimation [Section 14.4.6]
3. Likelihood ratio, Lagrange multiplier tests [Section 14.6]
4. Binary Choice, Loglinear Models [Sections 17.1 - 17.6], Class Notes 20
5. Poisson regression, stochastic frontier, sample selection [Sections 18.4, 19.2.5, 19.4]

B. Generalized method of moments (GMM) estimation [Sections 13.1-13.4] (Newey/West), Class Notes 21

1. Minimum distance estimation
2. Dynamic panel data models [Sections 11.8.3 - 11.8.5]

XII. Basic Time Series Methods [Sections 20.1, 20.2, 20.5, Chapter 21], Class Notes 22

XIII. Monte Carlo Methods

A. Simulation based classical estimation [Chapter 15, Sections 12.4.1, 12.3.4], Class Notes 23
B. Bayesian inference and estimation [Chapter 16], Class Notes 24

Reading List (annotated)

Angrist, J. and J. Pischke, "Undergraduate Econometrics Instruction: Through Our Classes, Darkly," Journal of Economic Perspectives, 31, 2, 2017, pp. 125-144.

Frisch, R., and Waugh, F., "Partial Time Regressions as Compared with Individual Trends," Econometrica, 1, 1933, pp. 387-401. Purely empirical discovery of one of the fundamental pillars of econometrics, the Frisch-Waugh theorem for partitioning a linear projection. Another high water mark in the literature.

Harvey, A., "Estimating Regression Models with Multiplicative Heteroscedasticity," Econometrica, 44, 1976, pp. 461-465. Very general model for heteroscedasticity. A good companion to Breusch and Pagan. Also illustrates an interesting application of Newton's method and the method of scoring for maximum likelihood estimation.

Hausman, J., "Specification Tests in Econometrics," Econometrica, 46, 1978, pp. 1251-1271. Develops the "Hausman Test," a now widely used specification test that gets around the need for nested models imposed by the conventional likelihood, Neyman-Pearson based tests.

Heckman, J., "Sample Selection Bias as a Specification Error," Econometrica, 47, 1979, pp. 153-161. First in a literature on two step estimation of models. A clever application of two step estimation in a model of nonrandom sampling. Began a debate on sample selection models that continues. An interesting illustration of the form that methodological progress takes.

Murphy, K., and Topel, R., "Estimation and Inference in Two Step Econometric Models," Journal of Business and Economic Statistics, 3, 1985, pp. 370-379. Lays out the computations needed for handling two step maximum likelihood or least squares estimation. A now standard result. Applications becoming increasingly common. Worth reading.

Newey, W., and West, K., "A Simple, Positive Semi-definite, Heteroscedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, 1987, pp. 703-708. The canonical presentation of one of the most important tools in the applied econometrician's toolkit. Generalizes White's estimator, and makes feasible many GMM estimators in time series settings.

Terza, J., A. Basu and P. Rathouz, "Two Stage Residual Inclusion Estimation: Addressing Endogeneity in Health Econometric Modeling," Journal of Health Economics, 27, 2008, pp. 531-543.

Waugh, F., "The Place of Least Squares in Econometrics," Econometrica, 29, 1961, pp. 386-396. Historical piece. Argues that OLS, which at that time was becoming "old fashioned" and ordinary, was underappreciated in economics and produced important results. Sounds like he was about 50 years before his time.

White, H., "A Heteroscedasticity-Consistent Covariance Matrix Estimator and Direct Test for Heteroscedasticity," Econometrica, 48, 1980, pp. 817-838. The White estimator for unknown heteroscedasticity. Remarkably simple yet powerful estimator. A major step toward robust estimation in econometrics. Very important paper. (Unfortunately) not simple reading.
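
Illustration (not part of the course materials): two results annotated in the reading list, the Frisch-Waugh theorem and the White heteroscedasticity-consistent covariance matrix, can be checked numerically in a few lines. The sketch below is a minimal example under stated assumptions: it uses Python with NumPy rather than NLOGIT, the data are simulated, and the helper names (ols, white_cov) and the data generating process are invented for the illustration. The White estimator shown is the basic form without a degrees of freedom correction.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients and residuals for y = Xb + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def white_cov(X, e):
    """Basic White estimator: (X'X)^-1 [sum_i e_i^2 x_i x_i'] (X'X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = (X * e[:, None] ** 2).T @ X
    return XtX_inv @ meat @ XtX_inv

# Simulated data (hypothetical): the disturbance variance depends on x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
eps = rng.normal(size=n) * np.exp(0.5 * x1)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + eps

# Full regression of y on [1, x1, x2].
X = np.column_stack([np.ones(n), x1, x2])
b_full, e_full = ols(X, y)

# Frisch-Waugh: partial [1, x1] out of both y and x2, then regress
# the residuals of y on the residuals of x2.
X1 = X[:, :2]
_, y_res = ols(X1, y)
_, x2_res = ols(X1, x2)
b_fw, e_fw = ols(x2_res[:, None], y_res)

# Conventional s^2 (X'X)^-1 versus the White estimator.
V_conv = (e_full @ e_full / (n - X.shape[1])) * np.linalg.inv(X.T @ X)
V_white = white_cov(X, e_full)

print("coefficient on x2, full regression:", b_full[2])
print("coefficient on x2, Frisch-Waugh:   ", b_fw[0])
print("conventional std. errors:", np.sqrt(np.diag(V_conv)))
print("White std. errors:       ", np.sqrt(np.diag(V_white)))
```

The coefficient on x2 and the residuals from the partialled-out regression coincide with those from the full regression, which is the content of the Frisch-Waugh theorem. The two sets of standard errors will generally differ when the disturbances are heteroscedastic, as they are by construction here.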