10 Regression with Panel Data
Regression using panel data may mitigate omitted variable bias when there is no information on variables that correlate with both the regressors of interest and the dependent variable, provided that these variables are constant over time or across entities. When panel data is available, panel regression methods may improve upon multiple regression models which, as discussed in Chapter 9, produce results that are not internally valid in such a setting.
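To see why panel data can help, consider a minimal sketch for the case of an unobserved variable that varies across entities but not over time. The symbols below are illustrative assumptions and not notation fixed by this introduction: $Y_{it}$ is the outcome for entity $i$ in period $t$, $X_{it}$ is the regressor of interest, and $Z_i$ is the unobserved, time-invariant variable.

```latex
\begin{align*}
  Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}, \\
  Y_{i2} - Y_{i1} &= \beta_1 \left( X_{i2} - X_{i1} \right) + \left( u_{i2} - u_{i1} \right).
\end{align*}
```

Subtracting the first-period equation from the second-period equation removes both the intercept and $Z_i$, so $\beta_1$ can be estimated from the changes without observing $Z_i$.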
This chapter covers the following topics:
- notation for panel data
- fixed effects regression using time and/or entity fixed effects
- computation of standard errors in fixed effects regression models
Following the book, the applications use the dataset Fatalities from the AER package (Kleiber and Zeileis 2020), a panel dataset reporting annual state-level observations on U.S. traffic fatalities for the period 1982 through 1988. The applications analyze whether alcohol taxes and drunk driving laws affect road fatalities and, if so, how strong these effects are.
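As a quick orientation, the following sketch loads the dataset and inspects its panel structure. It assumes that the AER package is installed. The column names used (state, year, fatal, pop, beertax) are those shipped with Fatalities, and the selection of columns shown is only an illustrative choice.

```r
library(AER)        # provides the Fatalities dataset

data(Fatalities)    # load the data into the workspace

# number of rows (state-year observations) and columns (variables)
dim(Fatalities)

# entities and time periods that make up the panel
length(unique(Fatalities$state))
sort(unique(as.character(Fatalities$year)))

# first rows of a few variables used in the applications
head(Fatalities[, c("state", "year", "fatal", "pop", "beertax")])
```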
We introduce plm(), a convenient R function from the package plm (Croissant, Millo, and Tappe 2020) that enables us to estimate linear panel regression models. Its usage is very similar to that of lm(), which we have used throughout the previous chapters to estimate simple and multiple regression models.
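The similarity between the two functions can be illustrated with a short sketch. It assumes that AER and plm are installed. The variable name fatal_rate and the scaling to fatalities per 10,000 inhabitants are illustrative assumptions made here, and the entity fixed effects specification is just one possible choice of model.

```r
library(AER)   # Fatalities dataset
library(plm)   # panel regression via plm()

data(Fatalities)

# illustrative outcome: traffic fatalities per 10,000 inhabitants
Fatalities$fatal_rate <- Fatalities$fatal / Fatalities$pop * 10000

# pooled regression with lm(), ignoring the panel structure
fatal_lm <- lm(fatal_rate ~ beertax, data = Fatalities)

# same formula with plm(): 'index' names the entity and time identifiers,
# model = "within" requests the fixed effects (within) estimator
fatal_fe <- plm(fatal_rate ~ beertax,
                data = Fatalities,
                index = c("state", "year"),
                model = "within")

coef(fatal_lm)
coef(fatal_fe)
```

Apart from the index and model arguments, the call to plm() mirrors the call to lm(). The within model reports no intercept because it is absorbed by the entity fixed effects.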
The packages AER, plm, and stargazer and their dependencies are needed to reproduce the code chunks presented throughout this chapter on your computer.
Check whether the following code chunk runs without any errors.
library(AER)
library(plm)
library(stargazer)
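If one of the library() calls fails, the package is most likely not installed. The following sketch reports which of the three packages are missing. The variable names pkgs and missing_pkgs are illustrative, and the installation step is left commented out so that nothing is installed without your consent.

```r
# packages used in this chapter
pkgs <- c("AER", "plm", "stargazer")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All required packages are available.")
}
```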