
## 15.4 HAC Standard Errors

The error term \(u_t\) in the distributed lag model (15.2) may be serially correlated due to serially correlated determinants of \(Y_t\) that are not included as regressors. As long as these omitted factors are uncorrelated with the included regressors, serially correlated errors do not violate the exogeneity assumption, so the OLS estimator remains unbiased and consistent.

However, autocorrelated errors render the usual homoskedasticity-only *and* heteroskedasticity-robust standard errors invalid and may result in misleading inference. HAC standard errors are a remedy.

### Key Concept 15.2

### HAC Standard Errors

**Problem**:

If the error term \(u_t\) in the distributed lag model (15.2) is serially correlated, statistical inference that rests on usual (heteroskedasticity-robust) standard errors can be strongly misleading.

**Solution**:

Heteroskedasticity- and autocorrelation-consistent (HAC) estimators of the variance-covariance matrix circumvent this issue. There are `R` functions like `vcovHAC()` from the package `sandwich` which are convenient for computation of such estimators.

`sandwich` also contains the function `NeweyWest()`, an implementation of the HAC variance-covariance estimator proposed by Newey and West (1987).
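As a minimal illustration of the `vcovHAC()` route (the data and model here are invented for demonstration and are not from the book), a HAC variance-covariance estimate can be passed to `coeftest()` from the package `lmtest`:

```
# illustrative only: HAC inference with vcovHAC() on simulated data
library(sandwich)
library(lmtest)

set.seed(42)
x <- rnorm(100)
u <- as.numeric(arima.sim(n = 100, model = list(ar = 0.5)))  # AR(1) errors
y <- 1 + 2 * x + u

fit <- lm(y ~ x)

# t-tests based on the HAC variance-covariance matrix
coeftest(fit, vcov = vcovHAC(fit))
```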

Consider the distributed lag regression model with no lags and a single regressor \(X_t\)
\[\begin{align*}
Y_t = \beta_0 + \beta_1 X_t + u_t
\end{align*}\]
with autocorrelated errors. A brief derivation of
\[\begin{align}
\overset{\sim}{\sigma}^2_{\widehat{\beta}_1} = \widehat{\sigma}^2_{\widehat{\beta}_1} \widehat{f}_t \tag{15.4}
\end{align}\]
the so-called *Newey-West variance estimator* for the variance of the OLS estimator of \(\beta_1\) is presented in Chapter 15.4 of the book. \(\widehat{\sigma}^2_{\widehat{\beta}_1}\) in (15.4) is the heteroskedasticity-robust variance estimate of \(\widehat{\beta}_1\) and
\[\begin{align}
\widehat{f}_t = 1 + 2 \sum_{j=1}^{m-1} \left(\frac{m-j}{m}\right) \overset{\sim}{\rho}_j \tag{15.5}
\end{align}\]
is a correction factor that adjusts for serially correlated errors and involves estimates of \(m-1\) autocorrelation coefficients \(\overset{\sim}{\rho}_j\). As it turns out, using the sample autocorrelations as implemented in `acf()` to estimate the autocorrelation coefficients renders (15.4) inconsistent; see pp. 650-651 of the book for a detailed argument. Therefore, we use a somewhat different estimator. For a time series \(X\) we have \[ \overset{\sim}{\rho}_j = \frac{\sum_{t=j+1}^T \hat v_t \hat v_{t-j}}{\sum_{t=1}^T \hat v_t^2}, \ \text{with} \ \hat v_t = (X_t-\overline{X}) \hat u_t. \] We implement this estimator in the function `acf_c()` below.
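To see what the estimator does, \(\overset{\sim}{\rho}_j\) can also be computed directly from the sum definition. The series `v` in this sketch is an arbitrary simulated MA(1) process, not the \(\hat v_t\) from the regression application:

```
# sketch: rho tilde for j = 1, computed directly from the sum definition
set.seed(1)
v <- as.numeric(arima.sim(n = 200, model = list(ma = 0.5)))

j <- 1
rho_tilde_1 <- sum(v[(j + 1):200] * v[1:(200 - j)]) / sum(v^2)

# the population lag-1 autocorrelation of this MA(1) process is
# 0.5 / (1 + 0.5^2) = 0.4
rho_tilde_1
```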

\(m\) in (15.5) is a truncation parameter to be chosen. A rule of thumb for choosing \(m\) is \[\begin{align} m = \left \lceil{0.75 \cdot T^{1/3}}\right\rceil. \tag{15.6} \end{align}\]
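Evaluating (15.6) for a few sample sizes shows how slowly the truncation parameter grows (this snippet is illustrative and not from the book; note that the simulation code below rounds down via `floor()` instead of rounding up):

```
# rule-of-thumb truncation parameter (15.6) for several sample sizes
T_obs <- c(50, 100, 500)
ceiling(0.75 * T_obs^(1/3))
#> [1] 3 4 6
```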

We simulate a time series that, as stated above, follows a distributed lag model with autocorrelated errors and then show how to compute the Newey-West HAC estimate of \(SE(\widehat{\beta}_1)\) using `R`. This is done via two separate but, as we will see, identical approaches: at first we follow the derivation presented in the book step-by-step and compute the estimate “manually”. We then show that the result is exactly the estimate obtained when using the function `NeweyWest()`.

```
# load the package 'quantmod', which provides Lag()
library(quantmod)

# function that computes rho tilde
acf_c <- function(x, j) {
  return(
    t(x[-c(1:j)]) %*% na.omit(Lag(x, j)) / t(x) %*% x
  )
}

# simulate time series with serially correlated errors
set.seed(1)

N <- 100

eps <- arima.sim(n = N, model = list(ma = 0.5))
X <- runif(N, 1, 10)
Y <- 0.5 * X + eps

# compute OLS residuals
res <- lm(Y ~ X)$res

# compute v
v <- (X - mean(X)) * res

# compute robust estimate of beta_1 variance
var_beta_hat <- 1/N * (1/(N-2) * sum((X - mean(X))^2 * res^2)) /
                (1/N * sum((X - mean(X))^2))^2

# rule of thumb truncation parameter
m <- floor(0.75 * N^(1/3))

# compute correction factor
f_hat_T <- 1 + 2 * sum(
  (m - 1:(m-1))/m * sapply(1:(m - 1), function(i) acf_c(x = v, j = i))
)

# compute Newey-West HAC estimate of the standard error
sqrt(var_beta_hat * f_hat_T)
#> [1] 0.04036208
```

For the code to be reusable in other applications, we use `sapply()` to estimate the \(m-1\) autocorrelations \(\overset{\sim}{\rho}_j\).

```
# compute the estimate using NeweyWest() from the package 'sandwich'
library(sandwich)

NW_VCOV <- NeweyWest(lm(Y ~ X),
                     lag = m - 1, prewhite = F,
                     adjust = T)

# compute standard error
sqrt(diag(NW_VCOV))[2]
#>          X
#> 0.04036208
```

By choosing `lag = m-1` we ensure that the maximum order of autocorrelations used is \(m-1\) — just as in equation (15.5). Notice that we set the arguments `prewhite = F` and `adjust = T` to ensure that the formula (15.4) is used and finite sample adjustments are made.

We find that the computed standard errors coincide. Of course, a variance-covariance matrix estimate as computed by `NeweyWest()` can be supplied as the argument `vcov` in `coeftest()`, so that the latter reports HAC \(t\)-statistics and \(p\)-values.

```
# load the package 'lmtest', which provides coeftest()
library(lmtest)

example_mod <- lm(Y ~ X)
coeftest(example_mod, vcov = NW_VCOV)
#> 
#> t test of coefficients:
#> 
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 0.542310   0.235423  2.3036  0.02336 *  
#> X           0.423305   0.040362 10.4877  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

### References

Newey, Whitney K., and Kenneth D. West. 1987. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” *Econometrica* 55 (3): 703–08.