
## 4.6 Exercises

#### 1. Class Sizes and Test Scores

A researcher wants to analyze the relationship between class size (measured by the student-teacher ratio) and the average test score. To this end, the researcher measures both variables in \(10\) different classes and obtains the following results.

| Class Size | 23 | 19 | 30 | 22 | 23 | 29 | 35 | 36 | 33 | 25 |
|---|---|---|---|---|---|---|---|---|---|---|
| Test Score | 430 | 430 | 333 | 410 | 390 | 377 | 325 | 310 | 328 | 375 |

**Instructions:**

- Create the vectors `cs` (the class size) and `ts` (the test score), containing the observations above.
- Draw a scatterplot of the results using `plot()`.

#### 2. Mean, Variance, Covariance and Correlation

The vectors `cs` and `ts` are available in the working environment (you can check this: type their names into the console and press enter).

**Instructions:**

- Compute the mean, the sample variance and the sample standard deviation of `ts`.
- Compute the covariance and the correlation coefficient for `ts` and `cs`.

**Hint:** Use the `R` functions presented in this chapter: `mean()`, `sd()`, `cov()`, `cor()` and `var()`.
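As a rough illustration of how these functions behave, the sketch below applies them to two made-up vectors. The names `x` and `y` and their values are hypothetical and are not the exercise data.

```r
# Hypothetical toy data (not the class size / test score data)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

mean(y)     # sample mean
var(y)      # sample variance (divides by n - 1)
sd(y)       # sample standard deviation, i.e. sqrt(var(y))
cov(x, y)   # sample covariance of x and y
cor(x, y)   # correlation, i.e. cov(x, y) / (sd(x) * sd(y))
```

Note that `var()`, `sd()` and `cov()` all use the \(n-1\) denominator, so they return the sample versions of these quantities.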

#### 3. Simple Linear Regression

The vectors `cs` and `ts` are available in the working environment.

**Instructions:**

- The function `lm()` is provided by the `stats` package, which R attaches by default. Still, attach the package `AER` using `library()`.
- Use `lm()` to estimate the regression model \[TestScore_i = \beta_0 + \beta_1 STR_i + u_i.\] Assign the result to `mod`.
- Obtain a statistical summary of the model.
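The general pattern of fitting and summarizing a simple regression might look like the following sketch. It uses made-up vectors `x` and `y` and a hypothetical object name `fit`, so it shows the mechanics only and not the exercise result.

```r
# Hypothetical toy data (not the exercise data)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Regress y on x; the formula y ~ x includes an intercept by default
fit <- lm(y ~ x)

summary(fit)   # coefficient table, R^2, residual standard error
coef(fit)      # estimated intercept and slope
```

The formula interface reads as "dependent variable `~` regressor", so the order of the two variables matters.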

#### 4. The Model Object

Let us see how an object of class `lm` is structured.

The vectors `cs` and `ts` as well as the model object `mod` from the previous exercise are available in your workspace.

**Instructions:**

- Use `class()` to learn about the class of the object `mod`.
- `mod` is an object of type `list` with named entries. Check this using the function `is.list()`.
- See what information you can obtain from `mod` using `names()`.
- Read out an arbitrary entry of the object `mod` using the `$` operator.
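The inspection steps above can be sketched on a toy model. The vectors `x`, `y` and the object `fit` are hypothetical stand-ins, not the objects from the exercise.

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

class(fit)         # "lm"
is.list(fit)       # TRUE: an lm object is a list with a class attribute
names(fit)         # names of the list entries
fit$coefficients   # one entry, accessed with the $ operator
```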

#### 5. Plotting the Regression Line

You are provided with the code for the scatterplot in `script.R`.

**Instructions:**

Add the regression line to the scatterplot from Exercise 1.

The object `mod` is available in your working environment.

**Hint:** Use the function `abline()`.
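A minimal sketch of how `abline()` can overlay a fitted line, assuming made-up data `x`, `y` and a hypothetical model object `fit`:

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

plot(x, y)                 # the scatterplot must exist first
abline(fit, col = "red")   # adds the line implied by the coefficients of fit
```

`abline()` only draws on an existing plot, which is why `plot()` has to be called first.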

#### 6. Summary of a Model Object

Now read out and store some of the information that is contained in the output of `summary()`.

**Instructions:**

- Assign the output of `summary(mod)` to the variable `s`.
- Check the entry names of the object `s`.
- Create a new variable `R2` and assign it the \(R^2\) of the regression.

The object `mod` is available in your working environment.

#### 7. Estimated Coefficients

The function `summary()` also provides information on the statistical significance of the estimated coefficients.

**Instructions:**

Extract the named \(2\times4\) matrix with estimated coefficients, standard errors, \(t\)-statistics and corresponding \(p\)-values from the model summary `s`. Save this matrix in an object named `coefs`.

The objects `mod` and `s` are available in your working environment.
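How a summary object can be taken apart is sketched below on a toy model. All names (`x`, `y`, `fit`, `s_toy`, `coefs_toy`) are hypothetical and chosen so as not to clash with the exercise objects.

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

s_toy <- summary(fit)
names(s_toy)        # entries of the summary list
s_toy$r.squared     # the R^2 of the regression

coefs_toy <- s_toy$coefficients
dim(coefs_toy)        # 2 rows (intercept, slope), 4 columns
colnames(coefs_toy)   # estimate, standard error, t value, p-value
```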

#### 8. Dropping the Intercept

So far, we have estimated regression models consisting of an intercept and a single regressor. In this exercise you will learn how to specify and estimate a regression model without an intercept.

Note that excluding the intercept from a regression model can be a dubious practice in some applications, because it forces the conditional expectation of the dependent variable to be zero whenever the regressor is zero.

**Instructions:**

- Figure out how the `formula` argument must be specified for a regression of `ts` solely on `cs`, i.e., a regression without intercept. Google is your friend!
- Estimate the regression model without intercept and store the result in `mod_ni`.

The vectors `cs`, `ts` and the model object `mod` from previous exercises are available in the working environment.
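One common way to drop the intercept in an R formula is sketched here on made-up data. The names `x`, `y` and `fit_ni` are hypothetical.

```r
# Hypothetical toy data (not the exercise data)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# "- 1" removes the intercept; "y ~ 0 + x" is an equivalent spelling
fit_ni <- lm(y ~ x - 1)

coef(fit_ni)            # a single slope coefficient
sum(x * y) / sum(x^2)   # closed-form OLS slope through the origin
```

The last line computes the least squares slope for a line through the origin by hand, which should agree with the fitted coefficient.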

#### 9. Regression Output: No Constant Case

In Exercise 8 you have estimated a model without intercept. The estimated regression function is

\[\widehat{TestScore} = \underset{(1.36)}{12.65} \times STR.\]

**Instructions:**

Convince yourself that everything is as stated above: extract the coefficient matrix from the summary of `mod_ni` and store it in a variable named `coef`.

The vectors `cs`, `ts` as well as the model object `mod_ni` from the previous exercise are available in your working environment.

**Hint:** An entry of a named list can be accessed using the `$` operator.

#### 10. Regression Output: No Constant Case — Ctd.

In Exercises 8 and 9 you have dealt with a model without intercept. The estimated regression function was

\[\widehat{TestScore_i} = \underset{(1.36)}{12.65} \times STR_i.\]

The coefficient matrix `coef` from Exercise 9 contains the estimated coefficient on \(STR\), its standard error, the \(t\)-statistic of the significance test and the corresponding \(p\)-value.

**Instructions:**

- Print the contents of `coef` to the console.
- Convince yourself that the reported \(t\)-statistic is correct: use the entries of `coef` to compute the \(t\)-statistic and save it to `t_stat`.

The matrix `coef` from the previous exercise is available in your working environment.

**Hints:**

- `X[a,b]` returns the `[a,b]` element of the matrix `X`.
- The \(t\)-statistic for a test of the hypothesis \(H_0: \beta_1 = 0\) is computed as \[t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}.\]
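The two hints can be combined as in the following sketch. It uses a toy model with an intercept, so the slope sits in the second row of the matrix. The names `x`, `y`, `cm` and `t_manual` are hypothetical.

```r
# Hypothetical toy data; cm is the coefficient matrix of a toy model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
cm <- summary(lm(y ~ x))$coefficients

# Row 2 is the slope; column 1 is the estimate, column 2 the standard error
t_manual <- cm[2, 1] / cm[2, 2]

# Column 3 holds the t-statistic reported by summary()
all.equal(t_manual, cm[2, 3])
```

In a model without an intercept the matrix has a single row, so the row index would be 1 instead.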

#### 11. Two Regressions, One Plot

The two estimated regression models from the previous exercises are

\[\widehat{TestScore_i} = \underset{(1.36)}{12.65} \times STR_i\]

and

\[\widehat{TestScore_i} = \underset{(23.96)}{567.4272} \underset{(0.85)}{-7.1501} \times STR_i.\]

You are provided with the code line `plot(cs, ts)` which creates a scatterplot of `ts` and `cs`. Note that this line must be executed before calling `abline()`! To tell the regression lines apart, you may color them by passing, e.g., `col = "red"` or `col = "blue"` as an additional argument to `abline()`.

The vectors `cs` and `ts` as well as the list objects `mod` and `mod_ni` from previous exercises are available in your working environment.

**Instructions:**

Generate a scatterplot of `ts` and `cs` and add the estimated regression lines of `mod` and `mod_ni`.

#### 12. \(TSS\) and \(SSR\)

If graphical inspection does not help, researchers resort to analytic techniques in order to detect if a model fits the data at hand well or better than another model.

Let us go back to the simple regression model including an intercept. The estimated regression line for `mod` was

\[\widehat{TestScore_i} = 567.43 - 7.15 \times STR_i, \, R^2 = 0.8976, \, SER=15.19.\]

You can check this as `mod` and the vectors `cs` and `ts` are available in your working environment.

**Instructions:**

- Compute \(SSR\), the sum of squared residuals, and save it to `ssr`.
- Compute \(TSS\), the total sum of squares, and save it to `tss`.
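Both sums follow directly from their definitions, \(SSR = \sum_i \widehat{u}_i^2\) and \(TSS = \sum_i (Y_i - \overline{Y})^2\). A sketch on made-up data, with hypothetical names `x`, `y`, `fit`, `ssr_toy` and `tss_toy`:

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

ssr_toy <- sum(residuals(fit)^2)   # sum of squared residuals
tss_toy <- sum((y - mean(y))^2)    # total sum of squares

c(SSR = ssr_toy, TSS = tss_toy)
```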

#### 13. The \(R^2\) of a Regression Model

The \(R^2\) of the regression saved in `mod` is \(0.8976\). You can check this by executing `summary(mod)$r.squared` in the console.

Remember the formula of \(R^2\):

\[R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}\]

The objects `mod`, `tss` and `ssr` from the previous exercise are available in your working environment.

**Instructions:**

- Use `ssr` and `tss` to compute \(R^2\) manually. *Round* the result to *four* decimal places and save it to `R2`.
- Use the logical operator `==` to check whether your result matches the value mentioned above.

**Hint:**

You may round numeric values using the function `round()`.
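The reason for rounding before comparing with `==` is that the reference value is itself reported to four decimals. A sketch on made-up data (all names are hypothetical):

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

ssr_toy <- sum(residuals(fit)^2)
tss_toy <- sum((y - mean(y))^2)

# R^2 = 1 - SSR / TSS, rounded to four decimal places
R2_toy <- round(1 - ssr_toy / tss_toy, 4)

# Compare with the value from summary(), rounded in the same way
R2_toy == round(summary(fit)$r.squared, 4)
```

For comparisons of unrounded floating point numbers, `all.equal()` is usually the safer choice than `==`.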

#### 14. The Standard Error of the Regression

The standard error of the regression in the simple regression model is \[SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \widehat{u}_i^2} = \sqrt{\frac{SSR}{n-2}}.\] \(SER\) measures the size of an average residual, which is an estimate of the magnitude of a typical regression error.

The model object `mod` and the vectors `cs` and `ts` are available in your workspace.

**Instructions:**

- Use `summary()` to obtain the \(SER\) for the regression of `ts` on `cs` saved in the model object `mod`. Save the result in the variable `SER`.
- Use `SER` to compute the \(SSR\) and store it in `SSR`.
- Check that `SSR` is indeed the \(SSR\) by comparing `SSR` to the result of `sum(mod$residuals^2)`.
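Solving \(SER = \sqrt{SSR/(n-2)}\) for \(SSR\) gives \(SSR = SER^2 \cdot (n-2)\). The sketch below checks this on made-up data. The names are hypothetical, and the entry `sigma` of the summary list is where `summary()` stores the residual standard error.

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)

ser_toy <- summary(fit)$sigma   # residual standard error (SER)
n <- length(y)

# Invert SER = sqrt(SSR / (n - 2)) to recover the SSR
ssr_from_ser <- ser_toy^2 * (n - 2)

all.equal(ssr_from_ser, sum(fit$residuals^2))
```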

#### 15. The Estimated Covariance Matrix

As discussed in Chapter 4.4, the OLS estimators \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\) are functions of the random error term. Therefore, they are random variables themselves. For two or more random variables, their covariances and variances are summarized by a *variance-covariance matrix* (which is often simply called the *covariance matrix*). Taking the square root of the diagonal elements of the estimated covariance matrix yields \(SE(\widehat\beta_0)\) and \(SE(\widehat\beta_1)\), the standard errors of \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\).

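`summary()` computes an estimate of this matrix. The respective entry in the output of `summary()` (remember that `summary()` produces a list) is called `cov.unscaled`. Note that `cov.unscaled` is the unscaled matrix \((X'X)^{-1}\): it must be multiplied by the squared residual standard error, stored in the entry `sigma`, to obtain the estimated covariance matrix. The model object `mod` is available in your workspace.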

**Instructions:**

- Use `summary()` to obtain the covariance matrix estimate for the regression of test scores on student-teacher ratios stored in the model object `mod`. Save the result to `cov_matrix`.
- Obtain the diagonal elements of `cov_matrix`, compute their square root and assign the result to the variable `SEs`.

**Hint:** `diag(A)` returns a vector containing the diagonal elements of the matrix `A`.
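A point worth checking on a toy model: the entry `cov.unscaled` holds \((X'X)^{-1}\), and scaling it by `sigma^2` gives the estimated covariance matrix that `vcov()` returns. The sketch below uses made-up data and hypothetical names (`x`, `y`, `fit`, `s_toy`, `V`, `ses`).

```r
# Hypothetical toy data and model
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
fit <- lm(y ~ x)
s_toy <- summary(fit)

# Scale (X'X)^{-1} by the squared residual standard error
V <- s_toy$sigma^2 * s_toy$cov.unscaled
all.equal(V, vcov(fit))

# Square roots of the diagonal are the coefficient standard errors
ses <- sqrt(diag(V))
all.equal(ses, s_toy$coefficients[, "Std. Error"])
```

Taking `sqrt(diag())` of the unscaled matrix alone would give the standard errors divided by `sigma`, so the scaling step matters when comparing against the coefficient table.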