1 Introduction


Interest in the freely available statistical programming language and software environment R (R Core Team 2023) is soaring. By the time we wrote the first drafts for this project, more than 11000 add-ons (many of them providing cutting-edge methods) had been made available on the Comprehensive R Archive Network (CRAN), an extensive network of FTP servers around the world that store identical and up-to-date versions of R code and its documentation. R dominates other (commercial) software for statistical computing in most fields of research in applied statistics. Being freely available and open source, and having a large and constantly growing community of users who contribute to CRAN, renders R more and more appealing for empirical economists and econometricians alike.

A striking advantage of using R in econometrics is that it enables students to explicitly document their analysis step by step such that it is easy to update and expand. This makes it possible to reuse code for similar applications with different data. Furthermore, R programs are fully reproducible, which makes it straightforward for others to comprehend and validate results.

In recent years, R has thus become an integral part of the curricula of the econometrics classes we teach at the University of Duisburg-Essen. In some sense, learning to code is comparable to learning a foreign language, and continuous practice is essential for success. Needless to say, presenting bare R code on slides does not encourage students to gain hands-on experience on their own. This is why hands-on practice with R is crucial.

As for accompanying literature, there are some excellent books that deal with R and its applications to econometrics, e.g., Kleiber and Zeileis (2008). However, such sources may be somewhat beyond the scope of undergraduate students in economics who have little understanding of econometric methods and barely any programming experience. Consequently, we started to compile a collection of reproducible reports for use in class. These reports provide guidance on how to implement selected applications from the textbook Introduction to Econometrics (Stock and Watson 2015), which serves as a basis for the lecture and the accompanying tutorials.

This process was facilitated considerably by knitr (Xie 2023b) and R markdown (Allaire et al. 2023). In conjunction, the two R packages provide powerful functionality for dynamic report generation, which makes it possible to seamlessly combine plain text, LaTeX, R code and its output in a variety of formats, including PDF and HTML. Moreover, writing and distributing reproducible reports for use in academia has been enriched tremendously by the bookdown package (Xie 2023a), which has become our main tool for this project. bookdown builds on top of R markdown and makes it possible to create appealing HTML pages like this one, among other things. Inspired by Using R for Introductory Econometrics (Heiss 2016)¹ and with this powerful toolkit at hand, we wrote up our own empirical companion to Stock and Watson (2015). The result, which you have started to look at, is Introduction to Econometrics with R.
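To give a rough idea of what dynamic report generation means, the following minimal sketch lets knitr process a tiny document that mixes plain text with one R code chunk. The object names `fence`, `doc` and `out` are our own illustrative choices, and the sketch assumes that the knitr package is installed; it is not the setup used to build this book.

```r
# Three backticks mark the start and end of a code chunk in R Markdown
fence <- strrep("`", 3)

# A tiny document held in a character vector:
# one line of plain text followed by one R code chunk
doc <- c(
  "The mean of the numbers 1 to 10 is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(1:10)",
  fence
)

# Let knitr run the chunk and weave its output into the text
# (only if knitr is installed)
if (requireNamespace("knitr", quietly = TRUE)) {
  out <- knitr::knit(text = doc, quiet = TRUE)
  cat(out)
}
```

The result printed by `cat(out)` contains the original sentence, the code, and the value that R computed for it, all in one document.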

Like the book by Heiss (2016), this project is neither a comprehensive econometrics textbook nor intended to be a general introduction to R. We feel that Stock and Watson did a great job of explaining the intuition and theory of econometrics, and at any rate a better one than we could in yet another introductory textbook! Introduction to Econometrics with R is best described as an interactive script in the style of a reproducible research report, which aims to provide students with a platform-independent e-learning arrangement by seamlessly intertwining theoretical core knowledge and empirical skills in undergraduate econometrics. Of course, the focus is on empirical applications with R. We leave out derivations and proofs wherever we can. Our goal is not only to enable students to learn how results of case studies can be replicated with R but also to strengthen their ability to use the newly acquired skills in other empirical applications, immediately within Introduction to Econometrics with R.

To realize this, each chapter contains interactive R programming exercises. These exercises are used as supplements to code chunks that display how previously discussed techniques can be implemented within R. They are generated using the DataCamp light widget and are backed by an R session which is maintained on DataCamp’s servers. You may play around with the example exercise presented below.

As you can see above, the widget consists of two tabs. script.R mimics an .R file, a file format that is commonly used for storing R code. Lines starting with a # are commented out, that is, they are not recognized as code. Furthermore, script.R works like an exercise sheet where you may write down the solution you come up with. If you hit the button Run, the code will be executed, submission correctness tests will be run and you will be notified whether your approach is correct. If it is not correct, you will receive feedback suggesting improvements or hints. The other tab, R Console, is a fully functional R console that can be used for trying out solutions to exercises before submitting them. Of course, you may submit (almost any) R code and use the console to play around and explore. Simply type a command and hit the Enter key on your keyboard.

Looking at the widget above, you will notice that there is a > in the right panel (in the console). This symbol is called the “prompt” and indicates that the user can enter code that will be executed. To avoid confusion, we will not show this symbol in this book. Output produced by R code is commented out with #>.
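These conventions can be illustrated with a minimal sketch. The variable name `x` and the numbers are our own illustrative choices and are not taken from the widget.

```r
# Lines starting with # are comments: R ignores them when the code is run.
# In the console you would type the next line after the > prompt,
# which is not shown in this book.
x <- 2 + 3

# Typing the name of a variable prints its content;
# the output is displayed behind #>
x
#> [1] 5
```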

Most commonly, we display R code together with the generated output in code chunks. As an example, consider the line of code presented in the chunk below. It tells R to compute the number of packages available on CRAN. The code chunk is followed by the output produced.

# check the number of R packages available on CRAN
nrow(available.packages(repos = "http://cran.us.r-project.org"))
#> [1] 20396

Each code chunk is equipped with a button on the outer right-hand side which copies the code to your clipboard. This makes it convenient to work with larger code segments in your version of R/RStudio or in the widgets presented throughout the book. In the widget above, you may click on R Console and type nrow(available.packages(repos = "http://cran.us.r-project.org")) (the command from the code chunk above) and execute it by hitting Enter on your keyboard.²
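To see why `nrow()` yields the package count, note that `available.packages()` returns a character matrix with one row per package, so counting rows amounts to counting packages. The sketch below mimics this offline with a tiny hand-made matrix. The object name `pkgs` is our own choice and the version strings are placeholders, not actual CRAN data.

```r
# A stand-in for the matrix returned by available.packages():
# one row per package, columns holding fields such as Package and Version.
# The version strings are made up for illustration.
pkgs <- matrix(
  c("knitr",     "0.0.1",
    "rmarkdown", "0.0.2",
    "bookdown",  "0.0.3"),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("knitr", "rmarkdown", "bookdown"),
                  c("Package", "Version"))
)

# nrow() counts the rows, i.e., the number of packages listed
nrow(pkgs)
#> [1] 3

# With an internet connection, the same idea applies to the real index:
# cran <- available.packages(repos = "http://cran.us.r-project.org")
# nrow(cran)
```

Because packages are added to and removed from CRAN all the time, the number you obtain from the real index will most likely differ from the output shown in the code chunk above.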

Note that some lines in the widget are commented out. They ask you to assign a numeric value to a variable and then to print the variable’s content to the console. You may enter your solution approach in script.R and hit the button Run in order to get the feedback described further above. In case you do not know how to solve this sample exercise (don’t panic, that is probably why you are reading this), a click on Hint will provide you with some advice. If you still can’t find a solution, a click on Solution will provide you with another tab, Solution.R, which contains sample solution code. It will often be the case that exercises can be solved in many different ways, and Solution.R will present what we consider comprehensible and idiomatic.
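A solution to an exercise of this kind might look like the following sketch. The variable name `my_number` and the value 42 are our own illustrative assumptions and not the widget's actual sample solution.

```r
# Assign a numeric value to a variable
# (the name my_number and the value 42 are hypothetical)
my_number <- 42

# Print the variable's content to the console
print(my_number)
#> [1] 42
```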

References

Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2023. rmarkdown: Dynamic Documents for R (version 2.25). https://github.com/rstudio/rmarkdown.
Heiss, Florian. 2016. Using R for Introductory Econometrics. CreateSpace Independent Publishing Platform. http://www.urfie.net/.
Kleiber, C., and A. Zeileis. 2008. Applied Econometrics with R. Springer.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Stock, J. H., and M. W. Watson. 2015. Introduction to Econometrics, Third Update, Global Edition. Pearson Education Limited.
Wooldridge, Jeffrey. 2016. Introductory Econometrics. 6th ed. Cengage Learning.
Xie, Yihui. 2023a. bookdown: Authoring Books and Technical Documents with R Markdown (version 0.36). https://github.com/rstudio/bookdown.
———. 2023b. knitr: A General-Purpose Package for Dynamic Report Generation in R (version 1.45). https://yihui.org/knitr/.

  1. Heiss (2016) builds on the popular Introductory Econometrics (Wooldridge 2016) and demonstrates how to replicate the applications discussed therein using R.

  2. The R session is initialized by clicking into the widget. This might take a few seconds. Just wait for the indicator next to the button Run to turn green.