Preface

Prediction v.s. Causal Inference

It is critical to understand the distinctions between prediction and causal inference for anybody who is interested in using machine learning (ML) methods. This is because a method designed to for the former objective may not work well for the latter objective, and vice versa.

Important
  • Prediction: it aims at predicting accurately the level of the variable of interest (the dependent variable) well based on explanatory variables.
  • Causal Inference: it aims at predicting the change in the dependent variable when an explanatory variable of interest changes its value.

Examples where prediction matters:

  • prediction of the future price of corn when the modeler is interested in using the predicted price to make money in the futures market
  • prediction of the crop yield by field when the modeler is interested in using the field-level predicted crop yields as an dependent or explanatory variable in a regression analysis (e.g., the impact of weather on crop yield)
  • prediction of what is in the vicinity of a self-driving car (the user)

What is common among these examples is that the users wants to use the level or state of the dependent variable to drive their decisions.

Examples where causal inference matters:

  • understand the impact of a micro-finance program on welfare in developing countries when the modelers is interested in whether they should implement such a program or not (does the benefit of implementing the program worth the cost?). The modelers do not care about what level of welfare people are gonna be at. They care about how much improvement (change) in welfare the program would make.
  • understand the impact of water use limits for farmers on groundwater usage when the modeler (water managers) are interested in predicting how much water use reduction (change) they can expect.
  • understand the impact of fertilizer on yield for when the modelers are interested in identifying the profit-maximizing fertilizer level. The modelers do not care about what the yield levels are going to be at different fertilizer levels. They care about how much yield improvement (change) can be achieved when more fertilizer is applied.

What is common among these examples is that the users wants to use the information about the change in the dependent variable after changing the value of an explanatory variable (implementing a policy) in driving their decisions.

Now, you may think that once you can predict the level of the dependent variable as a function of explanatory variables \(X\), say \(\hat{f}(X)\), where \(\hat{f}(\cdot)\) is the trained model, then you can simply take the difference in the predicted values of the dependent variable evaluated at \(X\) before (\(X_0\)) and after (\(X_1\)) to find the change in the dependent variable caused by the change in \(X\).

\[ \begin{aligned} \hat{f}(X_1) - \hat{f}(X_0) \end{aligned} \]

You are indeed right and you can predict the change in the dependent variable when the value of an explanatory variable changes once the model is trained to predict the level of the dependent variable. However, this way of predicting the impact of \(X\) (the continuous treatment version of the so-called S-learner) is often biased. Instead, (most of) causal machine learning methods razor-focus on directly estimating the change in the dependent variable when the value of an explanatory variable changes and typically performs better.

Traditionally, the vast majority of ML methods focused on prediction, rather than causal inference. It is only recently (I would say around 2015) that academics and practitioners in industry started to realize the limitation of prediction-oriented methods for many of the research and business problems they need to solve. In response, there are now an emerging tide of new kinds of machine learning methods called causal machine learning methods, which focus on causal identification of a treatment (e.g., pricing of a product, policy intervention, etc). The goal of this book is to learn such causal machine learning methods and add them to your econometric tool box for practical applications. This, however, does not mean we do not learn any prediction-oriented (traditional) machine learning methods. Indeed, it is essential to understand them because the prominent causal machine learning methods do use prediction-oriented ML methods in its process as we will see later. It is just that we do not use prediction-oriented ML methods by themselves for the task of identifying the causal impact of a treatment.

R and Python

  • R reticulate package: can call python functions from within R