from econml.dml import DML, LinearDML, SparseLinearDML, NonParamDML, CausalForestDML
from econml.metalearners import TLearner, SLearner, XLearner, DomainAdaptationLearner
from econml.sklearn_extensions.model_selection import GridSearchCVList
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier
from sklearn.linear_model import Lasso
from sklearn.model_selection import RepeatedKFold, train_test_split
from numpy.random import binomial, multivariate_normal, normal, uniform
import numpy as np
#=== ignore warnings ===#
import warnings
"ignore") warnings.filterwarnings(
22 Treatment Effect Estimation
This chapter presents CATE estimation using the econml package (Keith Battocchi 2019). The causalml package by Uber (Chen et al. 2020) is less complete than econml at the moment, and we do not cover it.
Let’s also generate a synthetic dataset using a custom data generating process (DGP).
# Define DGP
def generate_data(n, d, controls_outcome, treatment_effect, propensity):
    """Generates population data for given controls_outcome, treatment_effect and propensity functions.

    Parameters
    ----------
    n (int): population size
    d (int): number of covariates
    controls_outcome (func): untreated outcome conditional on covariates
    treatment_effect (func): treatment effect conditional on covariates
    propensity (func): probability of treatment conditional on covariates
    """
    # Generate covariates
    X = multivariate_normal(np.zeros(d), np.diag(np.ones(d)), n)
    # Generate treatment
    T = np.apply_along_axis(lambda x: binomial(1, propensity(x), 1)[0], 1, X)
    # Calculate outcome
    Y0 = np.apply_along_axis(lambda x: controls_outcome(x), 1, X)
    treat_effect = np.apply_along_axis(lambda x: treatment_effect(x), 1, X)
    Y = Y0 + treat_effect * T
    return (Y, T, X)
# controls outcome, treatment effect, propensity definitions
def generate_controls_outcome(d):
    beta = uniform(-3, 3, d)
    return lambda x: np.dot(x, beta) + normal(0, 1)

treatment_effect = lambda x: (1 if x[1] > 0.1 else 0) * 8
propensity = lambda x: (0.8 if (x[2] > -0.5 and x[2] < 0.5) else 0.2)
# DGP constants and test data
d = 5
n = 1000
n_test = 250
controls_outcome = generate_controls_outcome(d)
X_test = multivariate_normal(np.zeros(d), np.diag(np.ones(d)), n_test)
delta = 6 / n_test
X_test[:, 1] = np.arange(-3, 3, delta)
Y, T, X = generate_data(n, d, controls_outcome, treatment_effect, propensity)
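Since the DGP is known, we can also compute the true treatment effect at each test point. The one-liner below is a small sketch (the name true_te is our own, not from econml) that will be handy for gauging the learners later.
#=== true CATE at the test points (computed from the known DGP) ===#
true_te = np.apply_along_axis(treatment_effect, 1, X_test)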
22.1 Average Treatment Effect
For estimating the average treatment effect, one option is the DoubleML package.
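As a minimal sketch (not run in this chapter; the ml_l and ml_m argument names follow recent doubleml versions and may differ in older ones), the ATE of our binary treatment could be estimated with a partially linear model:
#=== a sketch of ATE estimation with the doubleml package ===#
from doubleml import DoubleMLData, DoubleMLPLR

#=== wrap the arrays in a DoubleMLData object ===#
dml_data = DoubleMLData.from_arrays(X, Y, T)

#=== partially linear regression with boosted nuisance models ===#
dml_plr = DoubleMLPLR(
    dml_data,
    ml_l=GradientBoostingRegressor(),  # E[Y|X]
    ml_m=GradientBoostingClassifier()  # E[T|X]
)
dml_plr.fit()
print(dml_plr.summary)  # ATE estimate with standard error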
22.2 S-, X-, and T-learner
This section shows how to train S-, X-, and T-learners. See Chapter 12 for how these learners work; it will help you understand what you need to specify for each learner.
22.2.1 S-learner
To train an S-learner, you need to specify only one estimator, which estimates \(E[Y|T, X, W]\). This can be done using overall_model in SLearner.
#=== specify the overall model ===#
overall_model = GradientBoostingRegressor(
    n_estimators=100,
    max_depth=6,
    min_samples_leaf=10
)

#=== set up an S-learner ===#
S_learner = SLearner(overall_model=overall_model)

#=== train ===#
S_learner.fit(Y, T, X=X)
<econml.metalearners._metalearners.SLearner object at 0x2fc2c2f10>
Estimate \(\theta(X)\) using the effect method:
S_te = S_learner.effect(X_test)

#=== see the first 10 ===#
print(S_te[:10])
[-0.96345437 0.49912816 0.32140746 0.32493562 0.52117931 0.39061349
1.12985202 0.61109233 -0.05638891 0.74819205]
22.2.2 T-learner
To train a T-learner, you need to specify one estimator, which is used to estimate \(E[Y|T=1, X, W]\) and \(E[Y|T=0, X, W]\) separately. This can be done using models in TLearner.
#=== set up an estimator ===#
models = GradientBoostingRegressor(
    n_estimators=100,
    max_depth=6,
    min_samples_leaf=10
)

#=== set up a T-learner ===#
T_learner = TLearner(models=models)

#=== train ===#
T_learner.fit(Y, T, X=X)
<econml.metalearners._metalearners.TLearner object at 0x2fc3bedc0>
Estimate \(\theta(X)\) using the effect method:
T_te = T_learner.effect(X_test)

#=== see the first 10 ===#
print(T_te[:10])
[1.52115791 2.06799171 0.64311159 1.39796878 1.1590299 1.47958283
2.18855353 1.82882139 1.65287604 2.99781446]
22.2.3 X-learner
To train an X-learner, you need to specify two estimators:
- an estimator that estimates \(E[Y|T=1, X, W]\) and \(E[Y|T=0, X, W]\) separately, just like the T-learner
- an estimator that estimates \(E[T|X, W]\) for weighting by the propensity score

This can be done using models (for the first) and propensity_model (for the second) in XLearner.
#=== set up an estimator for the outcome models ===#
models = GradientBoostingRegressor(
    n_estimators=100,
    max_depth=6,
    min_samples_leaf=10
)

#=== set up a propensity model ===#
propensity_model = RandomForestClassifier(
    n_estimators=100,
    max_depth=6,
    min_samples_leaf=10
)

#=== set up an X-learner ===#
X_learner = XLearner(models=models, propensity_model=propensity_model)

#=== train ===#
X_learner.fit(Y, T, X=X)
<econml.metalearners._metalearners.XLearner object at 0x2fc3dd460>
Estimate \(\theta(X)\) using the effect method:
X_te = X_learner.effect(X_test)

#=== see the first 10 ===#
print(X_te[:10])
[ 0.88547092 2.02973059 0.55806609 -0.98178738 0.37334261 0.97004328
0.79535256 0.38061293 -0.31194151 1.20117012]
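With the true CATE (true_te) computed earlier, a quick accuracy check of the three learners is possible. This comparison is a sketch of ours, not part of the standard econml workflow.
#=== compare the learners against the true CATE (a sketch) ===#
for name, te in [("S-learner", S_te), ("T-learner", T_te), ("X-learner", X_te)]:
    rmse = np.sqrt(np.mean((te - true_te) ** 2))
    print(f"{name}: RMSE = {rmse:.3f}")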
22.3 R-learner
This section shows how to run various estimators that fall under the R-learner, which is implemented as the _Rlearner class in econml. As we saw in Chapter 12, the R-learner is a DML, and econml offers many estimators under _Rlearner:
- DML
- LinearDML
- SparseLinearDML
- KernelDML
- NonParamDML
- CausalForestDML

All the estimators under _Rlearner require that estimators for \(E[Y|X]\) and \(E[T|X]\) be specified. This can be done via model_y for \(E[Y|X]\) and model_t for \(E[T|X]\). However, some estimators require that you specify the final (second-stage) model using model_final, while others do not.
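The first-stage models need not be fixed a priori. econml’s GridSearchCVList (imported at the top of this chapter) can choose among candidate models and hyperparameters by cross-validation; the snippet below is a sketch, and the parameter grids are illustrative choices of ours.
#=== a sketch of first-stage model selection with GridSearchCVList ===#
first_stage = lambda: GridSearchCVList(
    [Lasso(max_iter=10000), GradientBoostingRegressor()],
    param_grid_list=[
        {"alpha": [0.001, 0.01, 0.1, 1]},
        {"n_estimators": [50, 100], "max_depth": [3, 6]},
    ],
    cv=3
)
# e.g., model_y could then be first_stage().fit(X, Y).best_estimator_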
22.3.1 DML
In this example, let’s use gradient boosting regression for both model_y and model_t, and lasso with cross-validation for model_final. Let’s import GradientBoostingRegressor() and LassoCV() from the scikit-learn package.
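The code below is a minimal sketch of what this might look like, mirroring the later examples: since the treatment here is binary, it uses GradientBoostingClassifier() for model_t and sets discrete_treatment=True, and LassoCV(fit_intercept=False) follows econml’s documented usage for the final linear model.
#=== import LassoCV for the final model ===#
from sklearn.linear_model import LassoCV

#=== set up a DML estimator ===#
est = DML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    model_final=LassoCV(fit_intercept=False),  # final model must be linear here
    discrete_treatment=True  # T is binary
)

#=== train and predict theta(X) ===#
est.fit(Y, T, X=X, W=X)
dml_te = est.effect(X_test)
print(dml_te[:5])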
22.3.2 LinearDML
LinearDML is a DML estimator that uses an unregularized linear model in the second stage. That is, it assumes that \(\theta(X)\) in Equation 12.1 can be written as follows:
\[ \begin{aligned} \theta(X) = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_K x_K \end{aligned} \]
It solves the following minimization problem:
\[ \begin{aligned} argmin_{\beta_1, \dots, \beta_K} \sum_{i=1}^N (\tilde{Y}_i - \beta_1 x_{i,1}\cdot \tilde{T}_i - \beta_2 x_{i,2}\cdot \tilde{T}_i - \dots - \beta_K x_{i,K}\cdot \tilde{T}_i)^2 \end{aligned} \]
This can be solved by simply regressing \(\tilde{Y}_i\) on \(x_{i,1}\cdot \tilde{T}_i\) through \(x_{i,K}\cdot \tilde{T}_i\). Once \(\beta_1, \dots, \beta_K\) are estimated, \(\hat{\theta}(X)\) is
\[ \begin{aligned} \hat{\theta}(X) = \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_K x_K \end{aligned} \]
So, it is easy to interpret how \(X\) affects the treatment effect with LinearDML. This estimator should be used only when the number of heterogeneity drivers in \(X\) is small.
Since LinearDML runs a linear-in-parameters model without regularization, you do not need to specify an estimator for the final stage. We use GradientBoostingRegressor() for model_y and GradientBoostingClassifier() for model_t. Let’s set up our LinearDML:
est = LinearDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    discrete_treatment=True  # T is binary
)
We now invoke the fit method. Here, \(W=X\).
est.fit(Y, T, X=X, W=X)
<econml.dml.dml.LinearDML object at 0x2fc410cd0>
Predict \(\theta(X)\) for X_test.
ldml_te = est.effect(X_test)
print(ldml_te[:5])
[-3.72248997 -3.01852346 -3.47501606 -3.59183133 -3.48650655]
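Because the final stage is an unregularized linear model, the fitted \(\hat{\beta}\) can be inspected directly. As a sketch (assuming the default inference settings of LinearDML):
#=== inspect the final-stage coefficients (a sketch) ===#
print(est.coef_)      # estimated beta_1, ..., beta_K
print(est.summary())  # coefficient table with standard errors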
22.3.3 NonParamDML
As the name suggests, NonParamDML runs a non-parametric regression (e.g., random forest) at the second stage. Unlike LinearDML, we need to specify model_final. Internally, it solves the following problem:
\[ \begin{aligned} argmin_{\theta} \sum_{i=1}^N \tilde{T}_i^2\left(\frac{\tilde{Y}_i}{\tilde{T}_i} - \theta(X_i)\right)^2 \end{aligned} \]
That is, the estimator specified for model_final regresses \(\frac{\tilde{Y}_i}{\tilde{T}_i}\) on \(X_i\) with sample weights \(\tilde{T}_i^2\).
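Schematically, the final-stage fit amounts to the following (an illustration of the idea, not econml’s internal code; Y_res and T_res are our stand-ins for the first-stage residuals):
#=== schematic of NonParamDML's final stage ===#
def fit_final(model_final, X, Y_res, T_res):
    # regress Y_res / T_res on X, weighting each observation by T_res**2
    return model_final.fit(X, Y_res / T_res, sample_weight=T_res ** 2)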
Let’s use GradientBoostingRegressor() as the final model.
est = NonParamDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    model_final=GradientBoostingRegressor(),
    discrete_treatment=True  # T is binary
)
We now invoke the fit method. Here, \(W=X\).
est.fit(Y, T, X=X, W=X)
<econml.dml.dml.NonParamDML object at 0x2fc459fd0>
Predict \(\theta(X)\) for X_test.
npdml_te = est.effect(X_test)
print(npdml_te[:5])
[0.66405196 2.14662201 0.45083237 1.34571212 0.88009555]
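The other estimators listed under _Rlearner follow the same pattern. As a minimal sketch of ours (again with discrete_treatment=True since T is binary), CausalForestDML needs no model_final because its second stage is a causal forest:
#=== a sketch of CausalForestDML ===#
est = CausalForestDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    discrete_treatment=True  # T is binary
)
est.fit(Y, T, X=X, W=X)
cf_te = est.effect(X_test)
print(cf_te[:5])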