justcause
example in the documentation is bad practice
The example in the documentation (https://justcause.readthedocs.io/en/latest/) is bad practice, as the output of the linear model is constant (underfit):
>>> from justcause.data.sets import load_ihdp
>>> from justcause.learners import SLearner
>>> from justcause.learners.propensity import estimate_propensities
>>> from justcause.metrics import pehe_score, mean_absolute
>>> from justcause.evaluation import calc_scores
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.linear_model import LinearRegression
>>> import pandas as pd
>>> replications = load_ihdp(select_rep=[0, 1, 2])
>>> slearner = SLearner(LinearRegression())
>>> metrics = [pehe_score, mean_absolute]
>>> scores = []
>>> for rep in replications:
...     train, test = train_test_split(rep, train_size=0.8)
...     p = estimate_propensities(train.np.X, train.np.t)
...     slearner.fit(train.np.X, train.np.t, train.np.y, weights=1/p)
...     pred_ite = slearner.predict_ite(test.np.X, test.np.t, test.np.y)
...     scores.append(calc_scores(test.np.ite, pred_ite, metrics))
...
>>> pd.DataFrame(scores)
   pehe_score  mean_absolute
0    0.998388       0.149710
1    0.790441       0.119423
2    0.894113       0.151275
Looking at pred_ite, its standard deviation is almost zero, so every unit receives practically the same predicted effect and the model has essentially no predictive power for individual effects. The example should therefore at least include an evaluation relative to a dummy model (e.g. a constant predictor).
>>> pred_ite.std()
1.130466570252318e-15
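To illustrate both points, here is a minimal sketch in plain Python (it does not use justcause). It assumes the S-learner simply passes t to the regressor as one more feature, so a linear model has the form y = a + b·x + c·t and the predicted ITE is c for every unit, which would explain a standard deviation at the level of floating-point noise. The coefficient values, the toy ITEs and the helper names `s_learner_ite`, `pehe` and `relative_pehe` are hypothetical. `pehe` is defined here as the root mean squared error between true and predicted ITE, which may not match `pehe_score` exactly.

```python
import math
from statistics import mean


def s_learner_predict(x, t, coef_x, coef_t, intercept):
    """Linear model with the treatment t as one more feature (assumption)."""
    return intercept + sum(b * xi for b, xi in zip(coef_x, x)) + coef_t * t


def s_learner_ite(x, coef_x, coef_t, intercept):
    """Predicted ITE: prediction under t=1 minus prediction under t=0."""
    treated = s_learner_predict(x, 1, coef_x, coef_t, intercept)
    control = s_learner_predict(x, 0, coef_x, coef_t, intercept)
    return treated - control  # the x terms cancel, leaving coef_t


def pehe(true_ite, pred_ite):
    """Root mean squared error between true and predicted ITEs (assumed definition)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(true_ite, pred_ite)))


def relative_pehe(true_ite, pred_ite, train_ate):
    """PEHE of the model divided by the PEHE of a constant predictor.

    A ratio close to (or above) 1.0 means the model is no better than
    predicting the same effect, train_ate, for everyone.
    """
    baseline = [train_ate] * len(true_ite)
    return pehe(true_ite, pred_ite) / pehe(true_ite, baseline)


# Hypothetical coefficients: two very different units get the same ITE.
coef_x, coef_t, intercept = [0.5, -1.2], 4.1, 0.3
ite_a = s_learner_ite([1.0, 2.0], coef_x, coef_t, intercept)
ite_b = s_learner_ite([-3.0, 0.5], coef_x, coef_t, intercept)
print(ite_a, ite_b)  # both equal coef_t up to floating-point rounding

# Hypothetical toy ITEs, not IHDP results.
true_ite = [3.0, 4.0, 5.0, 4.0]
pred_ite = [ite_a] * len(true_ite)
print(relative_pehe(true_ite, pred_ite, train_ate=4.0))
```

Reporting such a ratio next to the raw scores would make it visible when a model only reproduces the average effect.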
There is also a bug: the weight should be t/p + (1-t)/(1-p) rather than just 1/p.
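For concreteness, the proposed weighting could be sketched as follows. `ipw_weights` is a hypothetical helper, not part of justcause, and clipping p away from 0 and 1 is an extra assumption added only to avoid division by zero. With 1/p, a control unit with a high propensity gets a small weight, whereas t/p + (1-t)/(1-p) weights each unit by the inverse probability of the treatment it actually received.

```python
def ipw_weights(t, p, eps=1e-6):
    """Inverse probability weights t/p + (1-t)/(1-p).

    t: iterable of 0/1 treatment indicators
    p: iterable of estimated propensities P(t=1 | x)
    eps: clipping bound for p (an assumption, not from the issue)
    """
    weights = []
    for ti, pi in zip(t, p):
        pi = min(max(pi, eps), 1 - eps)  # keep p strictly inside (0, 1)
        weights.append(ti / pi + (1 - ti) / (1 - pi))
    return weights


# Hypothetical toy values.
t = [1, 0, 1, 0]
p = [0.8, 0.8, 0.25, 0.25]

proposed = ipw_weights(t, p)
current = [1 / pi for pi in p]  # what the documentation example uses

for ti, pi, w_new, w_old in zip(t, p, proposed, current):
    print(f"t={ti} p={pi:.2f} 1/p={w_old:.3f} proposed={w_new:.3f}")
```

The two schemes agree for treated units and differ only for controls, so the effect of the fix depends on how unbalanced the propensities are.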
Hey, thanks for reporting the issue!
Unfortunately, neither @FlorianWilhelm nor I are working on the package anymore. However, if you have a solution for the above bug (by adjusting the weights), feel free to open a pull request with the changes to the docs.
Do you have a source for the new configuration of weights?
Best, Max