doubleml-for-py
Extensions and refinements for the trimming of propensity scores (IRM & IIVM)
Trimming as part of the ML estimation and prediction step
- At the moment the trimming of propensity scores is part of the "score evaluation step", see https://github.com/DoubleML/doubleml-for-py/blob/6ea72e4491bb219fade53b569a199823162fe8d4/doubleml/double_ml_irm.py#L199-L201
- Therefore, the predictions exported via the `predictions` property are not yet trimmed. It would presumably be more reasonable to apply the trimming during the "ML estimation and prediction step". Otherwise, users might question whether the trimming really happens.
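To illustrate the idea, here is a minimal sketch of truncating propensity predictions immediately after they are produced, so that whatever gets stored and exported is already trimmed. The helper name `truncate_propensity`, the example values, and the plain dictionary standing in for the exported predictions are assumptions for illustration only, not the package's actual internals.

```python
import numpy as np


def truncate_propensity(m_hat, trimming_threshold):
    """Hypothetical helper: clip propensity predictions to
    [trimming_threshold, 1 - trimming_threshold] and return a new array."""
    m_hat = np.asarray(m_hat, dtype=float)
    if trimming_threshold <= 0:
        # Nothing to trim; still return a copy so the input is never mutated.
        return m_hat.copy()
    return np.clip(m_hat, trimming_threshold, 1.0 - trimming_threshold)


# Raw propensity predictions as they might come out of the learner (made-up values).
raw_m_hat = np.array([0.001, 0.30, 0.70, 0.999])

# Trim right after prediction, before anything is stored or scored.
trimmed_m_hat = truncate_propensity(raw_m_hat, trimming_threshold=0.01)

# Stand-in for the exported predictions: they now hold the trimmed values.
predictions = {"ml_m": trimmed_m_hat}
```

With this ordering, the score evaluation step would consume the same values that users see in the exported predictions, so there is no discrepancy to question.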
New trimming rule 'discard'
- Currently, we have only implemented the `trimming_rule` `'truncate'`. As an alternative, we also want to offer the `trimming_rule` `'discard'`. For this, we need to find a stable way to exclude observations from subsequent steps. Predictions can simply be set to `np.nan`, and these observations then need to be excluded in subsequent steps. In the repeated cross-fitting case, this can result in a different number of observations being evaluated for different random sample splits. Initially, we might want to avoid these technically challenging cases and only allow `trimming_rule = 'discard'` for `n_rep == 1`.
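A rough sketch of how the `'discard'` rule could look, assuming predictions outside the threshold band are set to `np.nan` and later steps filter on a boolean mask. The function names `discard_propensity` and `check_discard_allowed`, the threshold value, and the example data are hypothetical and only meant to make the proposal concrete.

```python
import numpy as np


def discard_propensity(m_hat, trimming_threshold):
    """Hypothetical helper: mark predictions outside
    [trimming_threshold, 1 - trimming_threshold] as np.nan."""
    m_hat = np.asarray(m_hat, dtype=float).copy()
    outside = (m_hat < trimming_threshold) | (m_hat > 1.0 - trimming_threshold)
    m_hat[outside] = np.nan
    return m_hat


def check_discard_allowed(trimming_rule, n_rep):
    """Hypothetical guard: only allow 'discard' without repeated cross-fitting."""
    if trimming_rule == "discard" and n_rep != 1:
        raise ValueError("trimming_rule='discard' is only supported for n_rep == 1.")


check_discard_allowed("discard", n_rep=1)

# Made-up propensity predictions and outcomes.
m_hat = discard_propensity([0.001, 0.30, 0.70, 0.999], trimming_threshold=0.01)
y = np.array([1.0, 2.0, 3.0, 4.0])

# Subsequent steps only use observations whose prediction was not discarded.
keep = ~np.isnan(m_hat)
y_kept = y[keep]
n_obs_used = int(keep.sum())
```

Because `keep` depends on the predictions of a particular sample split, `n_obs_used` can differ between repetitions, which is the complication the `n_rep == 1` restriction sidesteps.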