doubleml-for-py Extensions and refinements for the trimming of propensity scores (IRM & IIVM)

Open MalteKurz opened this issue 3 years ago • 0 comments

At the moment the trimming of propensity scores is part of the "score evaluation step", see https://github.com/DoubleML/doubleml-for-py/blob/6ea72e4491bb219fade53b569a199823162fe8d4/doubleml/double_ml_irm.py#L199-L201
Therefore, the predictions exported via the property predictions are not yet trimmed. It would presumably be more reasonable to apply the trimming during the "ML estimation and prediction step". Otherwise, users might question whether the trimming really happens.
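To illustrate the idea, here is a minimal sketch of how truncation could be applied directly to the predicted propensity scores, so that whatever is stored and exported is already trimmed. The function name truncate_propensity and the standalone arrays are hypothetical and only for illustration; this is not the package's actual code path.

```python
import numpy as np


def truncate_propensity(m_hat, trimming_threshold=1e-2):
    """Hypothetical helper: clip propensity scores to
    [trimming_threshold, 1 - trimming_threshold]."""
    m_hat = np.asarray(m_hat, dtype=float)
    return np.clip(m_hat, trimming_threshold, 1.0 - trimming_threshold)


# Made-up raw propensity predictions from the nuisance learner
m_hat = np.array([0.001, 0.2, 0.5, 0.995])

# Trim right after prediction, so the stored predictions are already trimmed
m_hat_trimmed = truncate_propensity(m_hat, trimming_threshold=0.01)
```

If the trimmed array is what gets stored, the exported predictions and the values entering the score would coincide by construction.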

Currently, we have only implemented the trimming_rule 'truncate'. As an alternative, we also want to offer the trimming_rule 'discard'. For this, we need to find a stable way to exclude observations from subsequent steps. Predictions can simply be set to np.nan, and all subsequent steps then need to exclude these observations. In the repeated cross-fitting case, this can result in a different number of observations being evaluated for different random sample splits. Initially, we might want to avoid these technically challenging cases and only allow trimming_rule = 'discard' for n_rep == 1.
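A rough sketch of how 'discard' might look, assuming the np.nan marking described above: out-of-range predictions are set to np.nan, later steps use a mask to skip them, and a guard rejects 'discard' for n_rep != 1. The helper names (discard_propensity, check_discard_allowed) and the placeholder score values are hypothetical and not part of the package.

```python
import numpy as np


def discard_propensity(m_hat, trimming_threshold=1e-2):
    """Hypothetical helper: mark propensity scores outside
    [trimming_threshold, 1 - trimming_threshold] as np.nan."""
    m_hat = np.array(m_hat, dtype=float)  # copy, leave the input untouched
    outside = (m_hat < trimming_threshold) | (m_hat > 1.0 - trimming_threshold)
    m_hat[outside] = np.nan
    return m_hat


def check_discard_allowed(trimming_rule, n_rep):
    """Hypothetical guard: only allow 'discard' without repeated cross-fitting."""
    if trimming_rule == 'discard' and n_rep != 1:
        raise ValueError("trimming_rule 'discard' is only supported for n_rep == 1.")


check_discard_allowed('discard', n_rep=1)

m_hat = discard_propensity([0.001, 0.2, 0.5, 0.995], trimming_threshold=0.01)

# Subsequent steps exclude the discarded observations via a mask
keep = ~np.isnan(m_hat)
psi = np.array([1.0, 2.0, 3.0, 4.0])  # placeholder score values, made up
theta = np.mean(psi[keep])            # evaluated on kept observations only
```

With n_rep > 1, the mask keep could differ between sample splits, which is the varying number of observations mentioned above.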

May 25 '21 13:05 MalteKurz