
Extensions and refinements for the trimming of propensity scores (IRM & IIVM)

Open MalteKurz opened this issue 3 years ago • 0 comments

Trimming as part of the ML estimation and prediction step

  • At the moment the trimming of propensity scores is part of the "score evaluation step", see https://github.com/DoubleML/doubleml-for-py/blob/6ea72e4491bb219fade53b569a199823162fe8d4/doubleml/double_ml_irm.py#L199-L201
  • As a result, the predictions exported via the predictions property are not yet trimmed. It would presumably be more sensible to do the trimming during the "ML estimation and prediction step". Otherwise users might question whether the trimming really happens.
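
To illustrate what moving the trimming into the prediction step could look like, here is a minimal sketch of the 'truncate' rule applied directly to predicted propensity scores. The helper name truncate_propensity is hypothetical and not part of the package, and the threshold value is only an example; the point is that whatever gets stored as the prediction is already trimmed.

```python
import numpy as np


def truncate_propensity(m_hat, trimming_threshold=1e-2):
    """Clip predicted propensity scores to [threshold, 1 - threshold].

    Hypothetical helper for illustration; not the package's implementation.
    """
    m_hat = np.asarray(m_hat, dtype=float)
    return np.clip(m_hat, trimming_threshold, 1.0 - trimming_threshold)


# Example: raw predictions from some classifier's predict_proba
m_hat = np.array([0.001, 0.2, 0.5, 0.995])

# Trim right after prediction, so the stored predictions are already trimmed
m_hat_trimmed = truncate_propensity(m_hat, trimming_threshold=0.01)
```

If the trimmed array is what gets stored and exported, users can check directly that no propensity score lies outside the threshold bounds.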

New trimming rule 'discard'

  • Currently, only the trimming_rule 'truncate' is implemented. As an alternative, we also want to offer the trimming_rule 'discard'. For this we need to find a robust way to exclude observations from subsequent steps. Predictions can simply be set to np.nan, and the affected observations then need to be excluded in all subsequent steps. With repeated cross-fitting, this can result in a different number of observations being evaluated for each random sample split. Initially, we might want to avoid these technically challenging cases and only allow trimming_rule = 'discard' for n_rep == 1.
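
A rough sketch of how 'discard' might work for a single sample split (n_rep == 1): predictions outside the threshold bounds are set to np.nan, and later steps drop those observations via a mask. Both helper names and the score values are made up for illustration; this is one possible design under those assumptions, not a proposal for the final API.

```python
import numpy as np


def discard_propensity(m_hat, trimming_threshold=1e-2):
    """Set propensity scores outside [threshold, 1 - threshold] to NaN.

    Hypothetical helper for illustration; not the package's implementation.
    """
    m_hat = np.asarray(m_hat, dtype=float).copy()
    outside = (m_hat < trimming_threshold) | (m_hat > 1.0 - trimming_threshold)
    m_hat[outside] = np.nan
    return m_hat


def mean_score_ignoring_discarded(psi, m_hat):
    """Average score values over the observations that were not discarded."""
    keep = ~np.isnan(m_hat)
    return psi[keep].mean(), int(keep.sum())


m_hat = np.array([0.001, 0.2, 0.5, 0.995])
psi = np.array([10.0, 1.0, 3.0, -10.0])  # made-up score values

m_hat_discarded = discard_propensity(m_hat, trimming_threshold=0.01)
mean_psi, n_used = mean_score_ignoring_discarded(psi, m_hat_discarded)
```

The number of observations actually used (n_used) depends on the predictions, which is why repeated cross-fitting would evaluate a different number of observations per sample split.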

MalteKurz avatar May 25 '21 13:05 MalteKurz