regression discontinuity: allow the treatment to be `>=` or `<=` the threshold

Open • drbenvincent opened this issue 2 years ago • 1 comment

At the moment, the assumption is that the units above the threshold are treated. That will not always be true, so we need to allow treatment to apply to units below the threshold as well.

  • Option 1: set a string, threshold_function='<=' or threshold_function='>='.
  • Option 2: allow users to pass the comparison function itself as a kwarg, e.g. threshold_function=np.greater_equal or threshold_function=np.less_equal.
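The two options are not mutually exclusive. As a minimal sketch (not CausalPy's actual API), a small resolver could accept either form. The name `resolve_threshold_function` and the `_COMPARISONS` table are hypothetical and only illustrate the idea.

```python
import numpy as np

# Hypothetical lookup from the string form (Option 1) to the NumPy
# comparison ufuncs named in Option 2.
_COMPARISONS = {">=": np.greater_equal, "<=": np.less_equal}


def resolve_threshold_function(threshold_function=">="):
    """Return a callable f(x, threshold) that yields a boolean 'treated' mask.

    Accepts either one of the strings '>=' / '<=' or a user-supplied callable.
    """
    if callable(threshold_function):
        # Option 2: the user overrides the comparison directly.
        return threshold_function
    if isinstance(threshold_function, str) and threshold_function in _COMPARISONS:
        # Option 1: the user names the comparison with a string.
        return _COMPARISONS[threshold_function]
    raise ValueError(
        f"threshold_function must be callable or one of {sorted(_COMPARISONS)}, "
        f"got {threshold_function!r}"
    )


# Units at or below the threshold are treated.
below = resolve_threshold_function("<=")
mask = below(np.array([0.2, 0.5, 0.8]), 0.5)
```

Rejecting unknown strings with a `ValueError` is one way the input validation mentioned below could start.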

Do this on the synthetic regression discontinuity datasets, for both PyMC and skl. Append it as another analysis example.

Things to think about:

  • Helper function _is_treated uses np.greater_equal
  • We have a treated column in the dataset. This is somewhat redundant, because all we need is the running variable and the _is_treated helper function. That function exists because we need a way of working out which data are treated when we interpolate for xpred. One solution would be to remove treated as a column of data and instead derive it from the running variable and _is_treated. However, treated still needs to appear in the model formula, so we would have to add some explanatory text in the notebooks.
  • The order of comparison to calculate discontinuity_at_threshold
  • This would be a good opportunity to add some input validation for RD (see #78)
  • Update the integration tests
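To make the first three points concrete, here is a rough sketch, assuming treatment is derived from the running variable alone. It is not the existing CausalPy code: `is_treated` stands in for the `_is_treated` helper, and `predict` and `epsilon` are hypothetical names for a fitted model's prediction function and a small offset either side of the threshold.

```python
import numpy as np


def is_treated(x, threshold, threshold_function=np.greater_equal):
    """Derive the boolean treated indicator from the running variable."""
    return threshold_function(np.asarray(x), threshold)


def discontinuity_at_threshold(
    predict, threshold, threshold_function=np.greater_equal, epsilon=1e-3
):
    """Treated-minus-untreated difference in predictions around the threshold.

    `predict(x, treated)` is a hypothetical stand-in for a fitted model.
    Selecting by the treated mask, rather than by position, keeps the order
    of the comparison correct whichever side of the threshold is treated.
    """
    x = np.array([threshold - epsilon, threshold + epsilon])
    treated = is_treated(x, threshold, threshold_function)
    if treated.sum() != 1:
        raise ValueError("expected exactly one treated point around the threshold")
    y = np.asarray(predict(x, treated))
    return float(y[treated][0] - y[~treated][0])


# Toy model: baseline of 1.0 plus a treatment effect of 2.0, no slope.
def toy_predict(x, treated):
    return 1.0 + 2.0 * treated


effect_above = discontinuity_at_threshold(toy_predict, 0.5, np.greater_equal)
effect_below = discontinuity_at_threshold(toy_predict, 0.5, np.less_equal)
```

With this approach both calls recover the same treatment effect, because the subtraction follows the treated mask instead of assuming the treated units sit above the threshold.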

[Optional] Do we want to add in a shaded region above/below the treatment threshold?

drbenvincent • Oct 21 '22 16:10