Recommended heuristic for integrating "policy tree" with "honest causal forest"

Open njawadekar opened this issue 3 years ago • 5 comments

The below post is more of a methodological question than a technical one.

Based on what I've gathered, the honest causal forest and policy tree are two distinct yet related methods. Both can evidently yield actionable insights on the effects of treatment within a heterogeneous population. However,

  • The honest causal forest estimates conditional average treatment effects across heterogeneous subgroups that are not prespecified
  • The policy tree, by contrast, identifies a data-driven optimal treatment rule from the observed sample

So, while the honest causal forest is more exhaustive (since it produces fairly granular subgroup-specific causal estimates), the policy tree takes more of a "broad brush" approach to these heterogeneities: it identifies a single optimal treatment rule that can be applied to a population when making treatment decisions.

Question: Given that both the honest causal forest and policy tree address similar research objectives (i.e., to help understand heterogeneities that exist in a population, so that we can make better decisions), has your research group developed any standard protocols or heuristics for incorporating the results of the honest causal forest into the inputs of the policy tree model? For example, would it be reasonable to develop a protocol whereby we only input covariates into the policy tree that were listed among the top 10% of the most "Important" variables for heterogeneities within the honest causal forest, or something like that?

njawadekar • Nov 18 '21 01:11

Hi @njawadekar

> For example, would it be reasonable to develop a protocol whereby we only input covariates into the policy tree that were listed among the top 10% of the most "Important" variables for heterogeneities within the honest causal forest

Yes, that's a perfectly fine heuristic. It is suggested in https://github.com/grf-labs/policytree/issues/46 as a way to make a setting with many covariates feasible for policy_tree.
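
To make the selection step of this heuristic concrete, here is a minimal, language-agnostic sketch in plain Python. It is not code from grf or policytree: the helper names `top_percent` and `select_columns` and the importance scores are hypothetical and made up for illustration. In an R workflow the scores would presumably come from the fitted causal forest (for example via grf's `variable_importance()`), and the reduced covariate matrix would then be passed to `policy_tree`.

```python
def top_percent(importance, percent=10):
    """Return the sorted column indices of the top `percent`% of variables.

    `importance` is a list of non-negative scores, one per covariate.
    At least one variable is always kept.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    p = len(importance)
    # Integer ceiling of percent * p / 100, avoiding floating-point rounding.
    k = max(1, (percent * p + 99) // 100)
    ranked = sorted(range(p), key=lambda j: importance[j], reverse=True)
    return sorted(ranked[:k])


def select_columns(rows, keep):
    """Keep only the columns in `keep` from a row-major covariate matrix."""
    return [[row[j] for j in keep] for row in rows]


# Hypothetical importance scores for 20 covariates (illustrative only).
importance = [0.02, 0.31, 0.01, 0.05, 0.18, 0.03, 0.02, 0.04, 0.01, 0.06,
              0.02, 0.03, 0.01, 0.02, 0.05, 0.04, 0.03, 0.02, 0.03, 0.02]

keep = top_percent(importance, percent=10)  # 10% of 20 covariates -> 2 columns
print(keep)  # [1, 4]
```

The 10% cutoff is the threshold proposed in the question, not a recommended default; the depth of the tree and the number of covariates that remain feasible would have to be judged for the data at hand.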

Our research group (@halflearned) has been working on an online tutorial for ML-based HTE estimation; you might find the section on policy learning useful: https://bookdown.org/stanfordgsbsilab/tutorial/policy-learning-i-binary-treatment.html

erikcs • Nov 18 '21 22:11

Also, if you're looking for a real-world empirical application, @hhsievertsen has a paper that uses a causal forest together with a policy tree: https://github.com/hhsievertsen/hhsievertsen.github.io/raw/master/mat/wp/chx_sep2021.pdf

erikcs • Nov 26 '21 04:11