Stefan Wager
Interesting -- it looks like that package uses the "old" causal forest. The grf package has a substantially different implementation of causal forests than we had in the first JASA...
Do you have 2 time periods or many? In the case of 2 time periods, the easiest is to just run causal forests as-is on first differences Y_{i,post} - Y_{i,pre}....
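For the two-period case, the first-differencing step can be sketched as below. This is a minimal illustration in plain Python, not grf code: the toy data and the names `first_differences`, `y_pre`, `y_post`, and `w` are hypothetical. The differenced outcome is what would then be passed to the causal forest as its outcome variable.

```python
from statistics import mean


def first_differences(y_pre, y_post):
    """Return Y_{i,post} - Y_{i,pre} for each unit i."""
    if len(y_pre) != len(y_post):
        raise ValueError("y_pre and y_post must have one entry per unit")
    return [post - pre for pre, post in zip(y_pre, y_post)]


# Hypothetical toy panel: four units, two periods, binary treatment w.
y_pre = [1.0, 2.0, 3.0, 4.0]
y_post = [1.5, 2.0, 4.0, 6.0]
w = [0, 0, 1, 1]

# Differenced outcome, one value per unit.
delta_y = first_differences(y_pre, y_post)

# Sanity check only: the difference in mean first differences between
# treated and control units is the simple two-period
# difference-in-differences estimate.
treated = [d for d, wi in zip(delta_y, w) if wi == 1]
control = [d for d, wi in zip(delta_y, w) if wi == 0]
did_estimate = mean(treated) - mean(control)
```

Here `delta_y` replaces the raw outcome; the covariates and treatment indicator are unchanged.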
The problem you're describing, @hhsievertsen, is one that has recently received some attention in the literature; see, e.g., Dahabreh, Issa J., Sarah E. Robertson, Jon A. Steingrimsson, Elizabeth A. Stuart,...
Making the sub-sample size small should help with the memory footprint (since the memory required to store each tree scales with the number of "inbag" observations).
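As a rough back-of-the-envelope illustration of that scaling, the sketch below compares per-tree subsample sizes for two sample fractions. It assumes the subsample size is simply the floor of `n * sample_fraction`, which is a simplification made for illustration and not a statement about grf's internals; the helper `inbag_size` is hypothetical.

```python
import math


def inbag_size(n, sample_fraction):
    """Approximate number of in-bag observations per tree (assumed: floor(n * fraction))."""
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    return math.floor(n * sample_fraction)


n = 100_000
size_half = inbag_size(n, 0.5)      # 50000 in-bag observations per tree
size_quarter = inbag_size(n, 0.25)  # 25000 in-bag observations per tree

# If per-tree memory scales with the in-bag count, halving the
# sample fraction roughly halves the per-tree storage.
relative_memory = size_quarter / size_half
```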
It's `sample.fraction`.
In general, using a smaller sample.fraction should reduce the variance of the forest. This is because reducing the sample fraction increases the implicit "bandwidth" of the forest kernel. The cost...
Yes, that's right: setting sample.fraction to 1 uses all the data to train a single (honest) causal tree. The reason we set sample.fraction to 0.5 is that it's the largest...
@zmarkovich yes, it would in principle be possible to extend the analysis and methods from the GRF paper to cover the sampling distribution of predictions at multiple points (including the...
Not currently. This seems like a useful thing to have, though.
@ferlocar that sounds like a reasonable way to compress your dataset. GRF should mostly support that workflow: we take the weights into account during prediction, and are actively preparing a...