EconML
Questions regarding DRPolicyForest results
Hi, thanks for the great causal inference library.
I'm using DRPolicyForest and have run into a couple of issues.
model = DRPolicyForest(...)
Question 1: The model's predict_value() and predict_proba() rank the treatments differently.
- For example, for some records predict_value() is higher for T2, but predict_proba() is higher for T1.
I expected the two methods to rank the treatments in the same order, but they don't. How should I interpret this?
Question 2: The model's predict() method returns zero values for Treatment=0 (control). However, when I draw a plot with model.plot(), there is a None Treatment (T=0) leaf with numerous samples.
Why do the two give different results?
If you could provide a concrete reproduction, that might help narrow things down, but here are a few thoughts.
For question 1, predict_proba gives the fraction of trees in the forest recommending the treatment while predict_value gives the average estimate; usually these would be ranked fairly consistently but that's not necessarily the case. For instance, it's possible that a small number of trees predict a very high value for T2, so that the overall average prediction is higher than that for T1, but the majority of trees predict a higher value for T1, so that gets a higher ranking in predict_proba.
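To make the point above concrete, here is a minimal NumPy sketch for a single record. It does not call EconML; the per-tree values are made up for illustration, and `tree_values` is a hypothetical stand-in for what the individual trees might estimate. It only shows how an average over trees and a vote share over trees can disagree.

```python
import numpy as np

# Hypothetical per-tree value estimates for ONE record.
# Rows are trees, columns are treatments (T1, T2). These numbers are invented.
tree_values = np.array([
    [1.0, 0.5],
    [1.0, 0.5],
    [1.0, 0.5],
    [1.0, 0.5],
    [0.0, 6.0],  # a single tree with a very high estimate for T2
])

# Analogue of the averaged estimate: mean value per treatment across trees.
mean_value = tree_values.mean(axis=0)

# Analogue of the vote share: fraction of trees whose best treatment is each one.
tree_choice = tree_values.argmax(axis=1)
vote_fraction = np.bincount(tree_choice, minlength=tree_values.shape[1]) / len(tree_values)

print("mean value per treatment:", mean_value)
print("vote fraction per treatment:", vote_fraction)
```

Here the one outlier tree pulls the average for T2 above T1, while four of the five trees still prefer T1, so the two summaries rank the treatments in opposite orders.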
For question 2, if you're plotting a single tree from the forest, then it might make sense that this tree assigns treatment 0 to some instances while most trees assign some other treatment, so 0 is never recommended by the forest overall.
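The same idea can be sketched for the single-tree case. This again uses plain NumPy with invented recommendations rather than EconML itself, and it assumes the forest recommends whichever treatment gets the largest share of tree votes; `tree_recs` is a hypothetical name.

```python
import numpy as np

# Hypothetical recommendations: rows are trees, columns are records.
# Entries are treatment indices (0 = control, 1 = T1, 2 = T2). Invented data.
tree_recs = np.array([
    [0, 0, 1, 2],  # the tree being plotted: it assigns control to two records
    [1, 2, 1, 2],
    [1, 2, 1, 2],
    [2, 2, 0, 1],
    [1, 1, 1, 2],
])
n_treatments = 3

# Share of trees recommending each treatment, per record: shape (records, treatments).
vote_fraction = np.stack(
    [(tree_recs == t).mean(axis=0) for t in range(n_treatments)], axis=1
)

# Assumed forest-level rule: pick the treatment with the largest vote share.
forest_rec = vote_fraction.argmax(axis=1)

print("single tree recommends:", tree_recs[0])
print("forest recommends:     ", forest_rec)
```

The first tree has leaves that recommend control, yet control never wins the vote for any record, so it never appears in the forest-level recommendation.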
Thanks @kbattocchi, I can't share the code for certain reasons, but your answer helped a lot! Now I understand the differences between those methods :) and I see why they returned seemingly different results.
(I'll close the issue in a few days in case anyone wants to add a comment.)