dowhy
CATE: Continuous Outcome and Binary/Categorical Variables
Hi,
I'm working on getting CATE from a dataset with mixed data types. For instance: a binary treatment, two categorical effect modifiers including a binary one, and a continuous outcome.
I started with Linear Regression since the dataset is really simple and the causal relationship is quite straightforward as well. The result is not bad. I did get something interesting.
Here is a small issue. I noticed that DoWhy always tries to treat variables as continuous whenever possible. For instance, it will qcut my binary effect modifier and generate five (-0.001, 1] intervals. But I would prefer to treat True and False as two separate conditions. I checked causal_estimator and noticed that any column that passes is_numeric_dtype() gets qcut.
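For reference, the collapse can be reproduced with pandas alone. This is a minimal sketch, assuming the discretization amounts to `pd.qcut` with five quantiles and `duplicates="drop"`; the exact call inside DoWhy may differ, and the column name `is_member` is made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical binary effect modifier stored as 0/1 integers.
modifier = pd.Series([0, 1] * 50, name="is_member")

# A 0/1 integer column counts as numeric, so a numeric-dtype check
# alone cannot tell it apart from a continuous variable.
numeric = is_numeric_dtype(modifier)

# Assumed discretization: five quantile bins, duplicate edges dropped.
# Most quantile edges of a binary column coincide, so bins collapse.
binned = pd.qcut(modifier, 5, duplicates="drop")

# Grouping on the raw values keeps the two conditions separate.
raw_groups = modifier.groupby(modifier).size()

print("numeric dtype:", numeric)
print("distinct qcut bins:", binned.nunique())
print("distinct raw groups:", len(raw_groups))
```

With this balanced column, the quantile bins merge into a single interval while the raw groupby keeps both levels.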
My question is: is there any way to use linear regression to calculate CATE on my original categorical data (that is, skip the qcut step and groupby directly)? Or can it only be done through the ML algorithms in EconML or CausalML?
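To make the "groupby directly" idea concrete, here is a rough sketch done outside DoWhy with plain NumPy and pandas. It is not DoWhy's estimator: it assumes a randomized binary treatment with no confounders, and the data, column names (`treated`, `is_member`, `outcome`), and effect sizes are all synthetic and hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 4000

# Synthetic data: randomized binary treatment, binary effect modifier.
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "is_member": rng.integers(0, 2, n),
})

# Assumed ground truth: effect is 3.0 for members and 1.0 otherwise.
true_effect = np.where(df["is_member"] == 1, 3.0, 1.0)
df["outcome"] = 0.5 + true_effect * df["treated"] + rng.normal(0, 0.5, n)

def treatment_slope(group):
    # OLS of outcome on treatment within one group;
    # np.polyfit with degree 1 returns [slope, intercept].
    return np.polyfit(group["treated"], group["outcome"], 1)[0]

# One linear regression per level of the categorical modifier.
cate = {int(level): treatment_slope(g) for level, g in df.groupby("is_member")}

for level, effect in cate.items():
    print(f"is_member={level}: estimated effect {effect:.2f}")
```

With a binary treatment, the per-group slope equals the difference in group means between treated and untreated units, so each level of the modifier gets its own estimate without any binning.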
A side question: the comment for target_unit implies that the parameter can serve as "the condition" for calculating CATE when a lambda function is passed to it. I also noticed this usage in the CATE notebook. Could you please explain it a little further?
P.S. I've been using DoWhy for half a year now. Thanks for the great work.
Thank you for raising this @kangqiao-ctrl. Yeah, you are right about the qcut functionality. It should not be happening for binary variables. Let me try to add a fix and update you here.
(thanks for your patience, btw. Was away for the last few weeks.)
What if I want to specify a fixed number of qcut bins? Which part of the code should I change, or what should I do?