
graph learning and R dependency on CDT

Open priamai opened this issue 2 years ago • 9 comments

In the documentation, the causal discovery example depends on an external library, CDT:

  • https://github.com/py-why/dowhy/blob/main/docs/source/example_notebooks/dowhy_causal_discovery_example.ipynb
  • https://www.pywhy.org/dowhy/v0.10/example_notebooks/dowhy_causal_discovery_example.html

The CDT library is quite cumbersome to install since it relies on the R environment and a bunch of R libraries. In the documentation I can see this:

https://www.pywhy.org/dowhy/v0.10/dowhy.graph_learners.html

which still seems to be a wrapper around CDT.

Are there any plans to implement a pure Python module and avoid using R?

priamai avatar Aug 23 '23 07:08 priamai

I can see that, for example, the GES sampler is available both via the pure Python package and in the CDT package.

priamai avatar Aug 23 '23 07:08 priamai

Hi @priamai, thanks for the question. Causal-Learn, another of the pywhy libraries, provides a broad set of causal discovery methods. Does that provide what you are looking for?

https://github.com/py-why/causal-learn

It would be great to update that sample notebook to use causal-learn instead of CDT. It is an old example.

emrekiciman avatar Aug 23 '23 14:08 emrekiciman

ah yes, we should update that example and use causal-learn.

@kunwuz would you like to update this notebook and add causal-learn algorithms?

amit-sharma avatar Aug 24 '23 07:08 amit-sharma

That will be awesome!

priamai avatar Aug 24 '23 14:08 priamai

Aha yes, I will update this notebook soon. At the same time, please feel free to let me know if you have any questions using causal-learn.

kunwuz avatar Aug 24 '23 15:08 kunwuz

I have one question, @kunwuz: I notice there are also conditional independence tests, which we also have in DoWhy, correct? This link: https://www.cmu.edu/dietrich/causality/ seems to time out for me. Can you access it? (I managed eventually, but it took ages to load.) They have some really good datasets at https://github.com/cmu-phil/example-causal-datasets and it would be great to show how to run the discovery on each one.

priamai avatar Aug 24 '23 20:08 priamai

Can you access this link? https://www.cmu.edu/dietrich/causality/projects/causal_learn_benchmarks/ It works on my side. And yes, cmu-phil has a lot of well-maintained/processed datasets, and I will try to write up a tutorial on applying discovery methods to them. Of course, I will let you know when I finish it.

kunwuz avatar Aug 24 '23 20:08 kunwuz

Causal-learn has a series of conditional independence tests, including fisher-z, chi-square, kernel-based tests, and others. The kernel-based tests have also been used in dowhy-gcm. A recent package called pywhy-stat integrates a more complete collection of tests, although it is still in progress.
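For anyone curious what a Fisher-z test actually computes, here is a minimal sketch using only NumPy and the standard library. It assumes roughly Gaussian data and is an illustration only, not the implementation used in causal-learn, dowhy-gcm, or pywhy-stat; the function name `fisher_z_pvalue` and the toy data are hypothetical.

```python
import math
import numpy as np

def fisher_z_pvalue(data, x, y, cond=()):
    """p-value for H0: column x is independent of column y given the columns in cond."""
    idx = [x, y, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    # Partial correlation of x and y given cond, read off the precision matrix.
    prec = np.linalg.pinv(corr)
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    # Fisher z-transform; the scaled statistic is ~N(0, 1) under H0.
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(data.shape[0] - len(cond) - 3) * abs(z)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(stat)).
    return math.erfc(stat / math.sqrt(2))

# Toy data (assumption): x and y share a common cause z and nothing else.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = z + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)
data = np.column_stack([x, y, z])

p_marginal = fisher_z_pvalue(data, 0, 1)                # x vs y, no conditioning
p_conditional = fisher_z_pvalue(data, 0, 1, cond=(2,))  # x vs y given z
print(p_marginal, p_conditional)
```

Marginally, x and y are strongly dependent, so the first p-value should be essentially zero; once z is conditioned on, the dependence should vanish and the second p-value is no longer forced toward zero.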

kunwuz avatar Aug 24 '23 20:08 kunwuz

Hello, yes. I am building a graph to document all the various frameworks with an initial taxonomy. It would be nice if you could contribute to it here; I can make you an author.

With causal-learn, one of the main issues when working with pandas data frames is that it requires NumPy arrays, so you have to keep some form of mapping between the column names and the array indices.

Here's a practical example:

image

It would be nice if you could facilitate this kind of translation.
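To make the request concrete, this is roughly the kind of glue code I mean, as a minimal sketch using only pandas and NumPy. The helper names (`to_numpy_with_names`, `label_adjacency`, `edges_by_name`) are hypothetical and not part of causal-learn or DoWhy, and the adjacency matrix is a hand-written placeholder standing in for a discovery result. The convention "non-zero `adj[i, j]` means i -> j" is an assumption; the encoding returned by a given algorithm may differ.

```python
import numpy as np
import pandas as pd

def to_numpy_with_names(df):
    """Return the data as a float array plus the column names in array order."""
    return df.to_numpy(dtype=float), list(df.columns)

def label_adjacency(adj, names):
    """Wrap an index-based adjacency matrix in a DataFrame labelled by column name."""
    return pd.DataFrame(adj, index=names, columns=names)

def edges_by_name(adj, names):
    """List (source, target) name pairs for every non-zero entry adj[i, j]."""
    rows, cols = np.nonzero(adj)
    return [(names[i], names[j]) for i, j in zip(rows, cols)]

df = pd.DataFrame(
    {"age": [30, 41, 25], "dose": [1.0, 2.5, 0.5], "outcome": [0.2, 0.9, 0.1]}
)
data, names = to_numpy_with_names(df)  # `data` is what a NumPy-only API would take

# Placeholder for a discovery result (assumed convention: adj[i, j] != 0 means i -> j).
adj = np.zeros((len(names), len(names)), dtype=int)
adj[names.index("age"), names.index("dose")] = 1
adj[names.index("dose"), names.index("outcome")] = 1

labelled = label_adjacency(adj, names)
edges = edges_by_name(adj, names)
print(labelled)
print(edges)
```

Keeping `names` next to the array means the column order is captured once, at conversion time, and every index in the result can be translated back without guessing.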

priamai avatar Sep 04 '23 10:09 priamai