CausalDiscoveryToolbox

How can I generate a graph from data without using cdt.data.AcyclicGraphGenerator?

Open MAX00008888 opened this issue 2 years ago • 10 comments

MAX00008888 avatar Apr 02 '22 08:04 MAX00008888

@Diviyan-Kalainathan, sorry to bother you. I don't understand the difference between create_graph_from_data(data, **kwargs), orient_directed_graph(data, dag, **kwargs), and predict(df_data, graph=None, **kwargs). How can I generate a graph the way data, graph = generator.generate() does?

MAX00008888 avatar Apr 02 '22 08:04 MAX00008888

How can I get a graph that contains the ground truth of the dataset, @Diviyan-Kalainathan?

MAX00008888 avatar Apr 02 '22 08:04 MAX00008888

Hello @MAX00008888,

These are two distinct things:

  • cdt.data generates or provides data together with its ground-truth graph
  • cdt.causality methods, whose goal is to recover the ground-truth graph that generated the data

predict is a wrapper around the causal discovery functions; it covers three cases and calls a different function for each:

  • The user provides only the data, so the graph has to be built from scratch: create_graph_from_data()
  • The user provides the data and the skeleton of the graph (an undirected graph): orient_undirected_graph()
  • The user provides the data and a directed (or partially directed) graph: orient_directed_graph()
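To make the three cases concrete, here is a minimal sketch of that kind of dispatch. It is not the toolbox's code: Graph, DiGraph, and SketchModel are hypothetical stand-ins (the first two mimic the networkx classes of the same names, where a directed graph is a subclass of the undirected one), and each method only returns its own name so the chosen branch is visible.

```python
class Graph:
    """Hypothetical stand-in for an undirected graph (the skeleton)."""

    def __init__(self, edges=()):
        self.edges = list(edges)


class DiGraph(Graph):
    """Hypothetical stand-in for a directed graph; it subclasses Graph."""


class SketchModel:
    """Hypothetical model that only reports which branch predict() picked."""

    def create_graph_from_data(self, data, **kwargs):
        return "create_graph_from_data"

    def orient_undirected_graph(self, data, graph, **kwargs):
        return "orient_undirected_graph"

    def orient_directed_graph(self, data, graph, **kwargs):
        return "orient_directed_graph"

    def predict(self, data, graph=None, **kwargs):
        # Case 1: no graph given, so build one from the data alone.
        if graph is None:
            return self.create_graph_from_data(data, **kwargs)
        # Case 3 is tested before case 2 because every DiGraph is also a Graph.
        if isinstance(graph, DiGraph):
            return self.orient_directed_graph(data, graph, **kwargs)
        # Case 2: an undirected skeleton whose edges must be oriented.
        if isinstance(graph, Graph):
            return self.orient_undirected_graph(data, graph, **kwargs)
        raise ValueError("Unknown graph type")


model = SketchModel()
rows = [[0.1, 0.2], [0.3, 0.4]]  # placeholder data, never inspected here

print(model.predict(rows))
print(model.predict(rows, Graph([("a", "b")])))
print(model.predict(rows, DiGraph([("a", "b")])))
```

The order of the isinstance checks matters in this sketch: testing for the undirected type first would also catch directed graphs.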

Hope this helps, Diviyan

diviyank avatar Apr 02 '22 08:04 diviyank

Thank you so much for your reply, @Diviyan-Kalainathan.

I'm still a little confused. Here is my code:

    import pandas as pd
    from cdt.causality.graph import GES

    data = pd.DataFrame(batch.next_observations.cpu().numpy(), columns=cols)
    model = GES()
    graph = model.create_graph_from_data(data)

Does this give the same kind of graph as data, graph = generator.generate()?

Thanks again!

MAX00008888 avatar Apr 02 '22 09:04 MAX00008888

Yes, GES aims at producing the same graph as the one that generated your data (here batch.next_observations.cpu().numpy()); it tries to find causal relationships between the columns of your data.

Best, Diviyan

P.S.: be warned, however, that GES only aims to recover the ground truth of your data, so its results are not always 100% correct. Unlike AcyclicGraphGenerator (which actually generates the data according to the graph), the causal methods do not provide the true answer (finding it is their goal, and that is the whole point of the research).
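The reason a generator can hand back a true graph is that the graph comes first and the data is sampled from it. Here is a minimal sketch of that idea in plain Python; it is not AcyclicGraphGenerator, and TRUE_PARENTS, ORDER, and the linear mechanism with unit Gaussian noise are assumptions chosen purely for illustration.

```python
import random

# Hypothetical ground-truth DAG: each node maps to the list of its parents.
TRUE_PARENTS = {"x": [], "y": ["x"], "z": ["x", "y"]}
ORDER = ["x", "y", "z"]  # a topological order of the DAG above


def sample_row(rng):
    """Sample one observation: each node is the sum of its parents plus noise."""
    row = {}
    for node in ORDER:
        noise = rng.gauss(0.0, 1.0)
        row[node] = sum(row[p] for p in TRUE_PARENTS[node]) + noise
    return row


def generate(n_rows, seed=0):
    """Return (data, edges); the edges are the ground truth by construction."""
    rng = random.Random(seed)
    data = [sample_row(rng) for _ in range(n_rows)]
    edges = sorted(
        (parent, child)
        for child, parents in TRUE_PARENTS.items()
        for parent in parents
    )
    return data, edges


data, true_edges = generate(200)
print(true_edges)
```

For data that was only observed, such as the batch above, no such graph was written down beforehand, so a discovery method can only estimate it.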

diviyank avatar Apr 02 '22 12:04 diviyank

Thank you so much, @Diviyan-Kalainathan! It may sound silly, but I would like to know whether there is any way to get the true graph of the data.

Thanks

MAX00008888 avatar Apr 02 '22 12:04 MAX00008888

Hi Diviyan, @Diviyan-Kalainathan

I would like to know whether there is a score function, such as the BIC score, for scoring a graph.

Best,

MAX00008888 avatar Apr 03 '22 06:04 MAX00008888

Hi,

Sadly, the causal methods can only try to recover the true graph of the data (or its Markov equivalence class; check out CPDAG), and more often than not the assumptions made by the models are violated. The result is therefore an approximation, with no optimality guarantee on the validity of the proposed solution.

We don't have a score function such as BIC in the toolbox, but that's a good idea! I will look into it.
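In the meantime, a BIC-style score can be computed by hand. The sketch below is not part of the toolbox and makes strong assumptions: every variable is a linear function of its parents plus Gaussian noise, and the helper names (solve, residual_sum_of_squares, bic) are made up for this example. The score is -2 times the maximized log-likelihood plus the number of parameters times log(n), so lower is better.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_sum_of_squares(data, target, parents):
    """Least-squares fit of target on its parents plus an intercept."""
    rows = [[1.0] + [row[p] for p in parents] for row in data]
    ys = [row[target] for row in data]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


def bic(data, parents_of):
    """Linear-Gaussian BIC of a DAG given as {node: [parents]}; lower is better."""
    n = len(data)
    total = 0.0
    for node, parents in parents_of.items():
        sigma2 = residual_sum_of_squares(data, node, parents) / n
        log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
        n_params = len(parents) + 2  # coefficients, intercept, noise variance
        total += -2 * log_lik + n_params * math.log(n)
    return total


# Toy data in which x causes y (y = 2x + noise).
rng = random.Random(0)
data = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    data.append({"x": x, "y": 2 * x + rng.gauss(0.0, 1.0)})

print(bic(data, {"x": [], "y": ["x"]}))  # graph with the edge x -> y
print(bic(data, {"x": [], "y": []}))     # empty graph
print(bic(data, {"x": ["y"], "y": []}))  # reversed edge y -> x
```

On this toy data the graph with the edge scores better than the empty graph, while x -> y and y -> x get the same score up to rounding: under these assumptions the score cannot separate graphs from the same Markov equivalence class.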

Best regards, Diviyan

diviyank avatar Apr 04 '22 07:04 diviyank

Thank you so much for your reply! Sorry that I forgot to answer; you can close this question.

Best wishes.

MAX00008888 avatar Apr 13 '22 09:04 MAX00008888

Hello! You're welcome! I'll keep this issue open so that adding the scores stays on the development roadmap.

diviyank avatar Apr 13 '22 10:04 diviyank