
[DATA] Different settings of the sachs dataset

Open wangxw5 opened this issue 4 years ago • 4 comments

I have some questions about the sachs dataset. The version loaded from this package contains 11 nodes, 18 edges, and 7466 samples, but these statistics differ from those reported elsewhere. In some R packages, the sachs dataset has 11 nodes, 17 edges, and 7466 samples; in the DAG-GNN paper [1], it has 11 nodes, 20 edges, and 7466 samples; and in [2], it has 11 nodes, 17 edges, and 853 samples. So which version is correct?

[1] DAG-GNN: DAG Structure Learning with Graph Neural Networks
[2] Causal Discovery with Reinforcement Learning
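One way to pin down where two versions disagree is to diff their ground-truth edge lists directly. Below is a minimal sketch, assuming each graph is available as a list of directed (cause, effect) pairs. The function names and the toy edges are hypothetical and are not the actual sachs graph.

```python
def summarize(edges, n_samples):
    """Return (n_nodes, n_edges, n_samples) for a directed edge list."""
    nodes = {v for edge in edges for v in edge}
    return len(nodes), len(set(edges)), n_samples


def diff_edges(edges_a, edges_b):
    """Compare two directed edge lists.

    Returns a dict with:
      reversed  : edges of A that appear in B with the opposite direction
      only_in_a : edges of A absent from B (in either direction)
      only_in_b : edges of B absent from A (in either direction)
    """
    a, b = set(edges_a), set(edges_b)
    missing, extra = a - b, b - a
    # An edge counts as reversed when its flipped copy is among B's extra edges.
    reversed_edges = {(u, v) for (u, v) in missing if (v, u) in extra}
    flipped = {(v, u) for (u, v) in reversed_edges}
    return {
        "reversed": sorted(reversed_edges),
        "only_in_a": sorted(missing - reversed_edges),
        "only_in_b": sorted(extra - flipped),
    }


# Toy placeholder graphs (NOT the real sachs ground truth).
version_a = [("A", "B"), ("B", "C"), ("C", "D")]
version_b = [("A", "B"), ("C", "B"), ("C", "D"), ("A", "D")]

if __name__ == "__main__":
    print(summarize(version_a, n_samples=100))
    print(diff_edges(version_a, version_b))
```

With this package, the edge list would presumably come from the graph returned by `cdt.data.load_dataset("sachs")`, for example via `list(graph.edges())` if that graph is a networkx `DiGraph`; that is an assumption to verify against the installed version.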

wangxw5, Dec 10 '20 13:12

Hello, the sachs dataset was updated several times, and its ground truth changed between versions as new real-world experiments were carried out, so the statistics depend on which version was used. Our version is fairly old, so I think the DAG-GNN version is more up to date. Do you have a link to the new dataset and its ground truth? Thanks in advance, Diviyan

diviyank, Dec 10 '20 13:12

Thank you very much for your reply! I have not obtained the data used in DAG-GNN yet, but I am going to write to the author to request it. Once I have it, I will share it with you ^_^

wangxw5, Dec 10 '20 13:12

Thanks a lot for the update. I'll keep this issue open as a reminder that the dataset is out of date.

diviyank, Dec 10 '20 13:12

Dear @wangxw5, have you been able to obtain the new dataset?

Kind regards, Diviyan

diviyank, Feb 02 '21 15:02