causal-learn
causal-learn copied to clipboard
Deleting edges takes a very long time, I came up with a solution.
I was running this program (newest version from a few days ago) on a dataset of roughly 6000 x 800 matrix and found in-between depth levels the program was taking a long time to get to the next depth level. I extrapolated and found this was going to take 331 days for just one depth level. I studied the code and eventually traced the bottleneck to GeneralGraph.remove_edge(edge1), which rebuilds the entire edge list in the final line self.reconstitute_dpath(self.get_graph_edges()). get_graph_edges is the problem as it traverses all of self.graph to build an edge list.
I fixed the problem by the following: create new functions remove_edge_only(self, edge: Edge) which is the same as remove_edge except it omits the last line self.reconstitute_dpath(self.get_graph_edges(). I also added a function
def clear_dpath(self):
self.dpath = np.zeros((self.num_vars, self.num_vars), np.dtype(int))
Then in SkeletonDiscovery.py:136, I call remove_edge_only instead of remove_edge. At the end of that loop, I call cg.G.clear_dpath() cg.G.reconstitute_dpath(cg.G.get_graph_edges()) so the edge list is rebuilt only once at the end of all edge deletions instead of after every individual edge deletion. This brought the runtime down to seconds and produced identical causal images on smaller datasets. Since this isn't my project, is there some kind of approval process I need to upload suggested changes like this to or is this something you all would want to implement yourselves?
Hi @delacylab,
Thanks so much for taking the time to debug and implement. I am also aware of this issue, but our current codebase lacks tests, and I am afraid that directly change remove_node() function will break our algorithms, and I decide to change it after we implement all the tests (in progress).
But your solution is definitely more elegant and only change the algorithm behavior of PC. I really like the idea of creating a new function with only removing edge. (Actually I feel stupid about not thinking in this way --- for some reason I think I need to change remove_node() directly and afraid of breaking other algorithms lol)
Regarding sending PR request, we are very happy to review the PR sent from our users, and we greatly appreciate the help from users. If you are willing to contribute to our project, how about sending the PR and assign me as reviewer? :) And if the PR is accepted, we will merge the PR into our main trunk.
Thanks so much for your help! :)
Some nips regarding PR requests:
- We follow standard industry review process now. https://github.com/cmu-phil/causal-learn/pull/44 is a great example. For each PR, generally we require a description and test plan. We already had good tests for PC, so running PC tests is the test plan for your PR. :)
- Could you rename remove_node() to remove_node_reconstitute_dpath() and remove_edge_only() to remove_edge()? This naming is more informative. :)
Hi @tofuwen
I wanted to check with you about renaming remove_edge() to remove_edge_reconstitute_dpath() and remove_edge_only() to remove_edge(). I agree it is more informative but would require changing all existing calls to GeneralGraph's remove_edge() to now call remove_edge_reconstitute_dpath() since the reconstitute_dpath behavior would be expected. I looked for GeneralGraph's remove_edge calls and found them in the following files: utils/GESUtils.py utils/DAG2PAG.py utils/PDAG2PAG.py utils/PCUtils/UCSepset.py utils/PCUtils/Meek.py utils/PCUtils/SkeletonDiscovery.py utils/PCUtils/BackgroundKnowledgeOrientUtils.py search/ConstraintBased/PC.py potentially search/ConstraintBased/FCI.py, which seems to be using a Graph class that is not implemented yet.
I am happy to change all these as part of the pull request but share your concern that directly changing remove_edge would break any algorithm if I missed any reference in my above list. So I just wanted to warn you this change would impact many more files than my original fix but I am happy to help!
Hi @delacylab,
Thanks so much for your help!
Yeah, if possible, you can change all of these files. We will have review to make sure your changes won't break anything (hopefully). :)
Thanks so much for your help. We really appreciate your time to make causal-learn better! :)
Hi @tofuwen,
Unfortunately it looks like there is a problem with my idea because I am failing unit tests. My initial testing showed that it didn't make any difference in the output if I implemented my fix as shown in my original post. But I fail some of the unit tests so my assumption that dpath could be cleared and reconstituted after a series of edge deletes would give the same result as adjusting dpath after each individual delete is wrong. It would speed up other aspects of the program besides edge deletion if edges were a class variable like graph or dpath to allow direct manipulation, but I don't know enough about your code to have any idea whether that is feasible or not. Sorry to not be of more help!
Sorry to intrude, but I'm very curious how that got implemented at a high level. For PC or GES, it seems you don't need a DPATH matrix, since you're not assume the graph is a DAG. Is there some way to make the d-path reasoning kick in only for DAGs (not CPDAGs or PAGs)? If you point me to where to look I can try to help...
Hi @delacylab,
Thanks so much for your help! :) No worry, we will take a look.
Hi @chenweiDelight, do you mind taking a look at @jdramsey's question?
@chenweiDelight @zhi-yi-huang
It seems https://github.com/cmu-phil/causal-learn/issues/54 is related. Do you mind taking a look and provide some context so we can fix? Currently remove_edge seems O(n^2), which is too slow, and it will influence all algorithm that use remove_edge() API. So this is a really high priority fix.
Sorry to reply so late. We use the DPATH matrix to check whether there forms a cyclic graph when using the meek rule. For example, if there is a cdpag '*-> x <-y ' and 'x - z ->y', and we use the 'the formation of v-structures‘ rule to determine the direction between x and z, then it is oriented as 'x->z', which will form a cyclic 'x->z->y->x'. @jdramsey, this example is from the WINE dataset. I am not sure whether we can provide the choice (about the graph should be acyclic or cyclic) for the users? We try to ensure that when removing the nodes, the variables (including DPATH) related to the GeneralGraph should be updated. This may cost a lot of time.