Background knowledge not working
Hi! I have come across an issue where, even when I forbid certain edges between nodes when using PC, they still appear in the resulting causal graph:
from causallearn.utils.PCUtils.BackgroundKnowledge import BackgroundKnowledge
from causallearn.search.ConstraintBased.PC import pc
cg_without_background_knowledge = pc(X) # Run PC and obtain the estimated graph (CausalGraph object)
nodes = cg_without_background_knowledge.G.get_nodes()
bk = BackgroundKnowledge() \
.add_forbidden_by_node(nodes[0], nodes[1]) \
.add_forbidden_by_node(nodes[1], nodes[2]) \
.add_forbidden_by_node(nodes[0], nodes[2]) \
.add_forbidden_by_node(nodes[0], nodes[3]) \
.add_forbidden_by_node(nodes[1], nodes[3]) \
.add_forbidden_by_node(nodes[2], nodes[3])
cg_with_background_knowledge = pc(X, background_knowledge=bk)
assert cg_with_background_knowledge.G.get_edge(nodes[2], nodes[3]) is None
assert cg_with_background_knowledge.G.get_edge(nodes[0], nodes[1]) is None
I get this error, as well as an error for some other node combinations in my background knowledge:
AssertionError                            Traceback (most recent call last)
Cell In[9], line 18
     15 cg_with_background_knowledge = pc(X, background_knowledge=bk)
     17 assert cg_with_background_knowledge.G.get_edge(nodes[2], nodes[3]) is None
---> 18 assert cg_with_background_knowledge.G.get_edge(nodes[0], nodes[1]) is None
     20 g, edges = fci(X, background_knowledge=bk)
     22 sns.heatmap(g.graph, cmap='coolwarm', annot=True, xticklabels=node_names, yticklabels=node_names)
AssertionError:
It seems that even if the background knowledge is being used, it is being overridden somewhere. Please help! This also happens with FCI. I looked at the code and I can't figure out why this could be happening.
Thanks for reporting. Let us work on this and get back to you soon.
Hi @nataliaglazman, could you please try the following code to see if it works in your case?
bk = BackgroundKnowledge() \
.add_forbidden_by_node(nodes[0], nodes[1]) \
.add_forbidden_by_node(nodes[1], nodes[0]) \
.add_forbidden_by_node(nodes[1], nodes[2]) \
.add_forbidden_by_node(nodes[0], nodes[2]) \
.add_forbidden_by_node(nodes[0], nodes[3]) \
.add_forbidden_by_node(nodes[1], nodes[3]) \
.add_forbidden_by_node(nodes[2], nodes[3]) \
.add_forbidden_by_node(nodes[3], nodes[2])
I added add_forbidden_by_node(nodes[1], nodes[0]) and add_forbidden_by_node(nodes[3], nodes[2]). Since PC returns a lot of undirected edges, forbidding only one direction might not guarantee that the other direction does not appear. I tested this on my dataset and it works well, but I'm not sure whether that's the cause in your case. Also note that get_edge returns the edge between two nodes regardless of the order in which the two nodes are passed, so the assertion fails if an edge exists in either direction.
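If you want to rule out an edge between every pair in a group of nodes, writing out both directions by hand gets error-prone. Here is a minimal sketch of a helper that does it for you. The names symmetric_pairs and forbid_both_directions are my own and are not part of causal-learn; the only thing the helper assumes about bk is that it has the add_forbidden_by_node(node1, node2) method used above.

```python
from itertools import combinations


def symmetric_pairs(nodes):
    """Yield (a, b) and then (b, a) for every unordered pair of nodes."""
    for a, b in combinations(nodes, 2):
        yield a, b
        yield b, a


def forbid_both_directions(bk, nodes):
    """Forbid both orientations of every pair among `nodes`.

    `bk` is assumed to expose add_forbidden_by_node(node1, node2),
    as BackgroundKnowledge does in the snippets above. This helper is
    hypothetical and not part of the library.
    """
    for a, b in symmetric_pairs(nodes):
        bk.add_forbidden_by_node(a, b)
    return bk
```

With the setup from the original post, this would be used as bk = forbid_both_directions(BackgroundKnowledge(), nodes[:4]) followed by pc(X, background_knowledge=bk). For four nodes it registers 12 forbidden directed pairs, so neither orientation of any of the six edges is left allowed.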