causal-learn Background knowledge not working

Hi! I have come across an issue where, even when I forbid edges between certain nodes when using PC, those edges still appear in the resulting causal graph:

from causallearn.utils.PCUtils.BackgroundKnowledge import BackgroundKnowledge
from causallearn.search.ConstraintBased.PC import pc

# X is the data as a NumPy array of shape (n_samples, n_features)
cg_without_background_knowledge = pc(X)  # Run PC and obtain the estimated graph (CausalGraph object)
nodes = cg_without_background_knowledge.G.get_nodes()

bk = BackgroundKnowledge() \
    .add_forbidden_by_node(nodes[0], nodes[1]) \
    .add_forbidden_by_node(nodes[1], nodes[2]) \
    .add_forbidden_by_node(nodes[0], nodes[2]) \
    .add_forbidden_by_node(nodes[0], nodes[3]) \
    .add_forbidden_by_node(nodes[1], nodes[3]) \
    .add_forbidden_by_node(nodes[2], nodes[3])

cg_with_background_knowledge = pc(X, background_knowledge=bk)

assert cg_with_background_knowledge.G.get_edge(nodes[2], nodes[3]) is None
assert cg_with_background_knowledge.G.get_edge(nodes[0], nodes[1]) is None

I get this error, as well as an error for some other node combinations in my background knowledge:

AssertionError                            Traceback (most recent call last)
Cell In[9], line 18
     15 cg_with_background_knowledge = pc(X, background_knowledge=bk)
     17 assert cg_with_background_knowledge.G.get_edge(nodes[2], nodes[3]) is None
---> 18 assert cg_with_background_knowledge.G.get_edge(nodes[0], nodes[1]) is None
     20 g, edges = fci(X, background_knowledge=bk)
     22 sns.heatmap(g.graph, cmap='coolwarm', annot=True, xticklabels=node_names, yticklabels=node_names)

AssertionError:

It seems that even if the background knowledge is being used, it is being overridden somewhere. This also happens with FCI. I looked at the code and can't figure out why this could be happening. Please help!

Feb 05 '25 12:02 nataliaglazman

Thanks for reporting. Let us work on this and get back to you soon.

Feb 05 '25 23:02 kunwuz

Hi @nataliaglazman, could you please try the following code and see whether it works in your case?

bk = BackgroundKnowledge() \
    .add_forbidden_by_node(nodes[0], nodes[1]) \
    .add_forbidden_by_node(nodes[1], nodes[0]) \
    .add_forbidden_by_node(nodes[1], nodes[2]) \
    .add_forbidden_by_node(nodes[0], nodes[2]) \
    .add_forbidden_by_node(nodes[0], nodes[3]) \
    .add_forbidden_by_node(nodes[1], nodes[3]) \
    .add_forbidden_by_node(nodes[2], nodes[3]) \
    .add_forbidden_by_node(nodes[3], nodes[2])

I added add_forbidden_by_node(nodes[1], nodes[0]) and add_forbidden_by_node(nodes[3], nodes[2]). Since PC returns a lot of undirected edges, forbidding only one direction might not guarantee that the edge is absent in the other direction. Note also that get_edge returns the edge between two nodes regardless of the order in which they are passed, so the assertion fails if an edge exists in either direction. I tested this on my dataset and it works well, but I'm not sure whether this is the cause in your case.
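
If every pair needs to be forbidden in both directions, writing out each call by hand gets error-prone. Below is a minimal sketch of a helper that does it in a loop. The names forbid_both_directions and forbid_all_among are hypothetical and not part of causal-learn; the only causal-learn call assumed is add_forbidden_by_node(node1, node2), used exactly as in the snippets above.

```python
from itertools import combinations


def forbid_both_directions(bk, pairs):
    """Forbid a -> b and b -> a for every (a, b) in pairs.

    `bk` is assumed to be a BackgroundKnowledge instance (or any object
    exposing add_forbidden_by_node(node1, node2)).
    """
    for a, b in pairs:
        bk.add_forbidden_by_node(a, b)
        bk.add_forbidden_by_node(b, a)
    return bk


def forbid_all_among(bk, nodes):
    """Forbid edges in both directions between every unordered pair of `nodes`."""
    return forbid_both_directions(bk, combinations(nodes, 2))


# Usage with causal-learn, assuming `X` and `nodes` from the original post:
# bk = forbid_all_among(BackgroundKnowledge(), nodes[:4])
# cg_with_background_knowledge = pc(X, background_knowledge=bk)
```

For the first four nodes this covers the same six pairs as in the original post, each in both directions. Whether it resolves the failing assertion on a given dataset still needs to be checked.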

Feb 28 '25 22:02 kunwuz