
Hanging Collective Transitive Rules

Open eriq-augustine opened this issue 2 years ago • 9 comments

Hey All,

I am having some issues with transitive rules and I was wondering if my results were expected or if there may be a bug lurking here.

I have a transitive rule that I have tried to implement in two different ways, both of which have unfavorable outcomes: one consumes all available memory, and the other appears to loop indefinitely.

All methods were run using the CLI from the ProbLog Python package (from PyPI, version 2.2.2):

python -m problog -v test_file.txt

Any help you can provide in getting either of these methods (or any other method of dealing with collective transitivity) would be greatly appreciated.

General Model

This is a small synthetic model with the end goal of inferring whether two people know each other. The data and structure for this model comes from the psl-examples repository: https://github.com/linqs/psl-examples/tree/develop/simple-acquaintances The specific rules that my initial ProbLog rules are based on come from here: https://github.com/linqs/psl-examples/blob/develop/simple-acquaintances/cli/simple-acquaintances.psl

These examples will only use three predicates:

  • knows/2 - The final target predicate, indicating two people know each other.
  • knows_l1/2 - An intermediate predicate for knows/2.
  • knows/3 - Used in the second method to carry a list of seen nodes.

In the examples I will provide, I stripped down the rules and data to the smallest set that still causes these issues (I have not fully tested every possible subset of the data since these can take a while, but I have cut it down considerably).

First Method

The first method uses a very straightforward approach to transitivity (with the hopes that memoization will stop cycles from happening).

knows(P1, P2) :- knows_l1(P1, P2), (P1 =\= P2) .
0.167 :: knows(P1, P2) :- knows_l1(P1, OtherPerson), knows(OtherPerson, P2), (P1 =\= OtherPerson), (P1 =\= P2) .

1.0 :: knows_l1(0, 2) .
% ... More data here.

query(knows(0, 1)) .

The probabilistic facts enumerate through [0, 9] for each argument, excluding self references and the queried pair (0, 1). The full file is attached: problog_transitive_method1.txt. I also attached the output of a profile (py-spy top): problog_transitive_method1_profile_top.txt
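For reference, here is a minimal Python sketch (a hypothetical helper, not part of the attached files) of how such a fact set can be regenerated from the description above:

```python
# Hypothetical helper: emit every ordered pair over [0, n), skipping
# self references and the queried pair (0, 1), as ProbLog facts.
def make_facts(n=10, weight=1.0, excluded=((0, 1),)):
    lines = []
    for p1 in range(n):
        for p2 in range(n):
            if p1 == p2 or (p1, p2) in excluded:
                continue
            lines.append(f"{weight} :: knows_l1({p1}, {p2}) .")
    return lines

facts = make_facts()
print(len(facts))  # 10*10 pairs - 10 self pairs - 1 excluded pair = 89
```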

When run, it hangs on the output:

[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s
[INFO] Grounding: 0.0396s
[INFO] Cycle breaking: 2.8354s

The process will continue to use ~100% of a core and will keep accumulating memory until there is none left or it is killed. On my machine, this took about 50 GB of RAM and swap combined.

Second Method

The second method uses a pattern commonly recommended for transitivity in Prolog: keeping a list of previously seen nodes.

:- use_module(library(lists)).

knows(P1, P2) :- knows(P1, P2, []) .

knows(P1, P2, V) :-
    knows_l1(P1, P2),
    \+ memberchk((P1, P2), V) .

knows(P1, P2, V) :-
    knows_l1(P1, OtherPerson),
    \+ memberchk((P1, OtherPerson), V),
    knows(OtherPerson, P2, [(P1, OtherPerson) | V]) .

1.0 :: knows_l1(0, 2) .
% ... More data here.

query(knows(0, 1)) .

The probabilistic facts enumerate through [0, 4] for each argument, excluding self references and the queried pair (0, 1). The full file is attached: problog_transitive_method2.txt. I also attached the output of a profile (py-spy top): problog_transitive_method2_profile_top.txt

When run, it hangs on the output:

[INFO] Output level: INFO
[INFO] Propagating evidence: 0.0000s

So it looks like it is stuck in grounding. The process will continue to use ~100% of a core, but will not use more RAM. I ran this for about 12 hours until I killed it.

eriq-augustine avatar Feb 22 '22 03:02 eriq-augustine

Thank you for the detailed explanation. The programs you provided contain some very cyclical rules. We are aware of some performance issues regarding cycle breaking, and we're looking into it right now. We'll let you know when we have an update.

rmanhaeve avatar Feb 24 '22 16:02 rmanhaeve

Thanks, Robin. Keep me up to date and let me know if you need anything else.

eriq-augustine avatar Feb 24 '22 17:02 eriq-augustine

Hello, I looked into it a bit.

First Method

The compilation step, a #P-complete problem, causes the long processing time. I tested this with both the SDD (-k sdd) and dSharp (-k ddnnf) backends.

There might be a bug that makes some ground programs unnecessarily large (something Robin and I were looking into), but I have yet to confirm whether that occurs here too. The ProbLog problem you gave seems to simply scale terribly, and so does the compilation process.
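As rough intuition for why this step is #P-hard: exact inference reduces to weighted model counting, which naively means summing over all 2^n truth assignments of the n probabilistic facts. The toy sketch below (a hand-coded two-fact program, nothing from the ProbLog API) shows the brute-force version that compilation to SDD/d-DNNF tries to avoid:

```python
from itertools import product

# Brute-force weighted model counting for a toy program:
#   0.5 :: a.   0.5 :: b.   q :- a.   q :- b.
# P(q) is the total weight of every possible world (truth assignment of
# the probabilistic facts) in which q holds. The loop runs 2^n times,
# which is why compilation is used instead -- and why it can still blow up.
probs = {"a": 0.5, "b": 0.5}

def query_holds(world):
    return world["a"] or world["b"]  # q :- a.  q :- b.

total = 0.0
for bits in product([True, False], repeat=len(probs)):
    world = dict(zip(probs, bits))
    weight = 1.0
    for fact, p in probs.items():
        weight *= p if world[fact] else 1.0 - p
    if query_holds(world):
        total += weight
print(total)  # 1 - 0.5 * 0.5 = 0.75
```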

Below are some statistics.

SDD

Excluding nodes 7 and 8 - using SDD without minimization
ground program size: 167
SDD size (all / live / dead): 148 444 / 54 197 / 94 247
SDD node (all / live / dead): 36 935 / 14 431 / 22 504
SDD compilation time: 0.2904s

Excluding nodes 7 and 8 - using SDD with minimization
ground program size: 167
SDD size (all / live / dead): 17 950 / 10 293 / 7 657
SDD node (all / live / dead): 5 819 / 3 205 / 2 614
SDD compilation time: 1.1790s

Excluding node 8 - using SDD without minimization
ground program size: 230
SDD size (all / live / dead): 6 363 661 / 1 519 273 / 4 844 388
SDD node (all / live / dead): 1 139 949 / 265 390 / 874 559
SDD compilation time: 27.3208s

Excluding node 8 - using SDD with minimization
ground program size: 230
SDD size (all / live / dead): 252 681 / 180 991 / 71 690
SDD node (all / live / dead): 57 804 / 36 863 / 20 941
SDD compilation time: 50.3182s

Original problem - using SDD without minimization
ground program size: 303
unable to complete

Original problem - using SDD with minimization
ground program size: 303
unable to complete

dSharp

Excluding nodes 7 and 8
ground program size: 167
ground cycle-free program size: 528
CNF vars / clauses: 528 / 1489
compilation time: crashed at 984s

Excluding node 8
ground program size: 230
ground cycle-free program size: 1588
CNF vars / clauses: 1588 / 4876
compilation time: did not bother letting it continue

Original problem
ground program size: 303
ground cycle-free program size: 4548
CNF vars / clauses: 4548 / 14551
compilation time: did not bother letting it continue

Second Method

The issue is with grounding the program (so not even cycle breaking). Apparently, as is also seen in the profile report, it is spending most of its time in Term's __eq__ function.

Since this is not cycle-breaking related, and Robin knows the most about the grounding, I'll reassign it to him. I will revisit the first method to see whether the ground program contains anything unnecessary, but I have to attend to some other tasks first, so it might take a few days.
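To illustrate why term equality becomes the hotspot here (a plain-Python toy, not ProbLog's actual Term class): the third argument of knows/3 is a growing cons-list, and structural equality on a cons-list of length n must walk all n cells before it can answer, so every memberchk and table lookup gets linearly more expensive as the visited list grows.

```python
# Build a Prolog-style cons list [(a,b) | rest] as nested tuples.
def cons_list(pairs):
    lst = ()  # empty list []
    for p in reversed(pairs):
        lst = (p, lst)
    return lst

# Structural equality with a step counter, mimicking a Term.__eq__ that
# has to recurse through every cell of the list.
def struct_eq(a, b, counter):
    counter[0] += 1
    if a == () or b == ():
        return a == b
    return a[0] == b[0] and struct_eq(a[1], b[1], counter)

pairs = [(i, i + 1) for i in range(100)]
steps = [0]
assert struct_eq(cons_list(pairs), cons_list(pairs), steps)
print(steps[0])  # 101 cell visits just to compare one 100-element list
```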

Code I used to analyse each step

import time
from problog.formula import LogicDAG
from problog.program import PrologFile
from problog.logic import Term
from problog.engine import DefaultEngine
from problog.ddnnf_formula import DDNNF
from problog.cnf_formula import CNF
from problog.sdd_formula import SDD


class Timer(object):
  def __init__(self, msg):
      self.message = msg
      self.start_time = None

  def __enter__(self):
      self.start_time = time.time()

  def __exit__(self, *args):
      print("%s: %.4fs" % (self.message, time.time() - self.start_time))


def main(filename, k):
  model = PrologFile(filename)

  # default label_all=False
  engine = DefaultEngine(label_all=True)

  with Timer("parsing"):
      db = engine.prepare(model)

  print("\n=== Queries ===")
  queries = engine.query(db, Term("query", None))
  print("Queries:", ", ".join([str(q[0]) for q in queries]))

  print("\n=== Evidence ===")
  evidence = engine.query(db, Term("evidence", None, None))
  print("Evidence:", ", ".join(["%s=%s" % ev for ev in evidence]))

  print("\n=== Ground Program ===")
  with Timer("ground"):
      gp = engine.ground_all(db)
  print(gp)

  semiring = None
  if k == 'sdd':
      print("\n=== SDD ===")
      with Timer("SDD"):
          kc = SDD.create_from(gp, engine=engine, semiring=semiring, sdd_auto_gc=False)
          sdd_manager = kc.get_manager().get_manager()
          print(f"SDD size (all / live / dead): {sdd_manager.size()} / {sdd_manager.live_size()} / {sdd_manager.dead_size()}")
          print(f"SDD node (all / live / dead): {sdd_manager.count()} / {sdd_manager.live_count()} / {sdd_manager.dead_count()}")
  else:
      print("\n=== DAG ===")
      with Timer("DAG"):
          dag = LogicDAG.createFrom(gp)
          print(dag)

      print("\n=== DDNNF ===")
      with Timer("DDNNF"):
          cnf = CNF.create_from(dag, engine=engine, semiring=semiring)
          cnf_file_path = "./analysis_file_cnf.out"
          with open(cnf_file_path, "w") as cnf_file:
              cnf_file.write(cnf.to_dimacs())
          print(f"Printed CNF file to {cnf_file_path}")
          kc = DDNNF.create_from(cnf, engine=engine, semiring=semiring)

  print("\n=== Evaluation ===")
  with Timer("Evaluation"):
      result = kc.evaluate(semiring=semiring)
      print(result)


if __name__ == "__main__":
  import argparse

  argparser = argparse.ArgumentParser()
  argparser.add_argument("filename")
  argparser.add_argument("-k")
  args = argparser.parse_args()

  main(**vars(args))

VincentDerk avatar Feb 28 '22 13:02 VincentDerk

Thanks for the update. Let me know if you need anything on my side.

eriq-augustine avatar Feb 28 '22 15:02 eriq-augustine

About method 2: I don't think it's too surprising that it's slow. The grounding process grows exponentially with the number of knows_l1/2 facts. SWI-Prolog slows down on this problem too. You can see this by running time(findall(0, knows(0,1), _)). and commenting out a few of the knows(X,Y) and knows(Y,X) facts (after removing the probability labels, of course).
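A quick back-of-the-envelope sketch of that growth (my own illustration, not a measurement of ProbLog): with a complete knows_l1 graph on n nodes, the visited-list recursion explores at least one derivation per simple path from node 0 to node 1, and that count grows factorially. (The visited list actually records edges rather than nodes, so the true search space is even larger.)

```python
from math import factorial

# Number of simple paths 0 -> 1 in a complete digraph on n nodes:
# pick k of the other n-2 nodes as intermediates and order them,
# i.e. the sum over k of P(n-2, k). This lower-bounds the number of
# derivations the knows/3 recursion enumerates.
def simple_paths(n):
    m = n - 2
    return sum(factorial(m) // factorial(m - k) for k in range(m + 1))

for n in [4, 6, 8, 10]:
    print(n, simple_paths(n))  # 5, 65, 1957, 109601 -- factorial blow-up
```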

Tabling doesn't help ProbLog here, as the 3rd argument keeps changing. It's even possible that tabling adds to the slowdown once the tables grow big.
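The effect is easy to reproduce outside ProbLog; a minimal Python analogue using functools.lru_cache as a stand-in for tabling shows that an ever-changing accumulator argument turns the table into pure overhead: no hits, only growth.

```python
from functools import lru_cache

# Stand-in for a tabled predicate whose last argument is an accumulator:
# every call carries a distinct visited tuple, so the cache never hits.
@lru_cache(maxsize=None)
def reach(node, visited):
    return node  # body irrelevant; we only inspect cache behaviour

for i in range(100):
    reach(0, tuple(range(i)))  # same node, 100 distinct visited lists

info = reach.cache_info()
print(info.hits, info.misses)  # 0 hits, 100 misses: the table only grows
```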

krishnangovindraj avatar Feb 28 '22 18:02 krishnangovindraj

@krishnangovindraj The exponential grounding time (and the extra cost of memoizing the list) makes sense, but do you think that's all there is to it? I ran this for 12 hours at 4 GHz. Do you think I should have just run it for longer, or is there a bug causing an infinite loop?

eriq-augustine avatar Feb 28 '22 18:02 eriq-augustine

I have let an instance of the second method run for a week now and it is still going. It does slowly increase its memory consumption, about 0.25 GB a day.

eriq-augustine avatar Mar 08 '22 17:03 eriq-augustine

I was able to run the first method to completion on a larger machine. It took 27 hours and 148 GB, and the output looks like:

[DEBUG] Output level: DEBUG
[INFO] Propagating evidence: 0.0000s
[DEBUG] Grounding query 'knows(0,1)'
[DEBUG] Ground program size: 303
[DEBUG] Propagated evidence: []
[INFO] Grounding: 0.0596s
[DEBUG] Ground program size: 4548
[INFO] Cycle breaking: 2.6656s
[INFO] Clark's completion: 0.0127s
[INFO] DSharp compilation: 100008.6578s
[INFO] Total time: 100011.4259s
knows(0,1): 1

eriq-augustine avatar Mar 12 '22 02:03 eriq-augustine

I've finished tracking down a bug today. I hope to revisit this one in a few days but it might take a while. Something I did already notice and is perhaps related is the following:

The ground program of

:- use_module(library(lists)).
1/2::throw([1]).
t :- throw(L), member(4, L).
query(t).

should be empty, since member(4, L) will never be true and is hence irrelevant to proving query(t). However, as you can see below, the ground program is not empty; it likely only realised afterwards that the atom was irrelevant. These unnecessary lines (or at least the ones I identified) do disappear when going to the LogicDAG after cycle breaking, so they do not impact the compilation time. I doubt optimizing this will help the grounding time, because you only notice later that something is irrelevant. It does, however, impact memory (and maybe cycle-breaking run time?), so it might be something we can consider looking into more deeply.

=== Ground Program ===
ground: 0.0010s
1: atom(identifier=11, probability=1/2, group=None, name=throw([1]), source=None, is_extra=False)
Queries : 
* t : None [query]

=== DAG ===

Queries : 
* t : None [query]

=== Evaluation ===
{t: 0.0}

VincentDerk avatar Mar 16 '22 00:03 VincentDerk