deepdive
Document assignment in ddlog
I noticed today that I had at least 5 duplicates of my sentences input in the database, which was slowing me down, but that's not the point.
I went to search for documentation on all of the different ddlog assignment-like symbols, and didn't know where to start. There seems to be =, +=, :- and maybe others? It doesn't seem particularly appropriate to use += for an input, but I don't understand the semantics of this symbol, so I'm not sure that changing it to '=' is appropriate. Having a page to clarify such things would be helpful.
@syadlowsky Those are great points. The syntax as is can be quite confusing and some cleanup is in order... and yes, better documentation. For now, you can see the distinctions in this section: http://deepdive.stanford.edu/example-spouse#2-distant-supervision-with-data-and-rules
In short:
- `=` is for supervision (assigning TRUE / FALSE / NULL to a tuple), not data flow; e.g., `is_spouse(p1, p2) = TRUE :- golden_marriage(p1, p2)`
- `+=` populates a relation with a UDF; e.g., `is_spouse_cands += extract_cands(sid, text) :- sentences(sid, text)`
- `:-` delimits the rule's head from body, as in the above examples. Besides the above complex heads, the rule can also be simply `is_spouse_cands(p1, p2) :- all_person_pairs(p1, p2)`, which is just a (optionally materialized) view.
All of those cases are also mentioned in the above link.
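For reference, here is a rough sketch of how these forms might sit together in one DDlog file. The rules are the ones quoted above; the schema declarations, column names, and the UDF script path are assumptions added only to make the fragment self-contained, so check them against your own app before relying on it.

```ddlog
# Schema declarations (column names and types are assumed for illustration).
sentences(sentence_id text, sentence_text text).
golden_marriage(p1 text, p2 text).
is_spouse_cands(p1 text, p2 text).

# A trailing "?" declares a variable relation, the kind of head that "=" labels.
is_spouse?(p1 text, p2 text).

# Hypothetical UDF declaration; the script path is a placeholder.
function extract_cands over (sentence_id text, sentence_text text)
    returns rows like is_spouse_cands
    implementation "udf/extract_cands.py" handles tsv lines.

# "+=": populate is_spouse_cands with the rows the UDF returns.
is_spouse_cands += extract_cands(sid, text) :- sentences(sid, text).

# "=": supervision, labeling tuples of the variable relation as TRUE.
is_spouse(p1, p2) = TRUE :- golden_marriage(p1, p2).

# Plain head: defines the relation as a view over the body. Shown commented
# out because it is an alternative to the "+=" rule for the same relation.
# is_spouse_cands(p1, p2) :- all_person_pairs(p1, p2).
```

Every rule uses `:-` to separate head from body; what differs is only the shape of the head.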