datascript icon indicating copy to clipboard operation
datascript copied to clipboard

rules aren't correctly unifying

Open xificurC opened this issue 5 years ago • 5 comments

The problem: walk references until finding the root reference.

This example returns {:id :a :root 1} in the results as expected for datomic, but fails in datascript:

(let [c (db/empty-db {:id {:db/unique :db.unique/identity}})]
    (db/q '[:find (pull ?root [*]) :in $ ?id % :where [?e :id ?id] (root ?id ?root)]
          (db/db-with c [{:id :a :root 1} {:id :b :parent [:id :a]} {:id :c :parent [:id :b]} {:id :d :root 2}])
          :b
          '[[(root ?e ?e) (not [?e :parent _])]
            [(root ?e ?r) [?e :parent ?p] (root ?p ?r)]]))
;; => ExceptionInfo Insufficient bindings: none of #{?root} is bound in (not [?root :parent _])  clojure.core/ex-info (core.clj:4739)

This example throws an ArrayIndexOutOfBoundsException in datomic :man_shrugging: while in datascript returns too many results (however did it reach {:id :d}?):

(let [c (db/empty-db {:id {:db/unique :db.unique/identity}})]
    (db/q '[:find (pull ?root [*]) :in $ ?id % :where [?e :id ?id] (root ?id ?root)]
          (db/db-with c [{:id :a :root 1} {:id :b :parent [:id :a]} {:id :c :parent [:id :b]} {:id :d :root 2}])
          :b
          '[[(root ?e ?e) [?e :root _]]
            [(root ?e ?r) [?e :parent ?p] (root ?p ?r)]]))
;; => ([{:db/id 4, :id :d, :root 2}] [{:db/id 1, :id :a, :root 1}])

xificurC avatar Jan 31 '20 17:01 xificurC

Hi!

To do this you need two rules: one finds all parents, the other only leaves the ones that are roots (have no :parent themselves).

(db/q '[:find (pull ?root [*])
        :in $ ?id %
        :where [?e :id ?id]
               (root ?e ?root)]
  (-> (db/empty-db {:id {:db/unique :db.unique/identity}})
    (db/db-with  [{:id :a :root 1}
                  {:id :b :parent [:id :a]}
                  {:id :c :parent [:id :b]}
                  {:id :d :root 2}]))
  :b
  '[[(parent ?e ?p)
     [?e :parent ?p]]
    [(parent ?e ?p)
     [?e :parent ?x]
     (parent ?x ?p)]
    [(root ?e ?r)
     (parent ?e ?r)
     (not [?r :parent])]])

;; ([{:db/id 1, :id :a, :root 1}])

tonsky avatar Feb 01 '20 20:02 tonsky

Thank you! I can use that in my application.

I still find the first 2 results unsatisfying and unsettling respectively:

  • the first version works in datomic. The not query is only ever used with the variable bound. It might be non-trivial to prove statically, I agree. Ideally it should be accepted though, from a user's perspective

  • the second query just seems to return the wrong result. It might just be me walking the expression wrong, would you care to eborate on it?

xificurC avatar Feb 01 '20 21:02 xificurC

Right now self-unification in rule signature is not supported in Datascript. Maybe it should, I don’t know. I’m not sure it’s supported by Datomic even.

(root ?e ?e)

Then body is confusing too

(not [?e :parent _])

It means “Add all roots to the result set”. Which is not what you want: you want all parents INTERSECTED with all roots. This will UNIONIZE them.

And then you pass ?id to the rule where you should’ve passed ?e. It’s very strange it worked at all.

tonsky avatar Feb 02 '20 20:02 tonsky

Right now self-unification in rule signature is not supported

Yes, that's what I was going for. Good to know, thanks. I couldn't figure out how to keep only the root and later realized unifying the rule arguments should do that. Your example with 2 rules works for this case, thanks.

(not [?e :parent _])

It means “Add all roots to the result set”.

IIUC that would happen only if ?e was a free variable (is that the correct term?) which is not possible as demonstrated by the "insufficient bindings" exception. The datomic docs say:

The results of the subquery are removed from the enclosing query via set difference.

So the not clause should only ever remove results, never add. Is my understanding once again incorrect?

And then you pass ?id to the rule where you should’ve passed ?e. It’s very strange it worked at all.

Strange indeed, that was a typo.

I'm thinking what is the correct step now with this issue? Both of my examples relied on unifying the rule arguments and if that's not supported maybe that should be checked and throw?

xificurC avatar Feb 03 '20 10:02 xificurC

So the not clause should only ever remove results, never add. Is my understanding once again incorrect?

It’s not about not, it’s about rule having two branches. Results from individual branches are grouped as with OR, not as with AND (in SQL terms).

I'm thinking what is the correct step now with this issue? Both of my examples relied on unifying the rule arguments and if that's not supported maybe that should be checked and throw?

In the shortest future, that would be a good warning sign, yes. Propert solution would be to implement self-unification, I guess.

tonsky avatar Feb 03 '20 14:02 tonsky