
sort-by with db/-count breaks queries on empty database before transact

Open andersmurphy opened this issue 1 year ago • 1 comments

UPDATE: THIS PR DOES NOT FIX THE ISSUE

What this PR does

~~Reverts the sort (short term fix). Feel free to merge it or~~ close it when you work out the caching issue. I tried to delve into the cache stuff but that is going to take me a fair bit longer to get to grips with.

What the issue was

So I’ve encountered this weird issue where if I run a query on an empty database before running a transaction it makes subsequent queries return nil for that attribute.

This should work, but doesn’t.

(comment
  (def schema
    {:transaction/signature
     {:db/unique      :db.unique/identity
      :db/valueType   :db.type/string
      :db/cardinality :db.cardinality/one}
     :transaction/block-time
     {:db/valueType   :db.type/long
      :db/cardinality :db.cardinality/one}})
  
  (def conn
    (d/get-conn "db1" schema
      {:validate-data?    true
       :closed-schema?    true
       :auto-entity-time? true}))

  (d/q '[:find [?block-time ?signature]
         :where
         [?t :transaction/signature ?signature]
         [?t :transaction/block-time ?block-time]]
    @conn)

  (d/transact! conn [{:transaction/signature  "foo"
                      :transaction/block-time 234324324}])

  (d/q '[:find [(max ?bt)]
         :where
         [?t :transaction/block-time ?bt]]
    @conn)
  ;; => nil
  )

This does work.

(comment
  (def schema
    {:transaction/signature
     {:db/unique      :db.unique/identity
      :db/valueType   :db.type/string
      :db/cardinality :db.cardinality/one}
     :transaction/block-time
     {:db/valueType   :db.type/long
      :db/cardinality :db.cardinality/one}})
  
  (def conn
    (d/get-conn "db2" schema
      {:validate-data?    true
       :closed-schema?    true
       :auto-entity-time? true}))
  
  (d/transact! conn [{:transaction/signature  "foo"
                      :transaction/block-time 234324324}])

  (d/q '[:find [?block-time ?signature]
         :where
         [?t :transaction/signature ?signature]
         [?t :transaction/block-time ?block-time]]
    @conn)
  ;; => [234324324 "foo"]

  (d/q '[:find [(max ?bt)]
         :where
         [?t :transaction/block-time ?bt]]
    @conn)
  ;; => [234324324]

  )

So it’s this commit that introduces the issue:

https://github.com/juji-io/datalevin/commit/720a7d79b8800e58fb6136c76698b102ec77afdc

I've also tried reverting this commit on the latest master and it fixes the issue.

So this seems to be caused by this line:

https://github.com/juji-io/datalevin/blob/master/src/datalevin/query.clj#L1468

sort-by (fn [[_ attr _]] (db/-count db [nil attr nil]))

My guess is that db/-count is stateful and is now called in both the reduce and the sort, whereas before it was only called in the reduce. So most likely a caching problem.
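
A minimal sketch of the kind of failure this guess describes, assuming a count cache that is filled on first read and never invalidated by writes. Every name here (`store`, `count-cache`, `cached-count`, `add-value!`, `query`) is hypothetical and none of it is taken from datalevin's source; it only shows how a count cached against an empty database could make a later query come back empty.

```clojure
;; Hypothetical illustration only: none of these names exist in datalevin.
(def store (atom {}))        ; attr -> vector of values, stands in for the DB
(def count-cache (atom {}))  ; attr -> cached count, never invalidated (the assumed bug)

(defn cached-count
  "Returns the number of values for attr, caching the first answer forever."
  [attr]
  (if-let [hit (find @count-cache attr)]
    (val hit)
    (let [n (count (get @store attr []))]
      (swap! count-cache assoc attr n)
      n)))

(defn add-value!
  "Writes to the store but does not touch count-cache."
  [attr v]
  (swap! store update attr (fnil conj []) v))

(defn query
  "Skips the scan when the (possibly stale) count says the attribute is empty."
  [attr]
  (when (pos? (cached-count attr))
    (get @store attr)))

(query :transaction/block-time)                ; caches a count of 0 on the empty store
(add-value! :transaction/block-time 234324324)
(query :transaction/block-time)                ; stale count of 0 short-circuits the scan
;; => nil
```

If something like this is what is happening, invalidating the cached count on transact (or keying it by database state) would be the place to look, but that is an assumption rather than a confirmed diagnosis.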

andersmurphy avatar Aug 24 '24 14:08 andersmurphy

Further testing. This DOES NOT fix the issue.

So I've run into a variation of this issue with db.type/string that the above change does not fix.

Seems to have been introduced in 890dc80fc650f3b6145e304259721c701709073b

So I wonder if there's something more complicated going on here.

andersmurphy avatar Aug 27 '24 12:08 andersmurphy

Thanks. I will investigate more.

huahaiy avatar Sep 04 '24 23:09 huahaiy

I ran into much the same problem. Maybe this information will help with the investigation:

Running this code:

(d/transact! (app/db-conn) [[:db/add -1 :srv/user-name "admin"]
                            [:db/add -1 :srv/id 777]])
(let [db (d/db (app/db-conn))]
  (clojure.pprint/pprint ["DATOMS:" (d/datoms db :eav 37)])
  (println "Q1:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where [?e :srv/user-name]]
                        db))
  (println "Q2:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where [?e :srv/id]]
                        db))
  (println "Q3:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where (or [?e :srv/id])]
                        db)))

outputs, as expected:

│ ["DATOMS:" ([37 :srv/user-name "admin"] [37 :srv/id 777])]
│ Q1:    [{:db/id 37, :srv/user-name admin, :srv/id 777}]
│ Q2:    [{:db/id 37, :srv/user-name admin, :srv/id 777}]
│ Q3:    [{:db/id 37, :srv/user-name admin, :srv/id 777}]

But if we add a query Q0, identical to Q2, before the transaction, like this:

(println "Q0:\t" (d/q '[:find [(pull ?e [*]) ...]
                        :where [?e :srv/id]]
                      (d/db (app/db-conn))))
(d/transact! (app/db-conn) [[:db/add -1 :srv/user-name "admin"]
                            [:db/add -1 :srv/id 777]])
(let [db (d/db (app/db-conn))]
  (clojure.pprint/pprint ["DATOMS:" (d/datoms db :eav 37)])
  (println "Q1:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where [?e :srv/user-name]]
                        db))
  (println "Q2:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where [?e :srv/id]]
                        db))
  (println "Q3:\t" (d/q '[:find [(pull ?e [*]) ...]
                          :where (or [?e :srv/id])]
                        db)))

we expect Q0 to return an empty collection, and it does, but it also interferes with the output of Q2:

│ Q0:    []
│ ["DATOMS:" ([37 :srv/user-name "admin"] [37 :srv/id 777])]
│ Q1:    [{:db/id 37, :srv/user-name admin, :srv/id 777}]
│ Q2:    []
│ Q3:    [{:db/id 37, :srv/user-name admin, :srv/id 777}]

As we can see, this behaviour only affects the exact same query. Even wrapping the initial clause in a redundant (or ...), as in Q3, avoids the problem. That's why I suspect that something is wrong with the caching.

aldebogdanov avatar Sep 14 '24 17:09 aldebogdanov

Having the same issue. Querying for a specific attribute value works, but querying with a placeholder doesn't. Excerpt from my code:

(d/q '[:find ?u . :where [?u :garden.user/name _]] @db/conn)
; => nil
(d/q '[:find ?u . :where [?u :garden.user/name "johnny"]] @db/conn)
; => 2

I can also confirm that reverting to version 0.9.8 fixed it for me.

JohnnyJayJay avatar Sep 29 '24 10:09 JohnnyJayJay