[Bug]: datahike.migrate has a problem with schema/double (which cbor converts to float)

Open awb99 opened this issue 2 years ago • 2 comments

What version of Datahike are you using?

0.6

What version of Java are you using?

17

What operating system are you using?

guix

What database EDN configuration are you using?

irrelevant to this ticket

Describe the bug

I started my datahike database on version 0.4 with schema-on-write, and I used ONLY double in the schema definition. I now had to migrate the 0.4 edn dumps so that they can be imported into 0.6, which uses cbor dumps. Previously, I exported edn dumps from 0.4 and imported them on another machine without a problem. With 0.6, I noticed that after reading the edn dumps via the edn parser I had to convert all float values to double values before the import would work. Interestingly, this was not a problem before.

What is the expected behaviour?

Document in the readme that :double has a different meaning in 0.4 than in 0.6.

How can the behaviour be reproduced?

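(defn float->double
  "Coerces floating-point values to double and returns anything else unchanged.
  Note that float? is true for both Float and Double."
  [v]
  (if (float? v)
    (double v)
    v))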

;; load-cbor, stats, datom-schema?, assoc-schema-ids and print-tx-stats are
;; helpers from my own project; they are not part of datahike.
(defn db-migrate []
  (warn "migrating cbor db..")
  (let [txs-old (load-cbor crbdb/conn "data/datahike-dump/eavt-dump")
        s (stats txs-old)
        max-eid (:max-eid s)
        max-tx (:max-tx s)
        ;; schema datoms are removed here and transacted separately below
        tx-old-no-schema (remove datom-schema? txs-old)
        ;; workaround: coerce float values to double before the import
        tx-old-safe (map #(update % :v float->double)
                         tx-old-no-schema)
        schema-with-eids (assoc-schema-ids max-eid schema)]
    (warn "dump stats: " s)
    (warn "transacting schema with eid above: " max-eid)
    (let [result-schema (crbdb/transact schema-with-eids)]
      (print-tx-stats result-schema))
    (let [result-import (api/transact crbdb/conn (vec tx-old-safe))]
      (warn "import dump result:")
      (print-tx-stats result-import))
    ;(assoc crbdb/conn :max-tx max-tx)
    (warn "db migration finished!")))
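
The value coercion used in the migration above can be tried on its own at a REPL. This is only a minimal sketch, assuming Clojure 1.9 or later (for `double?`). The `sample-datoms` maps are hypothetical stand-ins for the entries loaded from the dump; only the `:v` key matters here.

```clojure
(defn float->double
  "Widens floating-point values to java.lang.Double; other values pass through."
  [v]
  (if (float? v)
    (double v)
    v))

;; Hypothetical datom maps (not taken from a real dump).
(def sample-datoms
  [{:e 1 :a :lineitem/price :v (float 0.0)}   ; a 32-bit float, as seen after import
   {:e 2 :a :lineitem/name  :v "widget"}      ; non-numeric values are left alone
   {:e 3 :a :lineitem/price :v 12.5}])        ; already a double

(def safe-datoms
  (mapv #(update % :v float->double) sample-datoms))

;; Inspect the resulting value classes.
(mapv (comp class :v) safe-datoms)
```

Widening does not restore precision: if a value really was narrowed to a 32-bit float at some point, `(double (float 0.1))` yields `0.10000000149011612`, not `0.1`. So this is a workaround for the schema check rather than a lossless round trip.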

awb99 • Jul 26 '23 04:07

After more testing, this is a bug in the cbor persistence layer. Just calling datahike.migrate/export-db followed by datahike.migrate/import-db results in import errors for doubles, which cbor seems to store as floats:

[datahike.db.transaction:45] - Bad entity value 0.0 at [:db/add 2199133 :lineitem/price 0.0 536871102] , value does not match schema definition. Must be conform to: double? {:error :transact/schema, :value 0.0, :attribute :lineitem/price, :schema #:db{:valueType :db.type/double, :cardinality :db.cardinality/one, :ident :lineitem/price}}
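
The error text suggests that the value is checked with the `double?` predicate. Assuming that is the case (this is inferred from the message, not from the Datahike source), the sketch below shows how a value that prints as `0.0` can still fail the check when it is a `java.lang.Float`. The name `from-dump` is a hypothetical stand-in for a value decoded from the dump.

```clojure
;; Stand-in for a value that came back from the dump as a 32-bit float.
(def from-dump (float 0.0))

(def printed        (pr-str from-dump))           ; prints like a double zero
(def passes-double? (double? from-dump))          ; false: a Float is not a Double
(def passes-float?  (float? from-dump))           ; true: float? accepts Float and Double
(def widened-ok?    (double? (double from-dump))) ; true once widened to Double

[printed passes-double? passes-float? widened-ok?]
```

This would explain why the log shows a perfectly ordinary looking `0.0` for an attribute declared as `:db.type/double`.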

awb99 • Jul 27 '23 19:07

Hey @awb99, thanks for reporting this issue, and great find! We value contributions and are happy to help with it. If you find some time to fix this, please don't hesitate. We are all pretty busy lately, and I really hope this issue can be resolved soon.

TimoKramer • Jul 28 '23 07:07