Consider running transaction function calls in parallel

Open w9 opened this issue 5 years ago • 2 comments

Currently, multiple :db.fn/call operations within a transaction are executed sequentially. Since these functions are required to be pure, they need not block each other: they could be evaluated in parallel, with their results collected in order afterwards.

https://github.com/replikativ/datahike/blob/9e4e619d5f06feedacc9d31ad13da88ac2f409d2/src/datahike/db.cljc#L1468-L1473

w9 avatar Oct 21 '20 18:10 w9

I think this would require a breaking change because the current behaviour, inherited from DataScript, evaluates transaction functions using intermediate "speculative" db values based on the previous operations applied within a transaction. It would therefore be problematic to decide whether a list of transaction function operations could be safely "expanded" (i.e. evaluated) in parallel or not, as successive assertions and queries (within the functions) could easily overlap and influence each other.
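To make the difference concrete, here is a minimal toy model in Python rather than Datahike's actual implementation. The db is a plain dict, a transaction function is a pure function returning a list of `("set", key, value)` operations, and every name (`apply_ops`, `transact_speculative`, `transact_from_snapshot`, `increment`) is hypothetical. It only illustrates why two calls that read and write the same key give different results under the two evaluation strategies.

```python
def apply_ops(db, ops):
    """Return a new db value with the ops applied; the input is not mutated."""
    new_db = dict(db)
    for _, key, value in ops:
        new_db[key] = value
    return new_db


def transact_speculative(db, calls):
    """Serial evaluation: each function sees the intermediate
    ("speculative") db produced by the operations before it."""
    for fn, args in calls:
        db = apply_ops(db, fn(db, *args))
    return db


def transact_from_snapshot(db, calls):
    """Snapshot evaluation: every function sees only the db as of the start
    of the transaction, so the expansions are independent of each other."""
    expanded = [fn(db, *args) for fn, args in calls]  # could run in parallel
    for ops in expanded:
        db = apply_ops(db, ops)
    return db


def increment(db, key):
    """Hypothetical pure transaction function: read a value, assert value + 1."""
    return [("set", key, db.get(key, 0) + 1)]


start = {"counter": 0}
calls = [(increment, ("counter",)), (increment, ("counter",))]

speculative_result = transact_speculative(start, calls)  # {"counter": 2}
snapshot_result = transact_from_snapshot(start, calls)   # {"counter": 1}
```

Under serial evaluation the second `increment` observes the first one's write; under snapshot evaluation both read `0`, so switching strategies silently changes the outcome of existing transactions.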

By contrast, Datomic transaction functions are only passed the db value from the start of the transaction, regardless of other operations that may appear before a given transaction function invocation operation, and this is ~trivial to parallelize - that is my understanding anyway, based on:

"The transaction processor will lookup the function in its :db/fn attribute, and then invoke it, passing the value of the db (currently, as of the beginning of the transaction)" (source: https://docs.datomic.com/on-prem/reference/database-functions.html#processing-transaction-functions)

I have never confirmed the Datomic behaviour first-hand though, perhaps someone can check / correct me :)

FWIW Crux also implements the same speculative/serial behaviour as DataScript, since https://github.com/juxt/crux/pull/933

refset avatar Mar 03 '21 11:03 refset

@refset @w9 I would also stick to serializability as the default (same as DataScript) and only opt out with the user's permission, via fine-grained concurrency/consistency controls. Having said this, we could introduce a new :db.fn/commutative-call. @w9, would something like this work for you? We could then filter those calls out and apply them in parallel on the initial value of the DB (like Datomic).
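A rough sketch of how that filtering could look, again as a toy Python model and not Datahike code. The db is a dict, operations are `("set", key, value)` tuples, and `run_transaction`, `increment` and `total` are hypothetical names. One assumption that the proposal does not specify: the results of the commutative calls are applied after the serial ones.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_ops(db, ops):
    """Return a new db value with the ops applied; the input is not mutated."""
    new_db = dict(db)
    for _, key, value in ops:
        new_db[key] = value
    return new_db


def run_transaction(db, tx):
    """tx is a list of (kind, fn, args), kind being "call" or "commutative-call"."""
    initial = db
    commutative = [(fn, args) for kind, fn, args in tx if kind == "commutative-call"]
    serial = [(fn, args) for kind, fn, args in tx if kind == "call"]

    with ThreadPoolExecutor() as pool:
        # Commutative calls are all expanded against the initial db value,
        # so they can be submitted to the pool without waiting on each other.
        futures = [pool.submit(fn, initial, *args) for fn, args in commutative]
        # Ordinary calls keep the serial, speculative behaviour.
        for fn, args in serial:
            db = apply_ops(db, fn(db, *args))
        # Assumption: commutative results are merged after the serial ones.
        for future in futures:
            db = apply_ops(db, future.result())
    return db


def increment(db, key):
    return [("set", key, db.get(key, 0) + 1)]


def total(db, out_key, *keys):
    return [("set", out_key, sum(db.get(k, 0) for k in keys))]


start = {"a": 1, "b": 2}
result = run_transaction(start, [
    ("call", increment, ("a",)),
    ("call", increment, ("a",)),
    ("commutative-call", total, ("sum", "a", "b")),
])
```

Here `total` reads `a` as of the start of the transaction, so `result["sum"]` is `3` even though the two serial increments leave `a` at `3`. Whether that is acceptable is exactly what the user would be opting into by marking a call as commutative.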

whilo avatar Mar 03 '21 17:03 whilo