Bulk eviction performance

Open refset opened this issue 4 years ago • 1 comments

Using multiple eviction operations in a single transaction does not currently scale well:

1 eviction "Elapsed time: 43.803538 msecs"
100 evictions "Elapsed time: 301.938175 msecs"
1,000 evictions "Elapsed time: 9686.182016 msecs"
10,000 evictions "Elapsed time: 1094110.583925 msecs"
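To make the scaling visible, here is a small sketch that divides each elapsed time by the number of evictions. The `timings` vector and the `msecs-per-eviction` helper are names introduced only for this illustration; the numbers are the ones measured above.

```clojure
;; Timings copied from the measurements above, as [evictions elapsed-msecs].
(def timings
  [[1 43.803538]
   [100 301.938175]
   [1000 9686.182016]
   [10000 1094110.583925]])

(defn msecs-per-eviction
  "Average elapsed milliseconds per eviction for one [evictions elapsed-msecs] pair.
  Hypothetical helper, used only for this illustration."
  [[n elapsed]]
  (/ elapsed n))

;; Print the average cost per eviction for each measurement.
(doseq [[n _ :as row] timings]
  (println n "evictions:" (msecs-per-eviction row) "msecs each"))
```

The single-eviction figure is dominated by fixed transaction overhead, but from 100 evictions onwards the average cost per eviction rises from about 3 msecs to about 9.7 msecs and then to about 109 msecs. Going from 1,000 to 10,000 evictions multiplies the total time by roughly 113, which is about what quadratic growth would predict.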

This slowdown would presumably be much worse with a larger index (I am using the "TMDB" data set, which is initially 85MB in the Rocks index-store).

This is what the profiler shows during that final run: [image: profiler flamegraph]

I initially expected to see that this performance was simply due to intensive IO scanning against Rocks (which is inherent to the current design), but having seen that flamegraph I believe this is an orthogonal and solvable issue.

This is what I was using to test:

;; assumes the Crux API namespace is aliased as c, e.g. (require '[crux.api :as c])
(time (c/await-tx mynode
                  (c/submit-tx mynode
                               (for [[id] (c/q (c/db mynode)
                                               '{:find [e] :where [[e :crux.db/id]] :limit 10000})]
                                 [:crux.tx/evict id]))))
;; to create the node see https://github.com/crux-labs/reclojure-workshop/blob/master/src/myproject/core.clj#L7

refset avatar May 06 '21 15:05 refset

I believe the issue here is that we're "storing" nil values for each of the entries we're unindexing: https://github.com/juxt/crux/blob/63e99523b73478490618c99760f3e28e2e1acf33/crux-core/src/crux/kv/index_store.clj#L1045 We then almost immediately filter out all those nil values, across the entire mutable kv store, for every seek (multiple times per op): https://github.com/juxt/crux/blob/6d602bb5b6caed199f10fd8c3711cb034d49248a/crux-core/src/crux/kv/mutable_kv.clj#L12
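As a rough illustration of why this gets expensive, here is a toy model of the behaviour described above. It is not the actual Crux code: a plain `sorted-map` stands in for the mutable kv store, and `evict` and `seek` are hypothetical stand-ins that also count how many entries each seek has to inspect.

```clojure
;; Toy model only: a sorted-map plays the role of the mutable kv store.
(def store
  (into (sorted-map) (map (fn [i] [i :live]) (range 1000))))

(defn evict
  "Marks key k as evicted by storing nil under it (hypothetical stand-in)."
  [store k]
  (assoc store k nil))

(defn seek
  "Returns [entry inspected] where entry is the first non-nil entry at or
  after key k (or nil if there is none) and inspected is the number of
  entries looked at, including skipped nil tombstones."
  [store k]
  (loop [entries (subseq store >= k)
         inspected 0]
    (if-let [[_ v :as entry] (first entries)]
      (if (nil? v)
        (recur (rest entries) (inc inspected))
        [entry (inc inspected)])
      [nil inspected])))

;; Evict every key except the last one, as a bulk eviction would.
(def after-evictions (reduce evict store (range 999)))

(second (seek store 0))           ; inspected count before any eviction
(second (seek after-evictions 0)) ; inspected count after 999 evictions
```

In this model a seek from the start inspects 1 entry before any eviction and 1,000 entries after 999 evictions. If every eviction triggers several such seeks, the total work grows roughly with the square of the number of evictions in the transaction.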

One idea is that we could use something other than nil as the eviction-marker sentinel value when dealing with the mutable kv store (so that the seeks return fast once again, instead of filtering through ~endless nils), and only convert them into actual nil values during commit-index-tx.
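One possible reading of this idea, sketched on a toy `sorted-map` rather than the real mutable kv store: `evicted`, `evict`, `seek` and `commit` are all hypothetical names for this illustration, with `commit` standing in for whatever commit-index-tx would do. Whether callers of seek can tolerate seeing the marker is not addressed here and would need checking against the real code.

```clojure
;; Hypothetical sentinel used instead of nil while the transaction is in flight.
(def evicted ::evicted)

(def store
  (into (sorted-map) (map (fn [i] [i :live]) (range 1000))))

(defn evict
  "Marks key k as evicted with the sentinel rather than nil."
  [store k]
  (assoc store k evicted))

(defn seek
  "Returns the first entry at or after key k without filtering anything,
  so the cost no longer depends on how many entries were evicted."
  [store k]
  (first (subseq store >= k)))

(defn commit
  "Converts sentinel values into actual nils, once, at commit time."
  [store]
  (reduce-kv (fn [m k v] (assoc m k (when-not (= v evicted) v)))
             (sorted-map)
             store))

(def after-evictions (reduce evict store (range 999)))

(seek after-evictions 0)       ; the first entry, carrying the sentinel
(get (commit after-evictions) 0) ; nil after the commit-time conversion
```

The trade-off in this sketch is that the conversion to nil happens in a single pass at commit, rather than nil filtering being repeated on every seek during the transaction.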

refset avatar Aug 14 '21 12:08 refset

Observed in the wild with LMDB: https://juxt-oss.zulipchat.com/#narrow/stream/194466-xtdb-users/topic/workflow.20question.2C.20bad.20transaction.2C.20how.20to.20mitigate.20.28dev.29

refset avatar Nov 11 '22 21:11 refset