Draft: Decrease query latency caused by puts into query history

Open YaroslavLitvinov opened this issue 3 months ago • 3 comments

Issue

Adding asynchronous query execution introduced one more put to the query history. Previously we only saved the QueryRecord containing the result, but now we also save the intermediate QueryStatus::Runing. Saving the intermediate query status lets our WebUI show information about running queries in a natural way, simply by fetching the history. However, this approach significantly increases query execution latency.

Proposal

Add a caching layer to SlateDBHistoryStore, so that:

  • When a QueryRecord is put, it is added to the cache, an async task is spawned to persist it, and the call returns without waiting for the put to complete;
  • So saving items to the history involves no waiting;
  • The cache should work transparently, so the HistoryStore interface behaves as if it accessed items as usual. Cached data is merged properly with the real data loaded from slatedb;
  • The saver task also uses ExecutionService::wait_historical_query_result to be notified when to remove an item from the cache.
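
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual SlateDBHistoryStore: `SlowStore`, `CachedHistoryStore`, and the two-field `QueryRecord` are hypothetical stand-ins, plain threads replace the async task, and the cache entry is evicted as soon as the put completes instead of being driven by ExecutionService::wait_historical_query_result.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

// Hypothetical, simplified record; the real QueryRecord has more fields.
#[derive(Clone, Debug, PartialEq)]
struct QueryRecord {
    id: u64,
    status: String,
}

// Hypothetical stand-in for the slow durable store (slatedb in this issue).
#[derive(Default)]
struct SlowStore {
    data: Mutex<HashMap<u64, QueryRecord>>,
}

impl SlowStore {
    fn put(&self, rec: QueryRecord) {
        // Simulated object-store latency.
        thread::sleep(Duration::from_millis(50));
        self.data.lock().unwrap().insert(rec.id, rec);
    }

    fn get(&self, id: u64) -> Option<QueryRecord> {
        self.data.lock().unwrap().get(&id).cloned()
    }
}

// Write-behind wrapper: `put` returns as soon as the record is cached.
#[derive(Default)]
struct CachedHistoryStore {
    cache: Arc<Mutex<HashMap<u64, QueryRecord>>>,
    store: Arc<SlowStore>,
}

impl CachedHistoryStore {
    // Caches the record and persists it in the background.
    // The handle is returned only so callers (and tests) can wait if they want to.
    fn put(&self, rec: QueryRecord) -> JoinHandle<()> {
        self.cache.lock().unwrap().insert(rec.id, rec.clone());
        let cache = Arc::clone(&self.cache);
        let store = Arc::clone(&self.store);
        thread::spawn(move || {
            store.put(rec.clone());
            // Evict only if the cache still holds the version just persisted,
            // so a newer status written in the meantime is not dropped.
            let mut cache = cache.lock().unwrap();
            if cache.get(&rec.id) == Some(&rec) {
                cache.remove(&rec.id);
            }
        })
    }

    // Reads see cached (not yet persisted) records first, then the store.
    fn get(&self, id: u64) -> Option<QueryRecord> {
        let cached = self.cache.lock().unwrap().get(&id).cloned();
        cached.or_else(|| self.store.get(id))
    }
}
```

Calling `put` and then `get` for the same id returns the record immediately, whether or not the background write has finished. The sketch does not order background writes for the same id, so a slow put of an older status could land after a newer one; a single saver task draining a queue would avoid that.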

Thoughts

Implementing this would require some effort, so this may not be the best time to take it on.

YaroslavLitvinov avatar Sep 24 '25 21:09 YaroslavLitvinov

You can also specify the write durability on a per-put basis in SlateDB, meaning that SlateDB will add the value to its cache and acknowledge the put right away without waiting for the object_store. This is essentially what you're suggesting, except that it reuses the SlateDB cache.

JanKaul avatar Sep 25 '25 06:09 JanKaul

I was also thinking about persisting query details asynchronously:

  • The Query struct is created and execution begins
  • A "query started" event is sent to the channel
  • The receiver of these events is the one that persists query updates to slatedb
  • Once the query ends (fails or completes), it sends another event: "query failed" / "query finished"
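
A minimal sketch of this event flow, assuming a std channel and a plain in-memory map in place of slatedb. The names `QueryEvent`, `spawn_history_writer`, and `run_query` are hypothetical and exist only for illustration:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical event type; the variants mirror the events listed above.
#[derive(Debug)]
enum QueryEvent {
    Started { id: u64 },
    Finished { id: u64 },
    Failed { id: u64, error: String },
}

// In-memory stand-in for the query history kept in slatedb.
type History = HashMap<u64, String>;

// Spawns the single receiver that persists every update.
// The thread returns the final history once all senders are dropped.
fn spawn_history_writer() -> (Sender<QueryEvent>, JoinHandle<History>) {
    let (tx, rx) = mpsc::channel::<QueryEvent>();
    let handle = thread::spawn(move || {
        let mut history = History::new();
        // The loop ends when every Sender has been dropped.
        for event in rx {
            match event {
                QueryEvent::Started { id } => {
                    history.insert(id, "running".to_string());
                }
                QueryEvent::Finished { id } => {
                    history.insert(id, "finished".to_string());
                }
                QueryEvent::Failed { id, error } => {
                    history.insert(id, format!("failed: {error}"));
                }
            }
        }
        history
    });
    (tx, handle)
}

// The query path only sends events; it never waits for persistence.
fn run_query(id: u64, should_fail: bool, events: &Sender<QueryEvent>) {
    events.send(QueryEvent::Started { id }).unwrap();
    // ... actual query execution would happen here ...
    let outcome = if should_fail {
        QueryEvent::Failed { id, error: "boom".to_string() }
    } else {
        QueryEvent::Finished { id }
    };
    events.send(outcome).unwrap();
}
```

Because a single receiver applies events in the order they were sent, the status updates for one query cannot overtake each other, and the execution path pays only the cost of a channel send.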

rampage644 avatar Sep 26 '25 01:09 rampage644

You can also specify the write durability on a per-put basis in SlateDB, meaning that SlateDB will add the value to its cache and acknowledge the put right away without waiting for the object_store. This is essentially what you're suggesting, except that it reuses the SlateDB cache.

Yes, I support this approach; it is easy to apply to the query-history puts.

YaroslavLitvinov avatar Sep 26 '25 11:09 YaroslavLitvinov