
Timeline data management RFC

Open SomeoneToIgnore opened this issue 2 years ago • 4 comments

Rendered

SomeoneToIgnore avatar Jul 25 '22 12:07 SomeoneToIgnore

Great ideas, thank you for adding those.

  1. I lack good knowledge of gc, compaction, and page materialization, so I'm not sure I can suggest the right order myself. Yet I fully agree that something (task_manager, as I've called it in the timeline data proposal diagram) has to manage tasks, reordering and batching them to avoid extra locking, as is done now with e.g. layer_flush_lock.

Since I can't speak to the details of that topic confidently, I've concentrated on a good isolation proposal, since that is not the case currently.

  1. Yes, I also believe that migrating to async is the way to go in our case, at least in all high-level parts, with multiple runtimes and maybe even thread pools for page requests and CPU-intensive low-level tasks (I mentioned this in the same task_manager brief description on the diagram 🙂 ).

I think we'll become mostly async anyway if we follow the current data separation RFC and the on-demand download RFC: we don't have a good way to cancel tasks now, and cancellation would be needed there.

I have no particular opinion on which operations should be extracted into their own CPU-bound runtime though, since that is expected to change after the locks are rearranged.

SomeoneToIgnore avatar Jul 26 '22 17:07 SomeoneToIgnore

Maybe it is time to return to the all-async model?

Personally, I'm totally on board with this decision. But I don't think that we can agree on that (at least for now).

To me the pageserver is highly IO-oriented; we don't do a lot of CPU-heavy work, so IMO async is a natural fit for this case. Even if we don't go with io_uring yet, we can use tokio's built-in thread pool to schedule multiple disk operations at a time, e.g. to prefetch pages for our btree files, etc.

at least in all high-level parts, with multiple runtimes and maybe even threadpools for pagerequests and CPU-intensive low-level tasks

Keeping the distinction without strict borders (I mean when the sync and async worlds are not isolated via some channel) wouldn't solve the mixing problem. There will still be cases where you want to grab a sync mutex from async code, and an async one from sync code.
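One way to picture that channel isolation, sketched with only std threads and channels (every name here is hypothetical; real async callers would use an async-aware channel rather than blocking on `recv`):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical request type: callers send these over a channel, so the
// sync "repository" never shares its locks with the async world.
struct GetPageRequest {
    page_no: u32,
    reply: mpsc::Sender<Vec<u8>>,
}

// Spawn a sync repository thread that owns its state and serves requests.
fn start_repo() -> mpsc::Sender<GetPageRequest> {
    let (tx, rx) = mpsc::channel::<GetPageRequest>();
    thread::spawn(move || {
        for req in rx {
            // Pretend to materialize the page; real code would consult
            // the layer map and page cache here.
            let _ = req.reply.send(vec![req.page_no as u8; 4]);
        }
    });
    tx
}

fn fetch_page(repo: &mpsc::Sender<GetPageRequest>, page_no: u32) -> Vec<u8> {
    let (reply_tx, reply_rx) = mpsc::channel();
    repo.send(GetPageRequest { page_no, reply: reply_tx }).unwrap();
    reply_rx.recv().unwrap()
}

fn main() {
    let repo = start_repo();
    let page = fetch_page(&repo, 7);
    println!("got page {:?}", page);
}
```

With this shape there is exactly one border crossing per request, so no lock ever needs to be held across the sync/async boundary.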

Is spawn_blocking for every page request the way to go, in your opinion? @SomeoneToIgnore @knizhnik Then we would have a networking runtime with a couple of worker threads and a separate pool of threads that run get_page requests scheduled via spawn_blocking. I can prototype this and we'll see how it behaves.

LizardWizzard avatar Jul 27 '22 08:07 LizardWizzard

Is spawn_blocking for every page request the way to go, in your opinion? @SomeoneToIgnore @knizhnik Then we would have a networking runtime with a couple of worker threads and a separate pool of threads that run get_page requests scheduled via spawn_blocking. I can prototype this and we'll see how it behaves.

Sounds expensive. GetPage is the one operation that is very latency-sensitive. I'd like to keep the happy path of that as short as possible. By happy path I mean the case where all the data is already in the page cache and there's no lock contention.

hlinnaka avatar Jul 27 '22 10:07 hlinnaka

Sounds expensive

That's my expectation too :) Still, I'm curious whether there will be any notable difference in our benchmarks.

It would be good to use async for networking; we've converged on that. But if the repository stays sync, then async code will need to communicate with the sync repo. spawn_blocking is one way to do that communication.

LizardWizzard avatar Jul 27 '22 14:07 LizardWizzard

We agreed multiple times that we should do something with this, presumably merge it. Feel free to revert the commit if this was rushed.

SomeoneToIgnore avatar Sep 13 '22 19:09 SomeoneToIgnore