risinglight

storage: abstract disk I/O

Open skyzh opened this issue 2 years ago • 7 comments

Currently, calls such as create_dir are scattered everywhere. We should have a single interface for operating on files on disk, so that we can support in-memory, disk, and object store backends.

skyzh avatar Jun 09 '22 14:06 skyzh

Isn't it only the disk backend that depends on file operations?

wangrunji0408 avatar Jun 13 '22 04:06 wangrunji0408

Yes. But even the disk backend has two modes: a pure in-memory mode (for testing, where files are stored in a hash map), and a real on-disk mode.

skyzh avatar Jun 13 '22 05:06 skyzh

If we can abstract all disk operations behind a trait like ObjectStore, we can prevent unwanted writes to disk when the secondary storage runs in in-memory mode.
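A minimal sketch of what such a trait could look like, using only the standard library. The names `FileStore`, `DiskStore`, and `MemStore` and the three methods are hypothetical, chosen for illustration; they are not RisingLight's actual API, and a real version would probably be async.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

/// Hypothetical trait: every file operation in the storage layer goes through it.
pub trait FileStore {
    fn create_dir(&self, path: &Path) -> io::Result<()>;
    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()>;
    fn read(&self, path: &Path) -> io::Result<Vec<u8>>;
}

/// Real on-disk mode: forwards to `std::fs`, with relative paths resolved under `root`.
pub struct DiskStore {
    pub root: PathBuf,
}

impl FileStore for DiskStore {
    fn create_dir(&self, path: &Path) -> io::Result<()> {
        fs::create_dir_all(self.root.join(path))
    }

    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()> {
        fs::write(self.root.join(path), data)
    }

    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        fs::read(self.root.join(path))
    }
}

/// Pure in-memory mode: files live in a hash map and nothing touches the disk.
#[derive(Default)]
pub struct MemStore {
    files: Mutex<HashMap<PathBuf, Vec<u8>>>,
}

impl FileStore for MemStore {
    fn create_dir(&self, _path: &Path) -> io::Result<()> {
        // Directories are implicit in this sketch, so there is nothing to create.
        Ok(())
    }

    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()> {
        self.files
            .lock()
            .unwrap()
            .insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}
```

With this shape, the storage code would hold something like a `Box<dyn FileStore>` and never call `std::fs` directly, so picking `MemStore` guarantees that no write reaches the disk.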

skyzh avatar Jun 13 '22 05:06 skyzh

I think we should use the real on-disk mode in testing, to make sure we are using the system fs API correctly. The data path can be redirected to ramfs/tmpfs to speed things up.

wangrunji0408 avatar Jun 13 '22 05:06 wangrunji0408

Later we can also introduce simulation testing, where all fs APIs are mocked by an in-memory simulator.

wangrunji0408 avatar Jun 13 '22 05:06 wangrunji0408

I switched to pure in-memory mode because tmpfs is slow 🤣 fsync and the manifest write add 1s of latency to every disk test case, which looks weird.

Maybe I can try switching off fsync for all writes (even the manifest), and maybe things will work.
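Roughly, switching off fsync could be a per-write option like the sketch below. `WriteOptions` and `write_file` are hypothetical names for illustration, not existing RisingLight code; only `File::create`, `write_all`, and `sync_all` are real standard library calls.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical options struct: tests would set `fsync` to false.
#[derive(Clone, Copy)]
pub struct WriteOptions {
    pub fsync: bool,
}

/// Writes `data` to `path`, calling fsync only when the options ask for it.
pub fn write_file(path: &Path, data: &[u8], opts: WriteOptions) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(data)?;
    if opts.fsync {
        // Durability barrier; this is the expensive call that tests would skip.
        file.sync_all()?;
    }
    Ok(())
}
```

Without fsync the data still reaches the page cache and can be read back, so tests keep exercising the real fs API; they only lose durability across a crash, which a test case does not need.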

skyzh avatar Jun 13 '22 05:06 skyzh

🤣 Okay, let's stay in in-memory mode for efficiency.

wangrunji0408 avatar Jun 13 '22 05:06 wangrunji0408