
Add support for drop-in replacement of guava

Open brayniac opened this issue 2 years ago • 4 comments

It would be nice to offer the seg storage library as a drop-in replacement for Guava. We should create a compatible library so that only the relevant imports need to be changed to use Pelikan instead.
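To make the "only imports change" idea concrete, here is a minimal sketch of what such a compatibility layer might look like. Everything in it is an assumption for illustration: the class name `SegCacheSketch` is hypothetical, it mirrors only a handful of method names from Guava's `Cache` interface (`put`, `getIfPresent`, `invalidate`, `size`), and a `ConcurrentHashMap` stands in for the seg storage backend, which a real binding would call into instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical facade (not an existing Pelikan API): it keeps a few Guava
// Cache method names so that call sites could stay unchanged.
final class SegCacheSketch<K, V> {
    // Value plus its absolute expiry time, in System.nanoTime() units.
    private static final class Entry<V> {
        final V value;
        final long expiresAtNanos;

        Entry(V value, long expiresAtNanos) {
            this.value = value;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    // Stand-in for the storage backend; an assumption, not how seg works.
    private final ConcurrentHashMap<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlNanos;

    // Plays the role of Guava's expireAfterWrite(duration, unit).
    SegCacheSketch(long ttl, TimeUnit unit) {
        this.ttlNanos = unit.toNanos(ttl);
    }

    public void put(K key, V value) {
        store.put(key, new Entry<>(value, System.nanoTime() + ttlNanos));
    }

    // Returns null when the key is absent or its TTL has elapsed.
    public V getIfPresent(Object key) {
        Entry<V> e = store.get(key);
        if (e == null) {
            return null;
        }
        if (System.nanoTime() - e.expiresAtNanos >= 0) {
            store.remove(key, e); // lazily drop the expired entry
            return null;
        }
        return e.value;
    }

    public void invalidate(Object key) {
        store.remove(key);
    }

    // Approximate: may still count expired entries that were never read.
    public long size() {
        return store.size();
    }
}
```

With a class shaped like this, a caller that today writes `cache.getIfPresent(key)` against Guava would keep that line and swap only the import and the construction site.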

brayniac avatar Jan 29 '23 17:01 brayniac

@beinan Can you or someone else from Alluxio tell us a bit more about your use cases and their ranges of object size, throughput, latency expectations, and what you wish to see in an in-process or lookaside local cache?

thinkingfish avatar Feb 19 '23 23:02 thinkingfish

From my discussion with @beinan, the use case is to cache fixed-size large blocks (e.g., 1 MB) with strong interest in TTL and NVMe support. They are mostly throughput-bound and sensitive to CPU consumption; latency is less of an issue.

1a1a11a avatar Feb 20 '23 02:02 1a1a11a

The first things that come to mind: these values most likely do not need to be serialized, but depending on how configurable the sizes are, the small amount of metadata stored with each value might misalign segments/items with SSD block sizes.
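A quick back-of-the-envelope sketch of that misalignment, under assumed numbers: a 4096-byte SSD block, a 1 MiB value, and a hypothetical 32-byte inline header per item (the actual seg item metadata size may differ). The class and method names are illustrative only.

```java
// Illustrative arithmetic only; block and header sizes are assumptions.
final class AlignmentSketch {
    // Number of device blocks touched by a byte range [offset, offset + length).
    static long blocksSpanned(long offset, long length, long blockSize) {
        long first = offset / blockSize;
        long last = (offset + length - 1) / blockSize;
        return last - first + 1;
    }

    // Item size rounded up to the next multiple of the block size.
    static long paddedSize(long itemSize, long blockSize) {
        return ((itemSize + blockSize - 1) / blockSize) * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 4096;     // assumed SSD block size
        long valueSize = 1L << 20; // 1 MiB value, as in the use case above
        long headerSize = 32;      // hypothetical inline metadata per item

        // A bare value starting on a block boundary covers exactly 256 blocks.
        System.out.println(blocksSpanned(0, valueSize, blockSize)); // 256

        // With an inline header the value starts 32 bytes in and
        // spills into a 257th block.
        System.out.println(blocksSpanned(headerSize, valueSize, blockSize)); // 257

        // Padding each item to a block multiple restores alignment
        // at the cost of 4064 bytes per item.
        System.out.println(paddedSize(headerSize + valueSize, blockSize)); // 1052672
    }
}
```

Under these assumptions every read of a value touches one extra block, and each later item in the segment starts at a different offset within its first block unless items are padded or the metadata is stored elsewhere.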

Are these values subsequently sent over the network, or locally consumed without leaving the box? When you say "throughput bound", is that SSD or network throughput? And if the former, is it read or write throughput?

thinkingfish avatar Feb 20 '23 06:02 thinkingfish

Yeah, good catch on misalignment. For NVMe support, we may want to keep all metadata in DRAM. I think the values are consumed without leaving the box, because the consumer is a computation-heavy service that is sensitive to the cache's CPU utilization. If the cache uses more CPU cycles, the application will have lower throughput.

1a1a11a avatar Feb 22 '23 19:02 1a1a11a