pelikan
Add support for drop-in replacement of Guava
It would be nice to offer the ability to use the seg storage library as a drop-in replacement for Guava. We should create a compatible library so that only the relevant imports need to be changed to use Pelikan instead.
@beinan Can you or someone else from Alluxio tell us a bit more about your use cases and their ranges of object size, throughput, latency expectations, and what you wish to see in an in-process or lookaside local cache?
From my discussion with @beinan, the use case is to cache fixed-size large blocks (e.g., 1 MB), with a strong interest in TTL and NVMe support. They are mostly throughput bound and sensitive to CPU consumption; latency is less of an issue.
The first things that come to mind: these values most likely do not need to be serialized, but depending on how configurable the sizes are, the small amount of metadata stored per value might cause segments/items to be misaligned with SSD block sizes.
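To make the misalignment concern concrete, here is a minimal sketch of the arithmetic. It assumes a 4096-byte SSD block, 1 MiB values packed back to back, and a hypothetical 8-byte per-item header; none of these numbers are taken from Pelikan's actual item layout.

```python
# Sketch only: BLOCK_BYTES, VALUE_BYTES and the 8-byte header are assumed
# values for illustration, not Pelikan's real on-disk layout.
BLOCK_BYTES = 4096            # assumed SSD block size
VALUE_BYTES = 1024 * 1024     # 1 MiB cached value


def blocks_touched(index, value_bytes, metadata_bytes, block_bytes=BLOCK_BYTES):
    """Count the SSD blocks spanned by item `index` when items
    (header + value) are stored contiguously starting at offset 0."""
    item_bytes = metadata_bytes + value_bytes
    start = index * item_bytes          # first byte of the item
    end = start + item_bytes - 1        # last byte of the item
    return end // block_bytes - start // block_bytes + 1


# No inline header: every value covers exactly 256 blocks.
aligned = [blocks_touched(i, VALUE_BYTES, 0) for i in range(3)]

# Hypothetical 8-byte inline header: every item spills into one extra block.
misaligned = [blocks_touched(i, VALUE_BYTES, 8) for i in range(3)]

print(aligned)     # [256, 256, 256]
print(misaligned)  # [257, 257, 257]
```

Under these assumptions, even a few bytes of inline metadata cost one extra block per item on every read or write, and the item boundaries no longer coincide with block boundaries.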
Are these values subsequently sent over the network, or locally consumed without leaving the box? When you say "throughput bound", is that SSD or network throughput? And if the former, is it read or write throughput?
Yeah, good catch on the misalignment. For NVMe support, we may want to keep all metadata in DRAM. I think they are consumed without leaving the box. The consumer is a computation-heavy service that is sensitive to the cache's CPU utilization: if the cache uses more CPU cycles, the application will have lower throughput.