CacheLib icon indicating copy to clipboard operation
CacheLib copied to clipboard

File-mapped memory support in shared memory manager

Open guptask opened this issue 2 years ago • 9 comments

Added file-based SHM segment option in SHM module and CacheAllocator. This PR allows any memory-mapped file to be used as a memory tier in CacheLib. However, these changes only allow use of a single tier currently. This is first half of the changes needed to enable multi-tiering in CacheLib. The multi-tier config APIs are in a separate PR (https://github.com/facebook/CacheLib/pull/138). These config APIs are needed to enable multi-tiering. Once that PR is merged, the second half of the multi-tier changes can be sent upstream via a separate PR.

guptask avatar Jun 15 '22 08:06 guptask

Did anyone get a chance to review this PR ?

guptask avatar Jul 12 '22 17:07 guptask

I'm in the process of reviewing. I am curious that do we have to associate a path and a name with each file segment? Can we use only name to identify a file segment, like we do with Posix? If yes, we can greatly simplify the change and avoid a cold restart.

haowu14 avatar Jul 13 '22 21:07 haowu14

I'm in the process of reviewing. I am curious that do we have to associate a path and a name with each file segment? Can we use only name to identify a file segment, like we do with Posix? If yes, we can greatly simplify the change and avoid a cold restart.

Are you referring to the addition of PosixSysVSegmentOpts argument in removeShm ? If not, can you please add more details in your query ?

guptask avatar Jul 18 '22 09:07 guptask

It's this and why we need to have each FileSegment have a path instead of just a name and figure out the path from the name.

Just like Posix always writes segments to /dev/shm, we can have FileSegments always writes to some base path and FileSegment can be located at base path + function(name). This way, we just need to change usePosix_ into an enum and the majority of the code structure can be unchanged.

haowu14 avatar Jul 18 '22 18:07 haowu14

@haowu14 the idea is that each FileSegment can represent a different memory type, and so each will have a different base path. For example, our most common use case is to have a segment on /mnt/pmem/tier1 and a second one on /dev/shm/tier0

igchor avatar Jul 19 '22 08:07 igchor

Thanks for clarifying. Is it correct that in the future it will be possible for the same shmManager to mount different types of segments (tier 1 is file, tier 0 is posix)?

haowu14 avatar Jul 19 '22 17:07 haowu14

Thanks for clarifying. Is it correct that in the future it will be possible for the same shmManager to mount different types of segments (tier 1 is file, tier 0 is posix)?

Yes, the changes were done with this flexibility in mind. The Shm segments can be Posix, SysV or File and can be tiered in any order.

guptask avatar Jul 19 '22 19:07 guptask

Hi! Thank you for making the change. We discussed internally on how we can bring FileShmSegment into cachelib and allow compatibility. We'd like to suggest the following change (your PR already have most of them, I just listed them down here so that we have a full picture):

Breaking this change into two PRs:

The first PR

  • Create a class containing Id and Type of a segment called ShmTypeOpts. The type is an enum. The ID is a string. Put ShmTypeOpts into ShmSegmentOpts.
  • ShmBase is initialized solely with ShmSegmentOpts since name_ can now be initialized with the ID in ShmSegmentOpts.ShmTypeOpts.
  • Posix/SysVShmSegment class (subclasses of ShmBase) will be new-ed/attached without an extra name input.
  • ShmSegment class (the class that wraps a ShmBase) will be new-ed/attached without name or usePosix because that can be retrieved with ShmSegmentOpts.ShmTypeOpts).
  • In ShmManager, we keep a map from name to ShmTypeOpts called nameToType_ (replacing nameToKey_, but we will need to keep nameToKey_ for compatibility reason).
  • When creating a new segment via ShmManager, we expect a name and ShmTypeOpts to be provided. If the type indicates SysV or Posix, we ignore the id field in ShmTypeOpts and fill the id field with uniqueIdForName. We use this ShmTypeOpts to create the segment then save the segment into segments_ and save the ShmTypeOpts to nameToType_.
  • When attaching/removing a segment via ShmManager, we expect only the name provided. ShmManager looks up the segment and the ShmTypeOpts from nameToType_ and try to attach/remove with the ShmTypeOpts retrieved.
  • When removing a segment via the static function of ShmManager, we expect ShmTypeOpts to be provided directly.

After the first diff, the usePosix_ field inside ShmManager will be noop since each segment know what type they are from nameToType_.

The second PR

Introduce FileShmSegment. Most of the logic wouldn't require change except that we won't overwrite to the id field in ShmTypeOpts if the type indicates file. FileShmSegments will interpret the id in ShmTypeOpts as the path of the file to access (just like PosixShmSegment will take the id and write to /dev/shm/id).

Does the above change satisfy the need to introduce FileShmSegments? This is mostly aligned with what you have, other than making ShmTypeOpts something uniform for all the types.

If it looks good, there's some additional comments regarding compatibility. To avoid dropping the cache when this new code is deployed, we suggest the following: In the first PR, increment the kCacheLibVersion in cachelib/allocator/CacheVersion.h. Still keep nameToOps_ map and update it together when we update nameToType_. Still keep the usePosix_ field. When updating the thrift file, add one more field instead of modify the existing nameToKey_.

After that lands, CacheLib team will make a clean up diff to remove the duplicated logic for compatibility.

Again thank you for making this change.

haowu14 avatar Jul 20 '22 17:07 haowu14

Hi! Thank you for making the change. We discussed internally on how we can bring FileShmSegment into cachelib and allow compatibility. We'd like to suggest the following change (your PR already have most of them, I just listed them down here so that we have a full picture):

Breaking this change into two PRs:

The first PR

  • Create a class containing Id and Type of a segment called ShmTypeOpts. The type is an enum. The ID is a string. Put ShmTypeOpts into ShmSegmentOpts.
  • ShmBase is initialized solely with ShmSegmentOpts since name_ can now be initialized with the ID in ShmSegmentOpts.ShmTypeOpts.
  • Posix/SysVShmSegment class (subclasses of ShmBase) will be new-ed/attached without an extra name input.
  • ShmSegment class (the class that wraps a ShmBase) will be new-ed/attached without name or usePosix because that can be retrieved with ShmSegmentOpts.ShmTypeOpts).
  • In ShmManager, we keep a map from name to ShmTypeOpts called nameToType_ (replacing nameToKey_, but we will need to keep nameToKey_ for compatibility reason).
  • When creating a new segment via ShmManager, we expect a name and ShmTypeOpts to be provided. If the type indicates SysV or Posix, we ignore the id field in ShmTypeOpts and fill the id field with uniqueIdForName. We use this ShmTypeOpts to create the segment then save the segment into segments_ and save the ShmTypeOpts to nameToType_.
  • When attaching/removing a segment via ShmManager, we expect only the name provided. ShmManager looks up the segment and the ShmTypeOpts from nameToType_ and try to attach/remove with the ShmTypeOpts retrieved.
  • When removing a segment via the static function of ShmManager, we expect ShmTypeOpts to be provided directly.

After the first diff, the usePosix_ field inside ShmManager will be noop since each segment know what type they are from nameToType_.

The second PR

Introduce FileShmSegment. Most of the logic wouldn't require change except that we won't overwrite to the id field in ShmTypeOpts if the type indicates file. FileShmSegments will interpret the id in ShmTypeOpts as the path of the file to access (just like PosixShmSegment will take the id and write to /dev/shm/id).

Does the above change satisfy the need to introduce FileShmSegments? This is mostly aligned with what you have, other than making ShmTypeOpts something uniform for all the types.

If it looks good, there's some additional comments regarding compatibility. To avoid dropping the cache when this new code is deployed, we suggest the following: In the first PR, increment the kCacheLibVersion in cachelib/allocator/CacheVersion.h. Still keep nameToOps_ map and update it together when we update nameToType_. Still keep the usePosix_ field. When updating the thrift file, add one more field instead of modify the existing nameToKey_.

After that lands, CacheLib team will make a clean up diff to remove the duplicated logic for compatibility.

Again thank you for making this change.

Thanks for the detailed breakdown. I'm working on refactoring the existing PR into two separate PRs based on the list you have chalked out.

guptask avatar Jul 26 '22 15:07 guptask

Closing this PR due to priority re-alignment. NUMA support for CXL is now the primary focus in memory-tiering.

guptask avatar Sep 26 '22 16:09 guptask