zee_alloc icon indicating copy to clipboard operation
zee_alloc copied to clipboard

Packed doubly-linked-list

Open fengb opened this issue 6 years ago • 4 comments

Once we have https://github.com/fengb/zee_alloc/issues/3, we have a really nice characteristic:

  • Small allocations are always align(16). The extra bits can denote the small size bitshift... which incidentally is less than 16.
  • Large allocations are always align(32K) or align(64K). The extra bits can denote the total pages, which maps to 2GB or 4GB respectively.

With this type of packing, we only need a usize word (4B) for the metadata. Thus current minimum payload is expanded to 12B.

~We might be able to support 4B payloads — it allows for align(8) which is not enough space to store the length, but we can possibly assume any align(8) has to be 4B allocations.~ Inconsistent metadata is very hard to detect so we'll stick with 12B minimum payloads.

fengb avatar Jun 28 '19 16:06 fengb

I've been mixing up payload alignment and frame alignment. Consumers only care about payload alignment, but frame align = payload align - metasize

In order for allocations to be properly aligned, we have to stick with 4B or 8B payload. Therefore, it currently doesn't make any sense to pack the metadata.

However, converting to packed doubly-linked-list allows us to carve out space to squeeze in the length so it'd cost 0 bytes.

fengb avatar Aug 07 '19 20:08 fengb

There are some major disadvantages:

  1. Less friendly debugging. Pointers in memory will no longer look like proper pointers.
  2. Fewer redundancies. We have some rudimentary error checking to prevent memory corruption: pointer alignment and length, and neither will be usable going forward.

I think the tradeoffs are worth it. Long-term fragmentation is goal 3 and fast free() is goal 4, while debugging is not a real goal.

fengb avatar Aug 08 '19 02:08 fengb

This can be in tandem with converting to doubly-linked-list:

  1. Frames are at worst 8B aligned on usize=32 machines. Add usize=32 as a compile assertion.
  2. Doubly-linked-list has prev/next nodes. Squeeze an extra u3 in each.
  3. With a synthetic u6 (u3+u3), we can reference the size index value for small allocations and size multiplier for jumbo

fengb avatar Aug 08 '19 14:08 fengb

This adds a depressing 1.5K (+60%) to release-fast and 4K (+100%) to release-safe. I can’t see this as being effective even with major hand tuning.

fengb avatar Aug 10 '19 20:08 fengb