swift-collections icon indicating copy to clipboard operation
swift-collections copied to clipboard

Rationale for storing level and offset in heap nodes?

Open dabrahams opened this issue 11 months ago • 6 comments

Not really a bug report, but that feature of the implementation surprised me. A traditional heap stores its elements directly, and as far as I know no heap operations depend on these values.

dabrahams avatar Dec 24 '24 20:12 dabrahams

Heap uses a min-max heap, and IIRC I needed quick access to the level corresponding to an arbitrary index, so that it can decide if it's on a min- or max-level. (A flag would likely also suffice, and the value for the end of the heap ought to be cached so that insert wouldn't need to recover it from scratch every time.)

lorentey avatar May 16 '25 16:05 lorentey

You do realize you can compute the level from the index, right?

dabrahams avatar May 16 '25 16:05 dabrahams

Yes; that's how the level field gets initialized given an arbitrary index.

Maintaining the level alongside the index makes it easy to incrementally update it while traversing the tree, without having to pay for calculating it from scratch on every level, like the original incarnation of this code used to do. Bit counting is not expensive in absolute terms, but doing that was still measurably slower than this version.

See https://github.com/apple/swift-collections/issues/71; it outlines some obvious lessons to apply when this code becomes important enough that addressing them gets on my todo list.

Unfortunately, the original goal of putting the levels in the index (a.k.a. _Node) structure wasn't a straightforward win. For trickleDown, the level never needs to be accessed, but it still needs to be maintained, which is annoying. (Luckily the compiler gets rid of the unused calculations, so there is no performance impact; but readability suffers a bit.)

An alternative option would've been to remove level from _Node and simply maintain it in a separate variable as needed. I don't think it's worth implementing this, though.

lorentey avatar May 16 '25 17:05 lorentey

Bit counting is not expensive in absolute terms, but doing that was still measurably slower than this version.

Counting the leading zero bits is one instruction; then you subtract from the bit width. You're telling me memory accesses and the locality costs of extra storage is cheaper? I must be missing something!

I see #71 says that you were using the one instruction. Counterintuitive I must say.

dabrahams avatar May 17 '25 00:05 dabrahams

Yes -- counting bits is pretty cheap, but it was still measurably better to have a register hold an incrementally updated level. Saving an integer multiplication's worth of cycles per level touched is still a win: the reference baseline was a regular min-heap that does not need to care about any of this level business.

There are usually no memory accesses or significant locality costs here; _HeapNode is materialized on demand, for the duration of an operation. That said, it may be that adding a word to the heap struct that caches the current depth of the tree may be beneficial to insert; confirming/rejecting that (and other hypotheses) are left to a future performance pass.

lorentey avatar May 17 '25 02:05 lorentey

Oh that's a very interesting design; thanks for the detail!

dabrahams avatar May 17 '25 16:05 dabrahams