cpython icon indicating copy to clipboard operation
cpython copied to clipboard

C OrderedDict's `od_fast_nodes` allocation might be 50% larger than necessary?

Open masklinn opened this issue 1 year ago • 0 comments

Feature or enhancement

Proposal:

Reduce the allocation size of od_fast_nodes_size from 1<<dk_log2_size to USABLE_FRACTION(1<<dk_log2_size)

Justification:

By my understanding, _odictobject uses the "front end" of a regular PyDictObject for its hashmap of _odictnode, so it only provides an entries array. However it allocates 1<<dk_log2_size entries: https://github.com/python/cpython/blob/b9cb855621c0813846421e4ced97260a32e57063/Objects/odictobject.c#L559 when dict allocates USABLE_FRACTION(1<<dk_log2_size) entries: https://github.com/python/cpython/blob/b9cb855621c0813846421e4ced97260a32e57063/Objects/dictobject.c#L750

That is a savings of about 1 / 3 (1 << dk_log2_size), which by my reckoning is high single-digit percent: ignoring the fixed overhead an odict is

  • dk_size * index_size bytes for the sparse array
  • 2 or 3 * 8 * USABLE_FRACTION(dk_size) for the dense array depending on DICT_KEYS (UNICODE or GENERAL)
  • od_fast_nodes, currently 8 * dk_size but I think could be 8 * USABLE_FRACTION(dk_size)
  • 8 * 4 * USABLE_FRACTION(dk_size) / 2 to 8 * 4 * USABLE_FRACTION(dk_size) for the actual _odictnode

Taking USABLE_FRACTION = 2/3, according to wolfram, this simplifies to savings of 8 / (3 * idx_size + 2 * dict_keys + 24 + 32 * (1 + dk_fill)) where

  • idx_size is the size of indexes (1, 2, 4, 8)
  • dict_keys is 16 or 24
  • dk_fill is the fill level between 0 (USABLE_FRACTION(dk_size) / 2 items) and 1 (USABLE_FRACTION(dk_size) items)

In the worst case (8, 24, 1) that's 5%, in the best case (1, 16, 0) that's 9%. This might be a touch on the optimistic side as again it does not take static overhead in account.

As far as I can tell, this just requires adjusting the two computations of od_fast_nodes_size by USABLE_FRACTION.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

masklinn avatar Mar 20 '24 19:03 masklinn