C OrderedDict's `od_fast_nodes` allocation might be 50% larger than necessary?
Feature or enhancement
Proposal:
Reduce the allocation size of od_fast_nodes_size from 1<<dk_log2_size to USABLE_FRACTION(1<<dk_log2_size)
Justification:
By my understanding, _odictobject uses the "front end" of a regular PyDictObject for its hashmap of _odictnode, so it only provides an entries array. However it allocates 1<<dk_log2_size entries: https://github.com/python/cpython/blob/b9cb855621c0813846421e4ced97260a32e57063/Objects/odictobject.c#L559 when dict allocates USABLE_FRACTION(1<<dk_log2_size) entries: https://github.com/python/cpython/blob/b9cb855621c0813846421e4ced97260a32e57063/Objects/dictobject.c#L750
That is a savings of about 1 / 3 (1 << dk_log2_size), which by my reckoning is high single-digit percent: ignoring the fixed overhead an odict is
-
dk_size * index_sizebytes for the sparse array - 2 or 3
* 8 * USABLE_FRACTION(dk_size)for the dense array depending on DICT_KEYS (UNICODE or GENERAL) -
od_fast_nodes, currently8 * dk_sizebut I think could be8 * USABLE_FRACTION(dk_size) -
8 * 4 * USABLE_FRACTION(dk_size) / 2to8 * 4 * USABLE_FRACTION(dk_size)for the actual _odictnode
Taking USABLE_FRACTION = 2/3, according to wolfram, this simplifies to savings of 8 / (3 * idx_size + 2 * dict_keys + 24 + 32 * (1 + dk_fill)) where
- idx_size is the size of indexes (1, 2, 4, 8)
- dict_keys is 16 or 24
- dk_fill is the fill level between 0 (
USABLE_FRACTION(dk_size) / 2items) and 1 (USABLE_FRACTION(dk_size)items)
In the worst case (8, 24, 1) that's 5%, in the best case (1, 16, 0) that's 9%. This might be a touch on the optimistic side as again it does not take static overhead in account.
As far as I can tell, this just requires adjusting the two computations of od_fast_nodes_size by USABLE_FRACTION.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response