mojo
mojo copied to clipboard
[stdlib] Add `Dict._resize_down` and `_under_load_factor()`
Hello,
this is a PR to solve a problem mentioned by @bethebunny (in #3128)
Good little feature to have,
it probably should not be a default and be parametrized,
because of the cost in performance of halving memory.
💡 for the Dictionary that not down-resize
Did an implementation of it, maybe you'll like it !
It really grows and ungrows dynamically,
at each pop
, if self.size < 1/3 of self._reserved()
,
the Dict
halves it's memory with a smaller list capacity.
(so the RAM usage is dynamic and change on the size)
When pop is used on all elements of a dict of 256 elements,
the transition goes from 512, 256, 128, 64, 32, 16, 8, 4
That way, it doubles on 2/3 and halves on 1/3
The ratio is quite simple and it seem to do the job:
(same as for resizing up (below 1/3 of self._reserved()
))
0.33 * 256 == 84.48 <- we are below (we can resize down) 0.66 * 128 == 84.48 <- we are below (no need to resize up again)
0.66 * 256 == 168.96 <- we are above (we can resize up) 0.33 * 512 == 168.96 < we are above (no need to resize down again)
Example of a Dict
cycling trough grow and ungrow:
from time import now
def main():
var result: Int=0
var dict_size = 256
var start = now()
var stop = now()
result = 0
small2 = Dict[Int,Int]()
start = now()
for y in range(2):
for i in range(dict_size):
# grow the dictionary up to its full size
print("before",small2._reserved(), len(small2))
small2[i]=i
print("after",small2._reserved(), len(small2))
for i in range(dict_size):
# ungrow the dictionary
print("before",small2._reserved(), len(small2))
result += small2.pop(i)
print("after",small2._reserved(), len(small2))
stop = now()
print(stop-start, result)
The result is correct (65280
):
x = 0
for y in range(2):
for i in range(256):
x+=i
print(x) #65280
Maybe that kind feature need to be parametrized, so users can choose? (resizing down reduce the performance, so may not be a good default)
-
DictConfig
struct, as defaultDict
parameter is a possibility- we might end up in a parametrization challenge later
- another
Dict
type is also a possibility (BaloonDictionary
like) ?
I suggest the second one would be better, but struggle for naming here.
If reviewers likes it, maybe we can bring it to life with more reviews ?
(There are more methods where the logic need to be done, and tests)
Note that the hard work have been done by people working on Dict
,
this pr just add a halving part, which use the same ratio as the doubling.