cachey
recursive for list and tuple seems like a safe default
This seems like a sensible change to me. Would you be willing to add a test as well?
@mrocklin I would, but I just noticed that the nbytes values I get differ from the ones in the doctests. Maybe those numbers are from a 32-bit system? I wouldn't be comfortable adding values I can't reproduce as a test; that doesn't sound right :-)
In my case:
>>> nbytes(123)
28
>>> nbytes([123, 123])
56
Which made me realize the list itself is not being counted.
I did some experiments, and it seems that a list/tuple's size also depends on the number of elements in it. I'll make another change to the code.
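For anyone following along, here is a minimal sketch of that experiment. It assumes CPython (where `sys.getsizeof` is available) and is not cachey's code. It shows that `sys.getsizeof` reports only the container, and that the container grows by one pointer per element. The exact check uses tuples, because lists may reserve spare slots.

```python
import struct
import sys

# Size of one object pointer: 4 on 32-bit builds, 8 on 64-bit builds.
POINTER = struct.calcsize("P")

# sys.getsizeof does not follow references, so these are container-only sizes.
empty_tuple = sys.getsizeof(())
three_tuple = sys.getsizeof((None,) * 3)
empty_list = sys.getsizeof([])
three_list = sys.getsizeof([None] * 3)

# Tuples are sized exactly: a fixed header plus one pointer per element.
assert three_tuple - empty_tuple == 3 * POINTER

# Lists may over-allocate, so only a lower bound is safe to assert.
assert three_list >= empty_list + 3 * POINTER
```

Deriving the pointer size with `struct.calcsize("P")` avoids hard-coding a value that differs between 32-bit and 64-bit builds.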
You can make tests that are 32/64-bit invariant. For example, you can test something like the following:
assert nbytes([x, x, x]) == nbytes(x) * 3 + nbytes([])
where x is a NumPy array.
Also, if the elements refer to the same object, the result is going to be an overestimate.
As I've learned the hard way, in your example you would still have to add the overhead of the pointer to each object in the list (one pointer per additional element: 4 bytes on a 32-bit build, 8 bytes on a 64-bit build).
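A rough sketch of what a word-size-independent test could look like, taking the pointer overhead into account. `nbytes_sketch` is a hypothetical stand-in written for this example, not cachey's `nbytes`, and a `bytes` object replaces the NumPy array to keep the example dependency-free. The upper bound of 16 pointers is arbitrary slack, chosen only so that gross over-counting still fails.

```python
import struct
import sys


def nbytes_sketch(obj):
    """Hypothetical recursive size estimate (not cachey's nbytes)."""
    if isinstance(obj, (list, tuple)):
        # Count the container itself plus every element it references.
        return sys.getsizeof(obj) + sum(nbytes_sketch(o) for o in obj)
    return sys.getsizeof(obj)


def test_list_counts_elements_and_container():
    x = bytes(1000)  # stand-in payload; the thread suggests a NumPy array
    pointer = struct.calcsize("P")  # derived, so 32/64-bit builds both pass
    total = nbytes_sketch([x, x, x])
    baseline = 3 * nbytes_sketch(x) + nbytes_sketch([])
    # Lower bound: three payloads, the empty container, three pointers.
    assert total >= baseline + 3 * pointer
    # Loose upper bound (arbitrary slack) to catch gross over-counting.
    assert total <= baseline + 16 * pointer
```

Using bounds rather than exact values keeps the test stable across Python versions, at the cost of not pinning down the precise number.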
It's pretty difficult to write a test for it, as the actual value is not static w.r.t. the different Python versions :)
I think to get an even better estimate we'd have to consider the ids of the objects and make sure we count them only once (though I guess we could assume no duplicate objects).
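To make the id-based idea concrete, here is a minimal sketch, assuming CPython and plain lists and tuples only. `nbytes_unique` is a hypothetical name for this example and is not part of cachey. It remembers the `id` of every object it has visited and counts each one once.

```python
import sys


def nbytes_unique(obj, _seen=None):
    """Hypothetical size estimate that counts each distinct object once."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # already counted through another reference
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple)):
        # Recurse into elements, sharing the same set of visited ids.
        size += sum(nbytes_unique(o, seen) for o in obj)
    return size


x = bytes(1000)
shared = [x, x, x]
# The payload is counted once, plus the list container itself.
assert nbytes_unique(shared) == sys.getsizeof(shared) + sys.getsizeof(x)
```

A side effect of tracking ids is that a self-referential list no longer causes infinite recursion, although the extra set adds bookkeeping cost on every call.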
I think that for our purposes overestimates and approximations are fine.
On Sat, Sep 30, 2017 at 4:52 PM, Pascal van Kooten wrote:
> It's pretty difficult to write a test for it, as the actual value is not static w.r.t. the different Python versions :)
> I think to get an even better estimate we'd have to consider the ids of the objects and make sure we count them only once (though I guess we could assume no duplicate objects).