Pickling a histogram creates an extra copy of the array in memory

Open · bendavid opened this issue 2 years ago · 1 comment

Pickling a large boost-histogram histogram creates a second copy of the bin array in memory. It should be possible to avoid this with the PickleBuffer mechanism (out-of-band buffers, available from pickle protocol 5).

e.g.

import boost_histogram as bh

axis = bh.axis.Regular(1024*1024*1024, 0., 1)
htest = bh.Histogram(axis)

uses 8GB of memory, as expected (1024*1024*1024 bins at 8 bytes per double-precision bin)

import boost_histogram as bh
import pickle

axis = bh.axis.Regular(1024*1024*1024, 0., 1)
htest = bh.Histogram(axis)

with open("test.pkl", "wb") as f:
    pickle.dump(htest, f, protocol = pickle.HIGHEST_PROTOCOL)

uses 16GB, i.e. twice the memory of the histogram alone
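For reference, here is a minimal sketch of the PickleBuffer mechanism using only the standard library. It does not use boost-histogram: `counts` is a hypothetical stand-in for the bin storage, and the `state` dict layout is made up for illustration. Whether the same approach can be wired through the histogram's C++ storage is the open question here.

```python
import pickle
from array import array

# Hypothetical stand-in for a histogram's bin storage: 1024 float64 counts.
counts = array("d", [0.0]) * 1024

# Wrap the storage in a PickleBuffer so pickle sees a view, not a copy.
# The dict layout is an assumption for this sketch only.
state = {"range": (0.0, 1.0), "counts": pickle.PickleBuffer(counts)}

# With protocol 5 and a buffer_callback, the buffer is handed to the
# callback instead of being serialized into the pickle stream.
buffers = []
payload = pickle.dumps(state, protocol=5, buffer_callback=buffers.append)

# The consumer passes the same buffers back when loading.
restored = pickle.loads(payload, buffers=buffers)
view = memoryview(restored["counts"])
```

In this sketch `payload` carries only the small metadata, and the bin data travels separately in `buffers`, so the writer decides how to get it to disk or over the wire without an intermediate in-memory copy.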

bendavid, Feb 13 '23 17:02

I think we'd need support in pybind11 for this (could be wrong, but I think so). That would mean PickleBuffer needs a C API, which I would guess it has?

henryiii, Aug 30 '23 15:08