
Support decoding from bit packed detection event data in memory and on disk

Open Strilanc opened this issue 3 years ago • 3 comments

When I produce a lot of detection event data using stim and store it in a numpy array, I can ask for it to be bit packed so that it uses 1 bit per detector instead of 1 byte per detector. I can also combine multiple shots into one array. But to give this array to pymatching, I have to feed it in one shot at a time and unpack the bits myself:

    import numpy as np

    # det_data is a uint8 numpy array of shape (num_shots, math.ceil(num_dets / 8))
    predictions = np.zeros(shape=(num_shots, num_obs), dtype=np.bool_)
    for k in range(num_shots):
        # Unpack one shot to one byte per detector (little-endian bit order).
        expanded_det = np.unpackbits(det_data[k], count=num_dets, bitorder='little')
        # Grow to num_dets + 1 entries; np.resize fills by repeating, so zero the extra entry.
        expanded_det = np.resize(expanded_det, num_dets + 1)
        expanded_det[-1] = 0
        predictions[k] = matching_graph.decode(expanded_det)

It should be possible to instead do, for example, this:

    # det_data is a uint8 numpy array of shape (num_shots, math.ceil(num_dets / 8))
    predictions = matching_graph.decode_bit_packed_batch(det_data)

By this I don't just mean that the internal implementation does what the above Python code is doing. The data should stay bit packed from end to end, and there should be no initialization overhead between shots.
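Until such a method exists, the per-shot unpacking can at least be done for the whole batch in one numpy call. The sketch below is only a Python-side stopgap, not the end-to-end bit-packed path requested here; `unpack_det_batch` is a hypothetical helper name, and the sample data is random rather than real detection events.

```python
import numpy as np


def unpack_det_batch(det_data: np.ndarray, num_dets: int) -> np.ndarray:
    """Unpack bit-packed shots into one byte per detector.

    det_data: uint8 array of shape (num_shots, ceil(num_dets / 8)).
    Returns a uint8 array of shape (num_shots, num_dets).
    Hypothetical helper for illustration, not part of pymatching.
    """
    # axis=1 unpacks each row (shot) independently; count trims the padding bits.
    return np.unpackbits(det_data, axis=1, count=num_dets, bitorder='little')


# Round trip on random sample data (not real detection events).
rng = np.random.default_rng(0)
num_shots, num_dets = 5, 11
dets = rng.integers(0, 2, size=(num_shots, num_dets), dtype=np.uint8)
packed = np.packbits(dets, axis=1, bitorder='little')
unpacked = unpack_det_batch(packed, num_dets)
```

This still expands the data by a factor of 8 in memory, which is exactly the cost a native bit-packed decoder would avoid.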

It may also be useful to have a method that takes input and output filepaths, so that the C++ code can run directly off data on disk without involving data stored by Python at all.
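For comparison, here is roughly what the Python-mediated route from disk looks like today. This is a minimal sketch that assumes a flat binary file in which each shot occupies `ceil(num_dets / 8)` bytes with little-endian bit order (the layout I understand stim's `b8` format to use). `read_bit_packed_shots` is a hypothetical helper name, and the file written here is sample data in a temporary directory.

```python
import os
import tempfile

import numpy as np


def read_bit_packed_shots(path: str, num_dets: int) -> np.ndarray:
    """Load bit-packed shots from a flat binary file.

    Assumes each shot is stored as ceil(num_dets / 8) bytes, back to back.
    Returns a uint8 array of shape (num_shots, ceil(num_dets / 8)).
    Hypothetical helper for illustration, not part of pymatching or stim.
    """
    bytes_per_shot = (num_dets + 7) // 8
    raw = np.fromfile(path, dtype=np.uint8)
    if raw.size % bytes_per_shot != 0:
        raise ValueError("file size is not a whole number of shots")
    return raw.reshape(-1, bytes_per_shot)


# Write sample bit-packed data to a temporary file, then read it back.
sample_num_dets = 11
sample_bits = np.random.default_rng(1).integers(
    0, 2, size=(4, sample_num_dets), dtype=np.uint8
)
sample_packed = np.packbits(sample_bits, axis=1, bitorder='little')
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "dets.b8")
    sample_packed.tofile(sample_path)
    loaded = read_bit_packed_shots(sample_path, sample_num_dets)
```

The array returned here still lives in Python memory, so a filepath-based method in the C++ layer would skip this step entirely.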

Strilanc avatar May 24 '22 17:05 Strilanc

I agree it would be good to add this

oscarhiggott avatar May 24 '22 17:05 oscarhiggott

Fixed in dev branch

Strilanc avatar Oct 22 '22 23:10 Strilanc

Although I agree this is fixed with the protected main method for sinter, I also plan to add pymatching.Matching.decode_bit_packed_batch(det_data) at some point, so reopening this one

oscarhiggott avatar Oct 22 '22 23:10 oscarhiggott