
example requested: debug individual shard for poison pills

Open mooreniemi opened this issue 10 months ago • 5 comments

I sometimes hit "poison pills" inside an MDS dataset. Is there documentation on how to load just one shard and traverse to the pill, without the load mechanisms that might throw on error?

E.g., I get an error like:

IndexError: Relative sample index 85 is not present in the 17/shard.00085.mds file.

I'd like to be able to easily "peek" at this data without having to open the entire dataset. (I'm also not clear how to translate that error to the absolute position of my sample in the entire dataset.)
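For the index translation, here is a minimal sketch of what I have in mind. It assumes the stream's `index.json` has a top-level `shards` list in on-disk order, where each entry carries a `samples` count and a `raw_data.basename`; that layout is my assumption and may differ across versions. The helper name `absolute_sample_index` is hypothetical and not part of the library. With several streams (the `17/` prefix in the error suggests a sub-directory), the result is only the position within that stream's index, so the sample totals of any preceding streams would still have to be added, and shuffling changes the iteration order on top of that.

```python
import json
from typing import Any, Dict


def absolute_sample_index(index: Dict[str, Any], shard_basename: str,
                          relative_index: int) -> int:
    """Map (shard file, relative index) to a position within one index.json.

    Assumes index["shards"] is in on-disk order and that each entry has
    "samples" and "raw_data"["basename"] keys (an assumption, see above).
    """
    offset = 0  # number of samples in all shards before the current one
    for shard in index["shards"]:
        num_samples = shard["samples"]
        if shard["raw_data"]["basename"] == shard_basename:
            if not 0 <= relative_index < num_samples:
                raise IndexError(
                    f"{shard_basename} declares {num_samples} samples; "
                    f"relative index {relative_index} is out of range")
            return offset + relative_index
        offset += num_samples
    raise KeyError(f"{shard_basename} not found in index")


def load_index(path: str) -> Dict[str, Any]:
    """Read an index.json file from a local path."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Toy index standing in for a real index.json (made-up sample counts).
toy_index = {
    "shards": [
        {"samples": 100, "raw_data": {"basename": "shard.00000.mds"}},
        {"samples": 100, "raw_data": {"basename": "shard.00001.mds"}},
        {"samples": 90, "raw_data": {"basename": "shard.00002.mds"}},
    ]
}
print(absolute_sample_index(toy_index, "shard.00002.mds", 5))
```

A side effect of the range check is that it would flag the case where the index declares fewer samples than the relative index being requested, which may help tell a stale index apart from a truncated shard file.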

Sorry if I missed this in existing docs.

mooreniemi, Aug 25 '23 19:08