Tom
I may add `__index_in_shard__` metadata to the samples; when that is 0, you then know that the last shard was fully consumed. The new "wids" library already has this, and,...
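As a rough sketch of how a consumer could use such a field: the snippet below assumes each sample is a plain dict carrying an integer `__index_in_shard__`, which is only proposed above and may not exist in your version. The helper name `with_shard_rollover` is hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Assumption: every sample is a dict with an integer "__index_in_shard__".
INDEX_KEY = "__index_in_shard__"


def with_shard_rollover(
    samples: Iterable[Dict[str, Any]],
) -> Iterator[Tuple[Dict[str, Any], bool]]:
    """Yield (sample, previous_shard_finished) pairs."""
    seen_first = False
    for sample in samples:
        # Index 0 marks the first sample of a shard. For anything after the
        # very first sample, that means the preceding shard is exhausted.
        finished = seen_first and sample.get(INDEX_KEY) == 0
        seen_first = True
        yield sample, finished


# Two shards with two samples each.
stream = [{INDEX_KEY: 0}, {INDEX_KEY: 1}, {INDEX_KEY: 0}, {INDEX_KEY: 1}]
flags = [done for _, done in with_shard_rollover(stream)]
```

Here `flags` is true only for the third sample, the point at which the first shard is known to be complete.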
@Lyken17 Python's `tarfile` can nominally perform random access via `getmember`, but if you use this with a gzipped file, it will simply reread the entire gzipped file from the beginning again...
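To illustrate why uncompressed archives allow real random access, here is a minimal sketch using only the standard-library `tarfile` module. The helper names and the in-memory archive are made up for the example. Each member of an uncompressed tar sits at a fixed byte offset, so once the offsets are known the payload can be read directly.

```python
import io
import tarfile
from typing import Dict, Tuple


def build_tar(files: Dict[str, bytes]) -> bytes:
    """Write an uncompressed tar archive into memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_offsets(archive: bytes) -> Dict[str, Tuple[int, int]]:
    """Map member name -> (byte offset of its data, size in bytes)."""
    # mode="r:" accepts uncompressed archives only.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return {m.name: (m.offset_data, m.size) for m in tar.getmembers()}


archive = build_tar({"a.txt": b"alpha", "b.txt": b"beta"})
offset, size = member_offsets(archive)["b.txt"]
# Direct slice, no scan over the preceding members' data.
payload = archive[offset:offset + size]
```

With a gzip-compressed archive no such byte offsets into the file exist, which is why the stream has to be decompressed from the start each time.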
If you want to use WIDS, then your best bet is uncompressed .tar files, but you can compress the contents. If you have lots of tiny files (e.g. Twitter messages),...
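One way to read "uncompressed .tar, compressed contents" is sketched below with the standard library only. The `.gz` suffix on member names and both helper names are assumptions for this example; whether a given loader decompresses such members automatically is not something this sketch relies on.

```python
import gzip
import io
import tarfile
from typing import Dict


def write_tar_with_gzipped_members(files: Dict[str, bytes]) -> bytes:
    """Uncompressed tar whose individual members are gzip-compressed."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            compressed = gzip.compress(data)
            # Assumed convention: mark compressed members with ".gz".
            info = tarfile.TarInfo(name=name + ".gz")
            info.size = len(compressed)
            tar.addfile(info, io.BytesIO(compressed))
    return buf.getvalue()


def read_gzipped_member(archive: bytes, name: str) -> bytes:
    """Fetch one member from the uncompressed tar and decompress only it."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        raw = tar.extractfile(tar.getmember(name + ".gz")).read()
    return gzip.decompress(raw)


shard = write_tar_with_gzipped_members(
    {"000000.txt": b"hello " * 100, "000001.txt": b"world " * 100}
)
text = read_gzipped_member(shard, "000001.txt")
```

The container stays seekable because it is not compressed as a whole, while each payload still benefits from compression.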
A couple more comments: The SAM datasets are not in order, meaning that corresponding .jpg and .json files are not necessarily adjacent in those tar files; WebDataset requires them to...
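If the ordering is the obstacle, one possible repair is to rewrite the archive so that files belonging to the same sample sit next to each other. The sketch below uses only the standard-library `tarfile` module and assumes the sample key is the path up to the first dot of the file name. It also holds the whole archive in memory, so treat it as an illustration rather than a tool for large shards. `sample_key` and `regroup_tar` are hypothetical names.

```python
import io
import os
import tarfile
from typing import Dict, Tuple


def sample_key(name: str) -> Tuple[str, str]:
    """Split 'dir/abc.seg.json' into ('dir/abc', 'seg.json')."""
    directory, base = os.path.split(name)
    stem, _, ext = base.partition(".")
    return os.path.join(directory, stem), ext


def regroup_tar(archive: bytes) -> bytes:
    """Rewrite a tar so members sharing a key are adjacent."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as src, \
            tarfile.open(fileobj=out, mode="w") as dst:
        members = [m for m in src.getmembers() if m.isfile()]
        for member in sorted(members, key=lambda m: sample_key(m.name)):
            dst.addfile(member, src.extractfile(member))
    return out.getvalue()


def _demo_tar(files: Dict[str, bytes]) -> bytes:
    """Build a small in-memory archive for the example."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


shuffled = _demo_tar(
    {"b.json": b"{}", "a.jpg": b"img-a", "b.jpg": b"img-b", "a.json": b"{}"}
)
with tarfile.open(fileobj=io.BytesIO(regroup_tar(shuffled)), mode="r:") as tar:
    regrouped_names = tar.getnames()
```

After regrouping, `a.jpg` and `a.json` are adjacent, followed by `b.jpg` and `b.json`.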
What you describe should work in principle. Is it possible that you are using non-ASCII file names/keys inside the tar archive? As you can tell, mmtar tries to decode file...
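A quick way to check for such names, sketched with the standard-library `tarfile` module rather than mmtar itself (the helper name and the demo archive are made up for illustration):

```python
import io
import tarfile
from typing import List


def non_ascii_member_names(archive: bytes) -> List[str]:
    """Return member names containing any non-ASCII character."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return [m.name for m in tar.getmembers() if not m.name.isascii()]


# Demo archive with one plain and one accented file name.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("plain.json", "caf\u00e9.json"):
        info = tarfile.TarInfo(name=name)
        info.size = 2
        tar.addfile(info, io.BytesIO(b"{}"))

suspects = non_ascii_member_names(buf.getvalue())
```

An empty result would rule out non-ASCII names as the cause.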
There probably isn't much error handling for this case because when shards are missing, we usually just terminate the job. If you want to continue loading in the presence of...
I'm not trying to use it on other kernels. The problem is that merely installing it results in spurious warnings on other kernels, making jupyter-autopep8 pretty much incompatible with any...
I can't reproduce this. In any case, I have updated tarp to use mpio always, so the dependencies should be correct now no matter what. Please give it another try.
I've updated sort so that if you don't specify any fields, it uses all fields. I can't reproduce the "no output" problem you're seeing. Can you create and upload a...
For me, this would be most useful for captive nut pockets and other kinds of smaller holes; those currently require a disproportionate amount of design and cleaning. It would be...