nobrainer


refactor code to calculate records per shard using n_volumes and number of shards

Open hvgazula opened this issue 10 months ago • 2 comments

https://github.com/neuronets/nobrainer/blob/976691d685824fd4bba836498abea4184cffd798/nobrainer/dataset.py#L115-L122

If a shard contains a large number of volumes, this snippet can be time-consuming, because it has to go through the first shard to get len(first_shard). Alternatives are:

- use n_volumes together with the number of files matching file_pattern to calculate len(first_shard)
- provide metadata (the number of volumes per shard) along with the total number of volumes in the dataset
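
A minimal sketch of the first alternative, not the nobrainer implementation. It assumes the shards were written as evenly as possible (sizes differ by at most one record), so the first shard holds ceil(n_volumes / n_shards) records. If the writer instead used a fixed shard size with a much smaller final shard, this estimate is not guaranteed to match. The function names `records_per_shard` and `records_per_shard_from_pattern` are hypothetical.

```python
import glob


def records_per_shard(n_volumes: int, n_shards: int) -> int:
    """Estimate the number of records in the first shard.

    Assumption: records are spread over the shards as evenly as possible,
    so the first shard holds ceil(n_volumes / n_shards) records.
    """
    if n_volumes < 0:
        raise ValueError("n_volumes must be non-negative")
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-n_volumes // n_shards)


def records_per_shard_from_pattern(n_volumes: int, file_pattern: str) -> int:
    """Count the shard files matching file_pattern, then estimate shard size.

    Only file names are listed; no shard is opened or iterated.
    """
    n_shards = len(glob.glob(file_pattern))
    if n_shards == 0:
        raise ValueError(f"no files match pattern: {file_pattern!r}")
    return records_per_shard(n_volumes, n_shards)
```

This only lists file names, so the cost depends on the number of shards rather than on the number of volumes stored in each one.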

Apr 20 '24 03:04 hvgazula