
refactor code to calculate records per shard using n_volumes and number of shards

Open hvgazula opened this issue 10 months ago • 2 comments

https://github.com/neuronets/nobrainer/blob/976691d685824fd4bba836498abea4184cffd798/nobrainer/dataset.py#L115-L122

When a shard contains a large number of volumes, the snippet linked above, which determines the number of records in the first shard, can be time-consuming. Alternatives:

  • Combine n_volumes with the number of files matching file_pattern to calculate len(first_shard).
  • Provide metadata with the dataset: the number of volumes per shard and the total number of volumes in the dataset.
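
A minimal sketch of the first alternative, not the current nobrainer implementation. It assumes the volumes were distributed across shards as evenly as possible, with earlier shards receiving any extra volume (the way `numpy.array_split` divides a list), and that `file_pattern` matches local files. The function names `count_shards` and `records_in_first_shard` are hypothetical.

```python
import glob


def count_shards(file_pattern):
    """Count the shard files matching a local glob pattern.

    Assumption: shards live on the local filesystem. Remote paths
    (for example cloud buckets) would need a different globbing call.
    """
    return len(glob.glob(file_pattern))


def records_in_first_shard(n_volumes, n_shards):
    """Estimate len(first_shard) without reading any records.

    Assumption: volumes are split as evenly as possible across shards,
    so the first shard holds ceil(n_volumes / n_shards) records.
    """
    if n_volumes < 0:
        raise ValueError("n_volumes must be non-negative")
    if n_shards <= 0:
        raise ValueError("n_shards must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return -(-n_volumes // n_shards)
```

If the shards were instead written with a fixed number of volumes per shard and only the last shard is short, this estimate can be too low (100 volumes in shards of 30 gives 4 shards, but the formula returns 25), which is one argument for storing the per-shard count as metadata.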

hvgazula, Apr 20 '24 03:04