Add a dataset module

Open nudles opened this issue 4 years ago • 2 comments

Data loading is an important part of DL training; if it is not implemented well, it can be slow and become a bottleneck. The tasks include:

  1. Implement dataset classes for common benchmark datasets so that they are easy to access within SINGA (e.g., without manual downloading).
  2. Implement common preprocessing operations.
  3. Implement parallel data loading for higher efficiency.
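
To make the three tasks concrete, here is a minimal sketch of how they could fit together. It uses only the Python standard library and is not SINGA's API: the names `Dataset`, `InMemoryDataset`, `scale_to_unit`, and `iter_batches` are hypothetical, and the in-memory data stands in for a real benchmark dataset that would be downloaded and cached.

```python
from concurrent.futures import ThreadPoolExecutor


class Dataset:
    """Hypothetical base class: an indexable collection of (sample, label) pairs."""

    def __len__(self):
        raise NotImplementedError

    def __getitem__(self, index):
        raise NotImplementedError


class InMemoryDataset(Dataset):
    """Toy stand-in for a benchmark dataset (task 1); a real one would fetch files."""

    def __init__(self, samples, labels, transform=None):
        assert len(samples) == len(labels)
        self.samples = samples
        self.labels = labels
        self.transform = transform  # optional preprocessing callable (task 2)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        x = self.samples[index]
        if self.transform is not None:
            x = self.transform(x)
        return x, self.labels[index]


def scale_to_unit(x):
    """Example preprocessing op: map 0..255 pixel values to 0..1."""
    return [v / 255.0 for v in x]


def iter_batches(dataset, batch_size, num_workers=4):
    """Yield (samples, labels) batches, fetching items with a thread pool (task 3)."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for start in range(0, len(dataset), batch_size):
            indices = range(start, min(start + batch_size, len(dataset)))
            # pool.map preserves the order of indices within the batch.
            items = list(pool.map(dataset.__getitem__, indices))
            xs = [x for x, _ in items]
            ys = [y for _, y in items]
            yield xs, ys


if __name__ == "__main__":
    ds = InMemoryDataset([[0, 255], [51, 102], [255, 0]], [0, 1, 2],
                         transform=scale_to_unit)
    for xs, ys in iter_batches(ds, batch_size=2, num_workers=2):
        print(xs, ys)
```

Threads mainly help when loading is I/O-bound (reading or decoding files); for CPU-heavy preprocessing in pure Python, a process pool may be the better choice.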

nudles avatar May 17 '20 08:05 nudles

Code from the data module may be reused. https://github.com/apache/singa/blob/master/python/singa/data.py

nudles avatar May 17 '20 09:05 nudles

And https://github.com/apache/singa/blob/master/python/singa/image_tool.py

nudles avatar May 17 '20 09:05 nudles