ray_shuffling_data_loader
A Ray-based data loader with per-epoch shuffling and configurable pipelining, for shuffling and loading training data in distributed training of machine learning models.
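The core idea of per-epoch shuffling is that each epoch iterates over the same records in a fresh, reproducible random order. The library's actual API is not shown here; the following is a minimal stand-alone sketch of that concept, with all names (`epoch_batches`, `base_seed`) being illustrative assumptions.

```python
import random


def epoch_batches(records, batch_size, epoch, base_seed=0):
    """Yield shuffled batches for one epoch.

    Each epoch derives its own deterministic permutation from
    (base_seed, epoch), so re-running an epoch reproduces the same
    order while different epochs see different orders.
    """
    indices = list(range(len(records)))
    # Mix the epoch into the seed so each epoch gets a distinct permutation.
    random.Random(base_seed * 100003 + epoch).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [records[i] for i in indices[start:start + batch_size]]
```

In the real loader this shuffling happens in parallel across Ray tasks and is pipelined with training, but the per-epoch reseeding pattern is the same.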
Signed-off-by: Richard Liaw
Add unit tests for the following:

1. Shuffler (`shuffle.py`)
2. `BatchQueue` (`batch_queue.py`)
3. `ShufflingDataset` (`dataset.py`)
4. `TorchShufflingDataset` (`torch_dataset.py`)

### Implementation Notes

- `ShufflingDataset` and `TorchShufflingDataset` testing could initially be as...
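As a starting point, a `BatchQueue` test could assert FIFO ordering of produced batches. The sketch below uses the standard library's `queue.Queue` as a stand-in, since `BatchQueue`'s actual interface isn't shown here; a real test would import it from `batch_queue.py`.

```python
import queue


def test_batch_queue_fifo_ordering():
    # Stand-in for BatchQueue: verify batches come out in the order
    # they were put in, and the queue drains to empty.
    q = queue.Queue(maxsize=2)
    q.put("batch-0")
    q.put("batch-1")
    assert q.get() == "batch-0"
    assert q.get() == "batch-1"
    assert q.empty()
```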
Right now, the number of shuffle mappers is equal to the number of input files, with each mapper reading a single input file. However:

1. For very large files, we...
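One way to decouple mapper count from file count is to plan byte-range shards per file, so a very large file feeds several mappers. This is a hypothetical planning helper, not the project's implementation; `plan_shards` and `target_shard_bytes` are assumed names.

```python
def plan_shards(file_sizes, target_shard_bytes):
    """Split each file into byte-range shards of at most target size.

    Returns (file_index, start, end) tuples; a file larger than the
    target produces multiple shards, each assignable to its own mapper.
    """
    shards = []
    for i, size in enumerate(file_sizes):
        start = 0
        while start < size:
            end = min(start + target_shard_bytes, size)
            shards.append((i, start, end))
            start = end
    return shards
```

For formats like Parquet, the split boundaries would follow row groups rather than raw byte offsets, but the planning shape is the same.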
We need to build a connector to a TF dataset iterator. Impl idea from @clarkzinzow: We'd take the base shuffling dataset and create a `ShufflingTFDataset` that converts each batch dataframe to feature...
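The conversion step could be a generator that flattens each batch into `(features, label)` tuples, which is the shape `tf.data.Dataset.from_generator` consumes. To keep this sketch self-contained it models a batch as a dict of column lists rather than a pandas DataFrame; the function and column names are illustrative assumptions, not the proposed API.

```python
def batches_to_examples(batches, feature_cols, label_col):
    """Flatten batch "dataframes" (dicts of column lists) into
    (features, label) tuples, row by row.

    A real ShufflingTFDataset would wrap a generator like this with
    tf.data.Dataset.from_generator, declaring output signatures.
    """
    for batch in batches:
        n_rows = len(batch[label_col])
        for row in range(n_rows):
            features = tuple(batch[col][row] for col in feature_cols)
            yield features, batch[label_col][row]
```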
Add a user guide to the docs. This should include:

- [ ] Walkthrough of installation, setup, and an example.
- [ ] Cataloguing of all configuration details, e.g. what to set `num_reducers`...