[DataPipe] DataPipe Deprecation Tracker
We have a number of DataPipes that are being deprecated. Our general policy is that we first mark the DataPipe as deprecated with a warning, and wait at least one release cycle (~3 months) before removing it. Note that some DataPipes will be removed from the PyTorch Core library but will remain in TorchData, and some others are renamed.
Status Types:
- Deprecated - marked as deprecated with a warning
- Removed - removed from repository
DataLoader2
Tracker
Name | Deprecation Date | Status | Earliest Removal Version |
---|---|---|---|
PrototypeMultiProcessingReadingService -> MultiProcessingReadingService | 0.6 | Deprecated | 0.8 |
IterDataPipe
Tracker
Name | Functional API | Module | Deprecation Date | Status | Earliest Removal Version |
---|---|---|---|---|---|
BucketBatcher | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
HTTPReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
LineReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
TarArchiveReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
ZipArchiveReader | NA | Core | Sep 30th, 2021 | Removed (moved to TorchData) | |
FileLoader | NA | Core | Jan 5th, 2022 | Removed (use FileOpener) | 1.13 (Sept 2022) |
FileLoader | NA | Data | Jan 5th, 2022 | Removed (use FileOpener) | |
IoPathFileLoader | load_file_by_iopath | Data | Jan 5th, 2022 | Removed (use IoPathFileOpener) | |
RoutedDecoder | routed_decode | Core | Jan 10th, 2022 | Deprecated | 1.13 (Sept 2022) |
TarArchiveReader | read_from_tar | Data | Feb 22nd, 2022 | Removed (use TarArchiveLoader) | 0.5 (Sept 2022) |
XzFileReader | read_from_xz | Data | Feb 22nd, 2022 | Removed (use XzFileLoader) | 0.5 (Sept 2022) |
ZipArchiveReader | read_from_zip | Data | Feb 22nd, 2022 | Removed (use ZipArchiveLoader) | 0.5 (Sept 2022) |
Filter | filter | Core | 1.12 | Removed argument (drop_empty_batches) | 2.0 (Nov 2022) |
FSSpecFileOpener | open_files_by_fsspec | Data | 0.4 | open_file_by_fsspec is Removed | 0.6 (Nov 2022) |
IoPathFileOpener | open_files_by_iopath | Data | 0.4 | open_file_by_iopath is Removed | 0.6 (Nov 2022) |
MapDataPipe
Tracker
Nothing for now
cc: @ejguan @VitalyFedyunin @NivekT
For TarArchiveReader, should we add a deprecation warning in the main branch now that the 0.3.0 branch cut has been finished?
Another Misc tracker:
Name | Module | Deprecation Version | Status | Earliest Removal Version |
---|---|---|---|---|
torch.utils.data.graph.traverse | Core | 1.13 | Deprecating | 1.15 / 2.1 |
I see RoutedDecoder has been marked as deprecated: what is it going to be replaced by?
@BlueskyFR IIRC, we plan to remove this DataPipe in the future. The general reason is that we think the same result can be easily achieved by using a demux based on file types, then decoding each resulting DataPipe correspondingly, then using mux to merge them back together. Glad to hear your use case.
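To make the demux, decode, mux idea concrete, here is a minimal library-free sketch of the same routing. It is not the TorchData implementation: the DECODERS table, classify, and routed_decode_sketch are hypothetical names, and plain lists stand in for DataPipes. With real DataPipes the three steps would map onto demux, map on each branch, and mux.

```python
import json
import os
import pickle
from itertools import chain, zip_longest

# Hypothetical decoders keyed by file extension; any callable taking bytes works.
DECODERS = {
    ".json": lambda raw: json.loads(raw.decode("utf-8")),
    ".pkl": pickle.loads,
}


def classify(item):
    """Return the file extension of a (path, bytes) pair (the classifier role in demux)."""
    path, _ = item
    return os.path.splitext(path)[1]


def routed_decode_sketch(items):
    """Split (path, bytes) pairs by extension, decode each branch, then interleave."""
    # "demux": one branch per known extension.
    branches = {ext: [] for ext in DECODERS}
    for item in items:
        branches[classify(item)].append(item)

    # Decode every branch with its own function (the role of map on each branch).
    decoded = [
        [(path, DECODERS[ext](raw)) for path, raw in branch]
        for ext, branch in branches.items()
    ]

    # "mux": take one element from each branch in turn, skipping exhausted branches.
    missing = object()
    interleaved = chain.from_iterable(zip_longest(*decoded, fillvalue=missing))
    return [x for x in interleaved if x is not missing]


samples = [
    ("a.json", b'{"x": 1}'),
    ("b.pkl", pickle.dumps([1, 2])),
    ("c.json", b'{"x": 2}'),
]
result = routed_decode_sketch(samples)
```

Supporting another file type here only means adding one entry to the table, which is the kind of per-project customization the reply suggests leaving to users.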
I don't understand: how should I proceed to decode a PNG image in the current state then?
You can use a map function like datapipe.map(decode_fn) to decode the PNG image.
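As a rough illustration of what such a decode_fn could look like, the sketch below assumes each item is a (path, binary stream) pair, which is the shape a file-opening DataPipe is expected to yield. To stay dependency-free it only parses the PNG header for width and height; a real decode_fn would more likely hand the bytes to an imaging library. The fake_png bytes and the file name are made up for the example.

```python
import io
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_fn(item):
    """Stand-in decoder: read only the PNG header and return (path, (width, height))."""
    path, stream = item
    # 8-byte signature + 4-byte chunk length + 4-byte chunk type + width + height
    header = stream.read(24)
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a PNG file")
    width, height = struct.unpack(">II", header[16:24])
    return path, (width, height)


# Hypothetical input: just enough of a PNG (signature + start of IHDR) for the header parse.
fake_png = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 3, 2)
files = [("tiny.png", io.BytesIO(fake_png))]

# With a DataPipe this would be datapipe.map(decode_fn); built-in map has the same shape.
decoded = list(map(decode_fn, files))
```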
Okay, but why was support for decoding dropped then?
Decoding didn't do anything more than a map function does, except that we provided a few decoding functions for convenience. And, in order to support routed_decode, we would need to add lots of decoding functions to cover general file decoding, which is not sustainable for us to maintain and makes routed_decode more complicated and redundant. Taking your use case (decoding PNG) as an example, routed_decode would also add decoding handlers such as json, pickle, etc. into this DataPipe, even though you don't need them.
As TorchData provides a composable way to construct pipelines, users should be able to create a pipeline that handles their specific decoding mechanism.
Okay. What is the preferred mechanism to decode images? Ideally I think it should be done in batches if performance is needed
It depends on whether your decode_fn supports high-performance batched decoding (e.g. via multithreading). Otherwise, I think it's going to perform similarly to decoding one image at a time.
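One way to picture the batched variant is a function that decodes a whole batch on a thread pool. This is only a sketch under assumptions: decode_fn here is a hypothetical per-item decoder (JSON stands in for an image decoder), and decode_batch and max_workers are illustrative names, not part of TorchData.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def decode_fn(raw):
    """Hypothetical per-item decoder; stands in for an image decoder."""
    return json.loads(raw)


def decode_batch(batch, max_workers=4):
    """Decode every item of a batch on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(decode_fn, batch))


batch = [b'{"id": 0}', b'{"id": 1}', b'{"id": 2}']

per_item = [decode_fn(raw) for raw in batch]  # one item at a time
batched = decode_batch(batch)                 # whole batch at once
```

In a pipeline this would roughly correspond to batching first and then mapping decode_batch over the batches. Threads only pay off when the underlying decoder releases the GIL (as many native image decoders do); for a pure-Python decoder the two variants should perform about the same, which matches the reply above. Creating a pool per batch also has a cost, so a longer-lived pool may be preferable in practice.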