Erjia Guan

170 comments by Erjia Guan

Let's keep it as it is. This can be treated as an overall tracker. Those three are sub issues.

> LGTM! Question: do we expect any case where the `datapipe` being passed in already has a wrapper? It's more of a future-proofing measure. (Technically, if users manually `deserialize` the...

> Is there a way to modify the `@functional_datapipe` decorator code at https://github.com/pytorch/pytorch/blob/664058fa83f1d8eede5d66418abff6e20bd76ca8/torch/utils/data/datapipes/_decorator.py#L11-L38 to include the docstring? I think [`functools.wraps`](https://docs.python.org/3/library/functools.html#functools.wraps) is able to do it for Python functions, but not...

I think it might be doable by adding a custom wrapper around those `functional` APIs that copies or references the docstring from the class by implementing a custom `__doc__`...
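As a rough illustration of the idea in the comment above, here is a minimal sketch of a functional-style wrapper that borrows its docstring from the class it constructs. The names `functional_api`, `Mapper`, and `map_items` are hypothetical and used only for this example; this is not the actual `functional_datapipe` implementation, and it copies `__doc__` directly rather than implementing a custom `__doc__` attribute.

```python
def functional_api(cls):
    """Hypothetical helper: return a function that builds `cls` and reuses its docstring."""

    def wrapper(*args, **kwargs):
        # Forward all arguments to the class constructor.
        return cls(*args, **kwargs)

    # Copy the metadata by hand so help(wrapper) shows the class documentation.
    wrapper.__name__ = cls.__name__.lower()
    wrapper.__doc__ = cls.__doc__
    return wrapper


class Mapper:
    """Apply `fn` to every item of `source`."""

    def __init__(self, source, fn):
        self.source = source
        self.fn = fn

    def __iter__(self):
        return (self.fn(item) for item in self.source)


# Hypothetical functional form of Mapper.
map_items = functional_api(Mapper)
```

With this sketch, `help(map_items)` would show the docstring written on `Mapper`, so the documentation only needs to be maintained in one place.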

> `RuntimeError: Failed to open the input "StreamWrapper" (Invalid data found when processing input).` Based on the traceback, I think it's about what input type `torchaudio` expects. It...

> It gets trickier when you are restoring the buffers of container DataPipes with children (e.g. `demux`, `fork`). Sometimes one child may reach `StopIteration` and the other child hasn't. Yeah,...

If you call `dl.shutdown()` at the end, does the problem persist?

What we might be able to do is register all cleanup functions with `atexit`. Technically, this should guarantee that the cleanup functions are called before Python exits.
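A minimal sketch of that approach, using only the standard-library `atexit` module. `CleanupRegistry` and `FakeLoader` are hypothetical names made up for this example (`FakeLoader` merely stands in for an object that has a `shutdown()` method); nothing here is taken from the library's code.

```python
import atexit


class CleanupRegistry:
    """Hypothetical registry: collects cleanup callbacks and runs them once."""

    def __init__(self):
        self._callbacks = []
        self._done = False
        # atexit calls run_all when the interpreter terminates normally.
        atexit.register(self.run_all)

    def register(self, fn, *args, **kwargs):
        self._callbacks.append((fn, args, kwargs))
        return fn

    def run_all(self):
        # Guard so a manual call followed by the atexit call runs callbacks only once.
        if self._done:
            return
        self._done = True
        # Run in reverse registration order, mirroring how atexit itself orders handlers.
        while self._callbacks:
            fn, args, kwargs = self._callbacks.pop()
            try:
                fn(*args, **kwargs)
            except Exception as exc:
                # Keep going so one failing cleanup does not block the others.
                print(f"cleanup {fn!r} failed: {exc}")


class FakeLoader:
    """Hypothetical stand-in for an object exposing shutdown()."""

    def __init__(self):
        self.closed = False

    def shutdown(self):
        self.closed = True


registry = CleanupRegistry()
loader = FakeLoader()
registry.register(loader.shutdown)
```

Note that `atexit` handlers run only on normal interpreter termination. They are skipped if the process is killed by an unhandled signal or exits through `os._exit()`, so an explicit `shutdown()` call is still the more reliable path.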

Thank you for opening the issue. The reason `demux`'s buffer blows up is that we yield cached data first, then `todo`. See: https://github.com/pytorch/data/blob/983e87ada583b7a58d13a1a5f047dd9d256155dd/torchdata/datapipes/iter/util/cacheholder.py#L424 However, it seems weird to me...

> Personally I think I would expect caching to be FIFO with respect to the source datapipe, in order to be as deterministic as possible. Agree. Will take a look...