Karl Higley
Thinking about it, I wonder if having a `dask` sub-package outside of `utils` would make sense? Maybe the thing to do is migrate parts of `io` that are for Dask...
I don't have a clear line in mind, but having looked at what's in `io`, it seemed like `DaskSubgraph`, `DataFrameIter`, and `shuffle.py` are pure Dask code that might be applicable...
I think this is because `repartition` creates a brand new `Dataset` object which then tries to infer a schema from the raw data all over again, but it shouldn't be...
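For illustration, here's a minimal toy sketch of the behavior I suspect. `ToyDataset`, `repartition_reinfer`, and `repartition_keep_schema` are hypothetical names made up for this example, not the actual `Dataset` code; the sketch only assumes that a constructor called without a schema infers one from the raw data.

```python
# Hypothetical stand-in class for illustration only; this is NOT the real Dataset.
class ToyDataset:
    infer_calls = 0  # counts how many times schema inference has run

    def __init__(self, rows, schema=None):
        self.rows = rows
        # Infer a schema only when the caller did not supply one.
        self.schema = schema if schema is not None else self._infer_schema(rows)

    @classmethod
    def _infer_schema(cls, rows):
        cls.infer_calls += 1
        first = rows[0] if rows else {}
        return {name: type(value).__name__ for name, value in first.items()}

    def repartition_reinfer(self):
        # Builds a brand new object without a schema, so inference runs again
        # and any changes made to self.schema are lost.
        return ToyDataset(self.rows)

    def repartition_keep_schema(self):
        # Passes a copy of the existing schema along, so nothing is re-inferred.
        return ToyDataset(self.rows, schema=dict(self.schema))


ds = ToyDataset([{"user_id": 1, "score": 0.5}])
ds.schema["user_id"] = "categorical"  # a manual schema customization

lost = ds.repartition_reinfer()      # schema is inferred from the raw rows again
kept = ds.repartition_keep_schema()  # schema is carried over unchanged
```

In this toy version, `lost.schema` falls back to whatever inference produces from the raw rows, while `kept.schema` preserves the customization, which is the difference I'd expect a fix to make.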
I think @sararb fixed this issue in #192
I think core might be the only oddball repo without them: https://github.com/NVIDIA-Merlin/systems/tree/main/.github/ISSUE_TEMPLATE