Chris Hughes
Purely from an end-user point of view, another look at how distributed training is implemented. Consider the situation in which we have a training script `train.py` which trains a...
Hi @sgugger, was there a reason that this was closed and not merged? I came across this when I was about to request the same thing.
@NouamaneTazi These are really interesting insights! If you are looking at inspecting signatures, do you think it would be too much complexity to set non-blocking automatically based on whether it...
> @Chris-hughes10 CPU -> GPU and GPU -> CPU both lead to the same issues as mentioned above. Only GPU -> GPU is the safe operation but as I said...
Hey @muellerzr, no worries at all, glad that you are interested. For me, it would be better if you didn't have to use the context manager at all, and it...
I agree that the small wrapper would be easier to implement to begin with, and I would be happy with this as a starting point, but I also think that...
It wouldn't support sharded datasets either, right? As that will still pad or drop even if it isn't dispatched?
I am aware that there are a lot of parts here for what seems a small change xD. But, in my opinion, it would be very confusing to add this...
Hey @sgugger, I completely appreciate where you are coming from; I am just genuinely curious about which parts of the library would be impacted by having an unequal number of...