Sylvain Gugger

633 comments by Sylvain Gugger

The same comment as above is still true.

Hi there! 1. You can do whatever you want since Accelerate will adapt to your training loop :-) 2. This is completely untested, so I can't guarantee it will work....

@jianguoz It's not a priority for now, as we have no means of testing the solution (our request to get access to a free small TPU pod to maintain Accelerate...

No one is working on it for now, so if you want to tackle this, feel free to give it a go!

That's because PyTorch does not let you load an individual weight from a state dict: the whole dict is pickled as a single object.
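As a rough illustration of that constraint, here is a minimal sketch using only Python's standard `pickle` module as a stand-in for a saved checkpoint. It is an analogy, not PyTorch code: the dict contents, the key names and the `load_single_weight` helper are all hypothetical, and plain lists replace tensors.

```python
import io
import pickle

# Hypothetical stand-in for a model state dict: names mapped to plain lists.
state_dict = {
    "encoder.weight": [0.1, 0.2, 0.3],
    "encoder.bias": [0.0],
    "decoder.weight": [0.4, 0.5],
}

# The whole dict is serialized as one pickle stream.
buffer = io.BytesIO()
pickle.dump(state_dict, buffer)


def load_single_weight(stream, name):
    """Return one entry plus the number of entries that had to be loaded."""
    stream.seek(0)
    full = pickle.load(stream)  # pickle has no option to load a single key
    return full[name], len(full)


weight, entries_loaded = load_single_weight(buffer, "encoder.bias")
print(weight, entries_loaded)
```

Even though only one entry is requested, every entry is deserialized before the lookup can happen, which is why reading a single weight costs as much as reading them all.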

There is no example yet. If you want to contribute one, by all means :-)

The model is used on our side on the inference API with the same options and without any memory leak, so I suspect the memory leak comes from somewhere else...