Benjamin Bossan
Hmm, I can't reproduce this. Here is the script that I used:
```python
import pickle
import sys
import numpy as np
import torch
from sklearn.datasets import make_classification
from torch import...
```
Thanks for the reproducer. I could verify that this fails at loading the model on a CPU machine. I tried to debug a little bit and it appears that when...
Thanks for providing further information. Without digging deeper: When pickling, skorch checks for attributes with a CUDA dependency, pops them from the pickle state, and saves them in a way that allows...
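The general pattern described here (pop device-bound attributes from the pickle state, store them separately, restore them at load time) can be sketched as follows. This is only an illustration under stated assumptions, not skorch's actual code: `FakeCudaTensor`, `Net`, the `__device_bound__` key, and the helper functions are all hypothetical names, and the standard library stands in for torch.

```python
import pickle


class FakeCudaTensor:
    """Hypothetical stand-in for a device-bound object (not a real torch type)."""

    def __init__(self, values, device="cuda"):
        self.values = list(values)
        self.device = device


def dump_device_bound(obj):
    # Keep only the raw values, so the payload carries no device information.
    return pickle.dumps(obj.values)


def load_device_bound(payload, device):
    # Re-create the object on whatever device is available at load time.
    return FakeCudaTensor(pickle.loads(payload), device=device)


class Net:
    """Hypothetical estimator holding one plain and one device-bound attribute."""

    def __init__(self):
        self.lr = 0.01
        self.weights_ = FakeCudaTensor([1.0, 2.0, 3.0])

    def __getstate__(self):
        state = self.__dict__.copy()
        # Pop every device-bound attribute out of the regular state ...
        bound = {
            key: state.pop(key)
            for key in list(state)
            if isinstance(state[key], FakeCudaTensor)
        }
        # ... and store it separately in device-neutral form.
        state["__device_bound__"] = {
            key: dump_device_bound(val) for key, val in bound.items()
        }
        return state

    def __setstate__(self, state):
        state = dict(state)
        payloads = state.pop("__device_bound__")
        self.__dict__.update(state)
        for key, payload in payloads.items():
            # Assumption: the loading machine only has a CPU.
            setattr(self, key, load_device_bound(payload, device="cpu"))


# pickle calls __getstate__/__setstate__ automatically; they are invoked by
# hand here so the sketch does not depend on how this module is imported.
saved = pickle.dumps(Net().__getstate__())
restored = Net.__new__(Net)
restored.__setstate__(pickle.loads(saved))
```

If the restore step assumed the original device instead of mapping to the one available at load time, loading on a CPU-only machine is where such a scheme would be expected to break.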
Thanks for the quick response. It can indeed be a bit confusing which axis the DoRA scaling should be applied along, especially with the transpose operation that's implicit in the...
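To make the axis question concrete, here is a small pure-Python sketch of how the norm of a weight matrix differs depending on the axis it is taken over, and how a transpose swaps the two. The matrix values and function names are made up for illustration, and the sketch makes no claim about which axis the DoRA paper or any implementation uses. The only outside fact assumed is that `torch.nn.Linear` stores its weight with shape `(out_features, in_features)`.

```python
import math

# Hypothetical weight with out_features=2 and in_features=3, laid out the way
# torch.nn.Linear stores it: (out_features, in_features).
weight = [
    [3.0, 0.0, 4.0],
    [0.0, 5.0, 12.0],
]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def norm_per_row(matrix):
    # One L2 norm per row: here, one value per output feature.
    return [math.sqrt(sum(x * x for x in row)) for row in matrix]


def norm_per_column(matrix):
    # One L2 norm per column: here, one value per input feature.
    return norm_per_row(transpose(matrix))


row_norms = norm_per_row(weight)     # length 2 (out_features)
col_norms = norm_per_column(weight)  # length 3 (in_features)

# If the same matrix is written transposed, as (in_features, out_features),
# "per column" there selects the same vectors as "per row" here.
col_norms_of_transposed = norm_per_column(transpose(weight))
```

The two choices produce scaling vectors of different lengths, so which one counts as "column-wise" depends entirely on the layout convention in use.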
I agree that breaking the existing method is not a good idea. Whether adding a new option to use the other axis is worth it, I don't know. I dug...
Maybe @nbasyl can comment on the notation and if it would make sense to have an option to swap the axis.
I have very little experience with Google Colab or XLA, but to me this looks like a PyTorch-XLA error and not something specific to the Accelerate notebook launcher or even Accelerate...
Thanks for testing again. I agree that it's strange that the errors are random and that this could be caused by a race condition. I asked internally if there is...
Do you really need to call `prepare` multiple times? You should be able to run `prepare` in a single call, right?
```python
return_values = self.accelerator.prepare(*accelerator_to_prepare.values())
for k, val in zip(accelerator_to_prepare.keys(),...
```
The DeepSpeed init logic is probably not easy to fix, but I'll wait for Zach's return to comment on that. Regarding the docs, yes, it should probably be highlighted that...