Wing Lian

Showing 103 comments by Wing Lian

Did you try upgrading or downgrading bitsandbytes?
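For example (version numbers here are illustrative, not known-good pins; pick whatever matches your CUDA setup):

```
pip install -U bitsandbytes          # try the latest release
pip install bitsandbytes==0.41.3     # or pin an older one to bisect the regression
```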

Thanks @younesbelkada! I'll open up another PR with just the validation and training args pieces and wait for the upstream integration. Much appreciated!

Superseded by #1409. Thanks for getting this rolling @maximegmd. Props to @younesbelkada for getting this working upstream in transformers.

@dctanner you somehow had some already-merged PRs in your branch, so I re-pushed your commit onto a rebased main.

Do you have a folder in your working directory called datasets?
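A quick way to check is to see where Python resolves the `datasets` import from; a local `datasets/` folder in the working directory can shadow the Hugging Face package (a minimal sketch, assuming that shadowing is the issue):

```
import datasets

# A local folder without __init__.py imports as a namespace package,
# in which case __file__ is None; __path__ shows where it came from.
# If these point into your working directory rather than site-packages,
# rename the folder or run from a different directory.
print(getattr(datasets, "__file__", None))
print(list(getattr(datasets, "__path__", [])))
```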

> @winglian I suggest you put a targeted speedup on what qualifies as "optimized". Who knows, maybe `torch.compile` used the right way can generate your definition of "optimized" :) and...
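For context, the baseline `torch.compile` usage under discussion is a one-liner (a minimal sketch, assuming PyTorch 2.x; the model here is a stand-in):

```
import torch
import torch.nn as nn

model = nn.Linear(512, 512)          # stand-in for the real model
compiled = torch.compile(model)      # PyTorch 2.x; actual speedup is workload-dependent
out = compiled(torch.randn(8, 512))
```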

Are you using a model from a checkpoint folder or the output folder?

> Using `transformers @ git+https://github.com/huggingface/transformers.git@3cefac1d974db5e2825a0cb2b842883a628be7a0` seems to work.

@mgoulao is this a transformers regression then? Does that particular commit work with zero3?
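To reproduce with that pin, the install is just (commit hash taken verbatim from the report above):

```
pip install "transformers @ git+https://github.com/huggingface/transformers.git@3cefac1d974db5e2825a0cb2b842883a628be7a0"
```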

I believe the problem is that the model's modules are all frozen and have `requires_grad` set to `False`. You can verify this with:

```
for name, param in model.named_parameters(recurse=True):
    print(f"{name}: {param.requires_grad}")
```
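If the goal is to train some of those modules, the counterpart is to flip the flag back on; a hedged sketch (the name filter is illustrative, adjust it to the modules you actually train):

```
for name, param in model.named_parameters():
    if "lora_" in name:              # illustrative filter, e.g. PEFT LoRA adapter weights
        param.requires_grad_(True)
```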