Sylvain Gugger
While I understand the idea of grouping related arguments together, the proposed approach is very functional in style, which is not something we use anywhere in the Transformers library. So this API...
Hi @apohllo Sorry for the delay on this. Would something like the approach in the PR linked above work for you?
Could you please share a snippet of code that fails on such an env with `device_map="auto"` passed to `from_pretrained`? This loads the model directly on the GPU (as long as...
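For illustration, a minimal load with `device_map="auto"` might look like this (the checkpoint name is just a placeholder, and `accelerate` needs to be installed):

```py
from transformers import AutoModelForCausalLM

# device_map="auto" dispatches the weights across the available GPUs (and the
# CPU if needed) while loading, instead of materializing the whole model on
# CPU first. The model name below is only a placeholder.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b",
    device_map="auto",
)
```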
I think you are missing a `torch_dtype=torch.float16` or `torch_dtype=torch.bfloat16` to get down to 12GB of memory use. Otherwise the model will need 24GB of memory if it has 6B parameters (the default...
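As a quick sketch of what that looks like (the checkpoint name is again just a placeholder):

```py
import torch
from transformers import AutoModelForCausalLM

# 6B parameters x 4 bytes (float32) is roughly 24GB; loading in
# float16/bfloat16 (2 bytes per parameter) halves that to roughly 12GB.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b",
    torch_dtype=torch.float16,  # or torch.bfloat16
    device_map="auto",
)
```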
Can you try to see if adding a garbage collection call helps?

```py
import gc

gc.collect()
```

There is no reason for the CPU RAM to be used once...
Mmm, diving into the reproducer @muellerzr, it looks like memory is not released by PyTorch when moving the model to a device:

```py
import psutil, torch
from transformers import AutoModelForCausalLM
...
```
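The full reproducer is truncated above; a sketch of what such a memory check might look like (model name and details here are my own assumptions, not the original snippet):

```py
import gc
import psutil
import torch
from transformers import AutoModelForCausalLM

def cpu_ram_gb():
    # Resident set size of the current process, in GB.
    return psutil.Process().memory_info().rss / 1024**3

print(f"before load:  {cpu_ram_gb():.2f} GB")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # small placeholder model
print(f"after load:   {cpu_ram_gb():.2f} GB")

# Move the weights to the GPU; ideally the CPU copies would be freed here.
model = model.to("cuda")
gc.collect()
print(f"after .to():  {cpu_ram_gb():.2f} GB")
```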
Please make sure to run `make style` on your branch so that the quality tests pass. cc @gante for review.
Please keep each of your PRs focused on one thing. We don't want to group changes that are not linked to each other in the same PR :-)
It does look like the model code is exactly the same at first glance (I saw everything is copied from ConvNext). If that is the case, yes to re-using the...
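For reference, the `# Copied from` convention in Transformers looks roughly like this (the class names below are hypothetical, purely for illustration):

```py
import torch.nn as nn

# The repository's consistency check keeps a class marked this way in sync
# with the referenced source, applying the name substitution after "with".
# Copied from transformers.models.convnext.modeling_convnext.ConvNextLayerNorm with ConvNext->NewModel
class NewModelLayerNorm(nn.LayerNorm):
    pass
```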
The main commit message is the title of the PR.