Sylvain Gugger
Re-ping of @ArthurZucker
There is only one call to `head` now once the model is cached, @Narsil
Thanks again for all your work on this!
You are not using the [`ddp_timeout`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments.ddp_timeout) training argument to set a value higher than the default 30 minutes, so if you have a big dataset to preprocess, you get this error. Use...
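For reference, something along these lines should work (a minimal sketch; the output directory, batch size, and timeout value are just placeholders):

```python
from transformers import TrainingArguments

# Raise the DDP timeout so long dataset preprocessing on rank 0 does not hit
# the default 30-minute limit. All values below are placeholders.
training_args = TrainingArguments(
    output_dir="my-model",
    per_device_train_batch_size=8,
    ddp_timeout=7200,  # in seconds, so 2 hours instead of the default 1800
)
```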
If you use `torch.distributed.launch` and the `ddp_timeout` you set is not respected, it sounds like a bug in PyTorch ;-)
cc @Rocketknight1
Let's maybe wait for the LLaMa PR to be merged first?
`load_checkpoint_and_dispatch` is intended for naive model parallelism and is not compatible with DeepSpeed.
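For context, the intended use looks roughly like this (a sketch with Accelerate; the checkpoint path and model class are placeholders):

```python
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM

# Instantiate the model without allocating weights, then load the checkpoint
# and dispatch its layers across the available GPUs/CPU (naive model parallelism).
config = AutoConfig.from_pretrained("path/to/model")  # placeholder path
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

model = load_checkpoint_and_dispatch(
    model,
    checkpoint="path/to/model",  # placeholder: folder with the (sharded) weights
    device_map="auto",           # split layers across devices automatically
)
```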
Yes, this model is not compatible with torchscript, cc @ArthurZucker
cc @gante