Joao Gante

27 issues by Joao Gante

# What does this PR do? As discussed in https://github.com/huggingface/transformers/issues/18476 and https://github.com/huggingface/transformers/issues/18239, there are two problems while training DeBERTa v2 with TensorFlow: 1. `TFDebertaV2StableDropout` doesn't work at training time (actually,...

**Is your feature request related to a problem? Please describe.** We have `push_to_hub_keras` to push Keras models. However, HF transformer architectures have a more comprehensive model hub push in `save_pretrained`....

# What does this PR do? Adds the same [check that was recently added to TFBart](https://github.com/huggingface/transformers/blob/ba7f2173cc578fe6d9f1cdb900d5af609f195cf6/src/transformers/models/bart/modeling_tf_bart.py#L751), which asserts that the inputs are within the embedding input range, in all models...
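A minimal sketch of the kind of check this excerpt describes, written in plain Python rather than TensorFlow. The function name `check_embedding_inputs` and its parameters are hypothetical and are not taken from the linked TFBart code; the sketch only illustrates the idea of rejecting token ids that fall outside the embedding table before a lookup is attempted.

```python
from typing import Sequence


def check_embedding_inputs(input_ids: Sequence[Sequence[int]], vocab_size: int) -> None:
    """Raise ValueError if any token id is outside the range [0, vocab_size).

    Hypothetical helper: `vocab_size` stands in for the number of rows in the
    embedding matrix (the embedding layer's input dimension).
    """
    for row_idx, row in enumerate(input_ids):
        for col_idx, token_id in enumerate(row):
            if not 0 <= token_id < vocab_size:
                raise ValueError(
                    f"input_ids[{row_idx}][{col_idx}] = {token_id} is outside "
                    f"the embedding input range [0, {vocab_size})"
                )


# Valid batch: every id is a legal row index into a 3-row embedding table.
check_embedding_inputs([[0, 1, 2], [2, 1, 0]], vocab_size=3)
```

Failing early with an explicit message is the point of such a check: the error names the offending position instead of surfacing later as a harder-to-trace lookup problem.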

### System Info - `transformers` version: 4.22.0.dev0 - Platform: Linux-5.15.0-33-generic-x86_64-with-glibc2.35 - Python version: 3.8.13 - Huggingface_hub version: 0.9.0 - PyTorch version (GPU?): 1.12.0+cu116 (True) - Tensorflow version (GPU?): 2.9.1 (True)...

bug

# What does this PR do? As the title describes: add a warning when left padding should be used. Incorrect use of right padding is detected when: 1. the model...

# What does this PR do? This PR adds the TF `compute_transition_scores`, akin to PT's #21191. What seemingly started off as a simple task ended up being a complex task...

# Description Links in updated markdown open in a new tab :D See the original issue for more info. Closes: #3234. NOTES/QUESTIONS: 1. Regarding testing, the instructions in CONTRIBUTING.md seem...

### Describe the bug EDIT: PR #3236 is a proposal on how to fix the issue. Normally, a link in a Markdown block opens in a new tab. However,...

bug

- [x] I have searched to see if a similar issue already exists. **Is your feature request related to a problem? Please describe.** A common feature related to models with...

enhancement

# What does this PR do? Trainer's `predict_with_generate` seems to have been designed for an older version of `.generate()`, where manual selection of the inputs was needed. The current version...