transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
### System Info Using macOS Big Sur v11.6.4 and JupyterLab v3.4.7. ### Who can help? _No response_ ### Information - [ ] The official example scripts - [x] My...
### Feature request Currently, when we use the [save_pretrained](https://github.com/huggingface/transformers/blob/ca485e562b675341409e3e27724072fb11e10af7/src/transformers/modeling_tf_utils.py#L2085) function from this library, the model signature used to save the model is the default one, which only calls the model...
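A minimal sketch of the kind of control being requested: exporting a TF model with a user-defined serving signature via `tf.saved_model.save`, since `save_pretrained` itself does not currently expose a `signatures` argument. The checkpoint name, output path, and fixed sequence length of 128 are illustrative assumptions.

```python
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# A custom serving function with an explicit input signature (fixed sequence
# length of 128 here, purely for illustration).
@tf.function(input_signature=[{
    "input_ids": tf.TensorSpec((None, 128), tf.int32, name="input_ids"),
    "attention_mask": tf.TensorSpec((None, 128), tf.int32, name="attention_mask"),
}])
def serving_fn(inputs):
    return {"logits": model(inputs).logits}

# Export with the custom signature attached; the feature request is for
# save_pretrained(saved_model=True) to accept something like this directly.
tf.saved_model.save(model, "exported_model", signatures={"serving_default": serving_fn})
```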
### Feature request Hi there! Is there scope for a BPE (SentencePiece) CTC tokenizer? Using a trained SentencePiece vocabulary in a CTC model is pretty straightforward - all we need...
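As a rough illustration of how little is needed, the sketch below builds a CTC-style `vocab.json` from a trained SentencePiece model; the file names and the idea of reusing the Wav2Vec2 CTC vocabulary format are assumptions, not part of the request.

```python
import json
import sentencepiece as spm

# Load a previously trained SentencePiece (BPE) model; "bpe.model" is illustrative.
sp = spm.SentencePieceProcessor()
sp.load("bpe.model")

# Map every piece to its id. A CTC head additionally needs a blank token,
# which in the Wav2Vec2 convention is the pad token.
vocab = {sp.id_to_piece(i): i for i in range(sp.get_piece_size())}

with open("vocab.json", "w", encoding="utf-8") as f:
    json.dump(vocab, f, ensure_ascii=False)
```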
### Feature request Activation checkpointing is implemented for the PyTorch GPT2 model (and its different head variants); however, it is not implemented for the TensorFlow version of GPT2. ### Motivation...
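A minimal sketch of what activation checkpointing could look like on the TensorFlow side, using `tf.recompute_grad` (the rough analogue of `torch.utils.checkpoint`); the `CheckpointedBlock` wrapper is a hypothetical illustration, not the implementation proposed in this request.

```python
import tensorflow as tf

class CheckpointedBlock(tf.keras.layers.Layer):
    """Wraps any layer so its activations are recomputed during backprop."""

    def __init__(self, block, **kwargs):
        super().__init__(**kwargs)
        self.block = block
        # Recompute the wrapped layer's forward pass in the backward pass
        # instead of storing activations, trading compute for memory.
        self._checkpointed = tf.recompute_grad(self.block)

    def call(self, hidden_states, training=False):
        if training:
            return self._checkpointed(hidden_states)
        return self.block(hidden_states)

# Example: wrap a stand-in layer (a real use would wrap a TFGPT2 block).
layer = CheckpointedBlock(tf.keras.layers.Dense(64))
```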
# What does this PR do? Supports auto-compress for the GLUE task.
### Feature request HfArgumentParser now supports parsing dicts and JSON files; would it be possible to also support parsing the widely used YAML files? ### Motivation I think using...
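A hedged sketch of how YAML support could sit on top of the existing `parse_dict` method; the dataclass, the `config.yaml` file name, and the `parse_yaml_file` helper mentioned in the comment are illustrative assumptions.

```python
import yaml
from dataclasses import dataclass
from transformers import HfArgumentParser, TrainingArguments

@dataclass
class ModelArguments:
    model_name_or_path: str = "bert-base-uncased"

parser = HfArgumentParser((ModelArguments, TrainingArguments))

# Load the YAML config and feed it through the existing dict-parsing path;
# a built-in parse_yaml_file helper would essentially wrap these two steps.
with open("config.yaml") as f:
    config = yaml.safe_load(f)

model_args, training_args = parser.parse_dict(config)
```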
This is a rebase and rework of the ESM PR at #13662. The old PR predates the `master -> main` rename and the conversion of the documentation from `.rst` to `.mdx`. As...
# Motivation Add `bnb` support for the ViLT model, as requested by a user in https://github.com/TimDettmers/bitsandbytes/issues/14. This involved adding `accelerate` support for this model. # What does...
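A possible usage sketch once this lands: loading ViLT with 8-bit weights through `bitsandbytes`, assuming the standard `load_in_8bit`/`device_map` loading path and a CUDA GPU; the checkpoint name is illustrative.

```python
from transformers import ViltProcessor, ViltForQuestionAnswering

processor = ViltProcessor.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

# Requires bitsandbytes, accelerate, and a CUDA-capable GPU.
model = ViltForQuestionAnswering.from_pretrained(
    "dandelin/vilt-b32-finetuned-vqa",
    device_map="auto",
    load_in_8bit=True,
)
```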
# What does this PR do? Adds the [MSN](https://arxiv.org/abs/2204.07141) checkpoints for ViT. MSN shines in few-shot regimes, which would benefit real-world use cases. Later we could add a pre-training...
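A brief usage sketch for the new checkpoints; the `facebook/vit-msn-small` checkpoint name and the `ViTMSN*` class are assumptions based on this PR's description.

```python
from transformers import AutoFeatureExtractor, ViTMSNForImageClassification

# Load the MSN-pretrained backbone for fine-tuning on a (few-shot) classification task.
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/vit-msn-small")
model = ViTMSNForImageClassification.from_pretrained("facebook/vit-msn-small")
```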
- [ ] This PR fixes the `None` loss in the docstring for Wav2Vec2ForPreTraining