kairon
Bump transformers from 4.5.0 to 4.21.1
Bumps transformers from 4.5.0 to 4.21.1.
Release notes
Sourced from transformers's releases.
# v4.21.1: Patch release
Fix a regression in Trainer checkpoint loading: #18470
# v4.21.0: TF XLA text generation - Custom Pipelines - OwlViT, NLLB, MobileViT, Nezha, GroupViT, MVP, CodeGen, UL2
TensorFlow XLA Text Generation
The TensorFlow text generation method can now be wrapped with `tf.function` and compiled to XLA. You should be able to achieve up to 100x speedup this way. See our blog post and our benchmarks. You can also see XLA generation in action in our example notebooks, particularly for summarization and translation.

    import tensorflow as tf
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Main changes with respect to the original generate workflow: `tf.function` and `pad_to_multiple_of`
    xla_generate = tf.function(model.generate, jit_compile=True)
    tokenization_kwargs = {"pad_to_multiple_of": 32, "padding": True, "return_tensors": "tf"}

    # The first prompt will be slow (compiling), the others will be very fast!
    input_prompts = [
        f"translate English to {language}: I have four cats and three dogs."
        for language in ["German", "French", "Romanian"]
    ]
    for input_prompt in input_prompts:
        tokenized_inputs = tokenizer([input_prompt], **tokenization_kwargs)
        generated_text = xla_generate(**tokenized_inputs, max_new_tokens=32)
        print(tokenizer.decode(generated_text[0], skip_special_tokens=True))
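The role of `pad_to_multiple_of` in the snippet above can be illustrated without TensorFlow. The sketch below assumes that a function wrapped in `tf.function` is traced, and with `jit_compile=True` compiled, again whenever it sees a new input shape, so rounding prompt lengths up to a multiple of 32 reduces the number of distinct shapes. The helper `padded_length` and the sample lengths are hypothetical and are not part of the transformers API.

```python
def padded_length(num_tokens: int, multiple: int = 32) -> int:
    """Round a token count up to the next multiple (hypothetical helper)."""
    if num_tokens <= 0 or multiple <= 0:
        raise ValueError("num_tokens and multiple must be positive")
    return ((num_tokens + multiple - 1) // multiple) * multiple


# Hypothetical prompt lengths, in tokens.
prompt_lengths = [7, 12, 19, 31, 33, 40, 64, 65]

# Without padding, every distinct length is a distinct input shape.
unpadded_shapes = set(prompt_lengths)

# With padding to a multiple of 32, many lengths collapse into one shape.
padded_shapes = {padded_length(n) for n in prompt_lengths}

print(f"distinct shapes without padding: {len(unpadded_shapes)}")  # 8
print(f"distinct shapes with padding: {sorted(padded_shapes)}")    # [32, 64, 96]
```

Under that assumption, the eight sample prompts would need three compilations instead of eight. The multiple is a trade-off: a larger value means fewer shapes but more padding tokens per prompt.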
- Generate: deprecate default `max_length` by @gante in #18018
- TF: GPT-J compatible with XLA generation by @gante in #17986
- TF: T5 can now handle a padded past (i.e. XLA generation) by @gante in #17969
- TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible by @gante in #17857
- TF: generate without `tf.TensorArray` by @gante in #17801
- TF: BART compatible with XLA generation by @gante in #17479

New model additions
OwlViT
The OWL-ViT model (short for Vision Transformer for Open-World Localization) was proposed in Simple Open-Vocabulary Object Detection with Vision Transformers by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby. OWL-ViT is an open-vocabulary object detection network trained on a variety of (image, text) pairs. It can be used to query an image with one or multiple text queries to search for and detect target objects described in text.
- Add OWL-ViT model for zero-shot object detection by @alaradirik in #17938
- Fix OwlViT tests by @sgugger in #18253

NLLB
The NLLB model was presented in No Language Left Behind: Scaling Human-Centered Machine Translation by Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, and Jeff Wang. No Language Left Behind (NLLB) is a model capable of delivering high-quality translations directly between any pair of 200+ languages — including low-resource languages like Asturian, Luganda, Urdu and more.
... (truncated)
Commits
- f0d4968 Patch release: v4.21.1
- dea58d6 Fix load of model checkpoints in the Trainer (#18470)
- a9eee2f Release: v4.21.0
- 0daa202 Fix sacremoses sof dependency for Transofmers XL
- 31b3a12 sentencepiece shouldn't be required for the fast LayoutXLM tokenizer
- 3496ea8 Remove all uses of six (#18318)
- 9e564d0 fix loading from pretrained for sharded model with `torch_dtype="auto" (#18061)
- 36f9859 [EncoderDecoder] Improve docs (#18271)
- 3c45faa [DETR] Improve code examples (#18262)
- ee67e7a patch for smddp import (#18244)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)