chore: bump transformers from 4.37.2 to 4.40.2 in /presets/inference/text-generation
Bumps transformers from 4.37.2 to 4.40.2.
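For context, a minimal and purely hypothetical sanity check (not part of this PR) that the text-generation preset could run at startup to confirm the environment actually picked up the bumped release:

```python
# Hypothetical startup guard (not part of this PR): fail fast if the image still
# carries a transformers build older than the version this bump pins.
from packaging import version  # packaging ships as a transformers dependency

import transformers

MINIMUM = "4.40.2"
if version.parse(transformers.__version__) < version.parse(MINIMUM):
    raise RuntimeError(
        f"Expected transformers>={MINIMUM}, found {transformers.__version__}"
    )
```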
Release notes
Sourced from transformers's releases.
v4.40.2
Fix torch fx for Llama model. Thanks @michaelbenayoun!

v4.40.1: fix EosTokenCriteria for Llama 3 on mps
Kudos to @pcuenca for the prompt fix in:
- Make EosTokenCriteria compatible with mps #30376
This is to support EosTokenCriteria on MPS while pytorch adds this functionality.

v4.40.0: Llama 3, Idefics 2, Recurrent Gemma, Jamba, DBRX, OLMo, Qwen2MoE, Grounding Dino
New model additions
Llama 3
Llama 3 is supported in this release through the Llama 2 architecture and some fixes in the tokenizers library.
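Because it reuses the Llama 2 code path, Llama 3 loads through the usual auto classes. A minimal sketch, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint (the prompt text is illustrative only):

```python
# Minimal sketch (assumes access to the gated Llama 3 instruct checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

messages = [{"role": "user", "content": "Write one sentence about dependency updates."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Llama 3 instruct checkpoints stop on two tokens; passing both as eos_token_id
# is the pattern the EosTokenCriteria fixes above are concerned with.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]
output = model.generate(input_ids, max_new_tokens=64, eos_token_id=terminators)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```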
Idefics2
The Idefics2 model was created by the Hugging Face M4 team and authored by Léo Tronchon, Hugo Laurencon, Victor Sanh. The accompanying blog post can be found here.
Idefics2 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces text outputs. The model can answer questions about images, describe visual content, create stories grounded on multiple images, or simply behave as a pure language model without visual inputs. It improves upon IDEFICS-1, notably on document understanding, OCR, or visual reasoning. Idefics2 is lightweight (8 billion parameters) and treats images in their native aspect ratio and resolution, which allows for varying inference efficiency.
- Add Idefics2 by @amyeroberts in #30253
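A minimal usage sketch for the image-plus-text interface described above, assuming the HuggingFaceM4/idefics2-8b checkpoint and network access for the sample image (both are assumptions, not taken from this PR):

```python
# Minimal sketch of Idefics2's image + text -> text interface.
from transformers import AutoModelForVision2Seq, AutoProcessor
from transformers.image_utils import load_image

model_id = "HuggingFaceM4/idefics2-8b"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

# Sample image URL is illustrative; any local path or URL works with load_image.
image = load_image("http://images.cocodataset.org/val2017/000000039769.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
generated_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```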
Recurrent Gemma
Recurrent Gemma architecture. Taken from the original paper.
The Recurrent Gemma model was proposed in RecurrentGemma: Moving Past Transformers for Efficient Open Language Models by the Griffin, RLHF and Gemma Teams of Google.
The abstract from the paper is the following:
We introduce RecurrentGemma, an open language model which uses Google’s novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
- Add recurrent gemma by @ArthurZucker in #30143
Jamba
Jamba is a pretrained, mixture-of-experts (MoE) generative text model with 12B active parameters and 52B total parameters across all experts. It supports a 256K context length and can fit up to 140K tokens on a single 80GB GPU.
... (truncated)
Commits
- 4fdf58a v4.40.2
- 6530a98 Fix copies for DBRX - neuron fix (#30610)
- bb98e7c Fix for Neuron (#30259)
- 9fe3f58 v4.40.1
- f8fec6b Make EosTokenCriteria compatible with mps (#30376)
- 745bbfe Release: v4.40.0
- 5728b5a FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + rever...
- 005b957 Add DBRX Model (#29921)
- 63c5e27 Do not drop mask with SDPA for more cases (#30311)
- acab997 Revert "Re-enable SDPA's FA2 path (#30070)" (#30314)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)