shapiq
shapiq copied to clipboard
Bump transformers from 4.44.2 to 4.45.1
Bumps transformers from 4.44.2 to 4.45.1.
Release notes
Sourced from transformers's releases.
Patch Release v4.45.1
Patches for v4.45.1
- [MllamaProcessor] Update errors and API with multiple image (#33715) by
@ArthurZucker
- Generate: can_generate() recursive check (#33718) by
@gante
- clean_up_tokenization_spaces=False if unset (#31938) by
@itazap
Llama 3.2, mllama, Qwen2-Audio, Qwen2-VL, OLMoE, Llava Onevision, Pixtral, FalconMamba, Modular Transformers
New model additions
mllama
The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks.
- Add MLLama #33703, by
@qubvel
,@zucchini-nlp
,@ArthurZucker
Qwen2-VL
The Qwen2-VL is a major update from the previous Qwen-VL by the Qwen team.
An extract from the Qwen2-VL blogpost available here is as follows:
Qwen2-VL is the latest version of the vision language models based on Qwen2 in the Qwen model familities. Compared with Qwen-VL, Qwen2-VL has the capabilities of:
- SoTA understanding of images of various resolution & ratio: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc.
- Understanding videos of 20min+: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc.
- Agent that can operate your mobiles, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions.
- Multilingual Support: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
- support qwen2-vl by
@simonJJJ
in #32318Qwen2-Audio
The Qwen2-Audio is the new model series of large audio-language models from the Qwen team. Qwen2-Audio is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions.
They introduce two distinct audio interaction modes:
- voice chat: users can freely engage in voice interactions with Qwen2-Audio without text input
- audio analysis: users could provide audio and text instructions for analysis during the interaction
OLMoE
OLMoE is a series of Open Language Models using sparse Mixture-of-Experts designed to enable the science of language models. The team releases all code, checkpoints, logs, and details involved in training these models.
... (truncated)
Commits
e71a01a
manually fix PLBart tokenizer0317895
v4.45.14ea1c43
clean_up_tokenization_spaces=False if unset (#31938)289edd9
Generate:can_generate()
recursive check (#33718)c64be31
[MllamaProcessor
] Update errors and API with multiple image (#33715)2ef31de
Release: v4.45.019d58d3
Add MLLama (#33703)94f18cf
Add OmDet-Turbo (#31843)ade9e0f
Corrected max number for bf16 in transformer/docs (#33658)196d35c
Add AdEMAMix optimizer (#33682)- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase
.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
-
@dependabot rebase
will rebase this PR -
@dependabot recreate
will recreate this PR, overwriting any edits that have been made to it -
@dependabot merge
will merge this PR after your CI passes on it -
@dependabot squash and merge
will squash and merge this PR after your CI passes on it -
@dependabot cancel merge
will cancel a previously requested merge and block automerging -
@dependabot reopen
will reopen this PR if it is closed -
@dependabot close
will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually -
@dependabot show <dependency name> ignore conditions
will show all of the ignore conditions of the specified dependency -
@dependabot ignore this major version
will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) -
@dependabot ignore this minor version
will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) -
@dependabot ignore this dependency
will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)