LocalAI
chore(deps): bump transformers from 4.48.3 to 4.57.2 in /backend/python/coqui
Bumps transformers from 4.48.3 to 4.57.2.
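The range covered by this bump can be sanity-checked with a minimal, stdlib-only version comparison (a sketch only; real resolvers such as pip use `packaging.version`, which also handles pre-release and post-release tags):

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Split a plain X.Y.Z version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def satisfies_bump(installed: str, required: str = "4.57.2") -> bool:
    """True if the installed version is at or beyond the bumped pin."""
    return parse_version(installed) >= parse_version(required)

print(satisfies_bump("4.48.3"))  # False: the old pin predates the fixes below
print(satisfies_bump("4.57.2"))  # True: the new pin
```

Comparing tuples of ints rather than raw strings matters here: lexicographic string comparison would wrongly rank "4.57.10" below "4.57.2".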
Release notes
Sourced from transformers's releases.
Patch Release v4.57.2
This patch most notably fixes an issue on some Mistral tokenizers. It contains the following commits:
- Add AutoTokenizer mapping for mistral3 and ministral (#42198)
- Auto convert tekken.json (#42299)
- fix tekken pattern matching (#42363)
- Check model inputs - hidden states (#40994)
- Remove invalid @staticmethod from module-level get_device_and_memory_breakdown (#41747)
Patch release v4.57.1
This patch most notably fixes an issue with an optional dependency (optax), which resulted in parsing errors with poetry. It contains the following fixes:
- fix optax dep issue
- remove offload_state_dict from kwargs
- Fix bnb fsdp loading for pre-quantized checkpoint (#41415)
- Fix tests fsdp (#41422)
- Fix trainer for py3.9 (#41359)
v4.57.0: Qwen3-Next, Vault Gemma, Qwen3 VL, LongCat Flash, Flex OLMO, LFM2 VL, BLT, Qwen3 OMNI MoE, Parakeet, EdgeTAM, OLMO3
New model additions
Qwen3 Next
The Qwen3-Next series represents the Qwen team's next-generation foundation models, optimized for extreme context length and large-scale parameter efficiency. The series introduces a suite of architectural innovations designed to maximize performance while minimizing computational cost:
- Hybrid Attention: Replaces standard attention with the combination of Gated DeltaNet and Gated Attention, enabling efficient context modeling.
- High-Sparsity MoE: Achieves an extremely low activation ratio of 1:50 in MoE layers, drastically reducing FLOPs per token while preserving model capacity.
- Multi-Token Prediction (MTP): Boosts pretraining performance and accelerates inference.
- Other Optimizations: Includes techniques such as zero-centered and weight-decayed layernorm, Gated Attention, and other stabilizing enhancements for robust training.
Built on this architecture, the Qwen team trained and open-sourced Qwen3-Next-80B-A3B (80B total parameters, only 3B active), achieving extreme sparsity and efficiency.
Despite its ultra-efficiency, it outperforms Qwen3-32B on downstream tasks — while requiring less than 1/10 of the training cost. Moreover, it delivers over 10x higher inference throughput than Qwen3-32B when handling contexts longer than 32K tokens.
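The sparsity figures quoted above can be made concrete with a little arithmetic (a sketch; note the 1:50 ratio refers to activation inside the MoE layers, while the 3B/80B split is the model-level active-parameter count):

```python
total_params_b = 80.0   # Qwen3-Next-80B-A3B: total parameters, in billions
active_params_b = 3.0   # parameters active per token, in billions

# Model-level active fraction: only ~3.75% of weights participate per token.
model_level_fraction = active_params_b / total_params_b
print(f"model-level active fraction: {model_level_fraction:.4f}")

# Per-layer MoE activation ratio as quoted in the release notes (1:50 = 2%).
moe_activation_ratio = 1 / 50
print(f"MoE-layer activation ratio: {moe_activation_ratio:.2f}")
```

The gap between the two numbers comes from the non-MoE components (attention, embeddings, shared layers), which are always active.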
For more details, please visit their blog Qwen3-Next (blog post).
- Adding Support for Qwen3-Next by @bozheng-hit in #40771
Vault Gemma
VaultGemma is a text-only decoder model derived from Gemma 2. Notably, it drops the norms after the Attention and MLP blocks and uses full attention for all layers instead of alternating between full attention and local sliding attention. VaultGemma is available as a pretrained model with 1B parameters and a 1024-token sequence length.
VaultGemma was trained from scratch with sequence-level differential privacy (DP). Its training data includes the same mixture as the Gemma 2 models, consisting of a number of documents of varying lengths. Additionally, it is trained using DP stochastic gradient descent (DP-SGD) and provides a (ε ≤ 2.0, δ ≤ 1.1e-10)-sequence-level DP guarantee, where a sequence consists of 1024 consecutive tokens extracted from heterogeneous data sources. Specifically, the privacy unit of the guarantee is for the sequences after sampling and packing of the mixture.
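DP-SGD, mentioned above, differs from plain SGD in two steps: clip each per-example gradient to a fixed norm, then add Gaussian noise to the clipped sum before averaging. A minimal stdlib-only sketch of one such step (illustrative only: the parameter names clip_norm and noise_multiplier are our own, and real implementations operate on tensors and track the (ε, δ) budget with a privacy accountant):

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """One DP-SGD aggregation step: clip each gradient, sum, add noise, average."""
    rng = random.Random(seed)
    clipped = []
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    summed = [sum(col) for col in zip(*clipped)]
    n = len(per_example_grads)
    # Gaussian noise calibrated to the clipping bound is what yields the DP guarantee.
    return [(s + rng.gauss(0.0, noise_multiplier * clip_norm)) / n for s in summed]

# With noise disabled, the step reduces to the mean of the clipped gradients.
print(dp_sgd_step([[3.0, 4.0]], clip_norm=1.0, noise_multiplier=0.0))
```

Clipping bounds the influence any single training sequence can have on the update, which is why the guarantee can be stated at the sequence level, as in VaultGemma's 1024-token privacy unit.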
- add: differential privacy research model by @RyanMullins in #40851
... (truncated)
Commits
- 2915fb3 Release v4.57.2
- 2a59904 fix tekken pattern matching (#42363)
- 7e66db7 Auto convert tekken.json (#42299)
- 311807f Remove invalid @staticmethod from module-level get_device_and_memory_breakd...
- 804038f Add AutoTokenizer mapping for mistral3 and ministral (#42198)
- ede92a8 Check model inputs - hidden states (#40994)
- 8cb5963 Release: v4.57.1
- c6ae19e Fix trainer for py3.9 (#41359)
- e0c6038 Fix tests fsdp (#41422)
- 2fbd25c Fix bnb fsdp loading for pre-quantized checkpoint (#41415)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Deploy Preview for localai ready!
| Name | Link |
|---|---|
| Latest commit | 3d32197650ad7a2721a018eb40dafd80e474968a |
| Latest deploy log | https://app.netlify.com/projects/localai/deploys/6924aa19f413cf0008da726b |
| Deploy Preview | https://deploy-preview-7349--localai.netlify.app |
To edit notification comments on pull requests, go to your Netlify project configuration.
OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.
If you change your mind, just re-open this PR and I'll resolve any conflicts on it.