transformers issues

LlamaAttention forward function type hint is incorrect #38739

6

Hi, this PR fixes a small issue in the LlamaAttention class. The return type in the forward method currently shows three values, but the function actually returns only two. This...

ArkVex
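
For readers unfamiliar with this kind of mismatch, here is a minimal sketch of a return annotation that declares three values while the function body returns two. `forward_with_stale_hint` and `declared_tuple_length` are hypothetical names made up for this illustration. This is not the actual `LlamaAttention.forward` code, only a toy function with the same shape of problem.

```python
from typing import Optional, Tuple, get_args, get_type_hints


def forward_with_stale_hint(x: float) -> Tuple[float, Optional[float], Optional[float]]:
    # Hypothetical stand-in: the annotation promises three values,
    # but only two are actually returned.
    return x, None


def declared_tuple_length(fn) -> int:
    # Count the element types listed in the function's return annotation.
    return len(get_args(get_type_hints(fn)["return"]))


declared = declared_tuple_length(forward_with_stale_hint)
actual = len(forward_with_stale_hint(1.0))
print(f"declared={declared}, actual={actual}")
```

Python does not enforce return annotations at runtime, so a stale hint like this only surfaces in static type checkers, IDE tooltips, and generated docs.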

Add Bagel

2

Fixes #38267. Just a draft PR for now to build upon, iterate on, and discuss during the integration.

yaswanth19

GLM-4.1V Model support

1. This PR aims to support the use of the GLM-4-0414 model for training video understanding and image understanding models GLM-4.1V 2. This PR has completed the refactoring of the...

zRzRzRzRzRzRzR

Can not reproduce Blip2ForImageTextRetrieval example from docs, getting different results

2

### System Info - `transformers` version: 4.52.4 - Platform: Linux-4.4.0-x86_64-with-glibc2.36 - Python version: 3.12.6 - Huggingface_hub version: 0.32.3 - Safetensors version: 0.5.3 - Accelerate version: not installed - Accelerate config:...

KarlisJ

bug

Fix redundant code in Janus

1

Just fixes redundant code in the modular file, which has no effect on the modeling file; the return statements also felt a bit odd :sweat_smile:.

yaswanth19

ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'

2

### System Info Package Version Editable project location ------------------------- -------------- ------------------------- accelerate 1.7.0 aiohappyeyeballs 2.4.4 aiohttp 3.11.9 aiosignal 1.3.1 altair 5.5.0 annotated-types 0.7.0 anyio 4.6.2.post1 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow...

qsuzer

bug

Does Qwen_2_5_VL support variable length attention computation?

8

### Feature request Qwen_2_5_VL support variable length attention computation ### Motivation Hello, I try to run qwen25_vl with packing samples, however, I found that it seems this function only passes...

yingtongxiong

Feature request
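
As background for the packing question above, the following is a rough sketch of the bookkeeping that variable-length attention over packed samples usually relies on: cumulative sequence boundaries (often called `cu_seqlens` in variable-length attention kernels) and the block-diagonal mask they imply. The helper names `cumulative_seq_lens` and `same_sample_mask` are assumptions for illustration; this is plain Python and does not reflect how Qwen2.5-VL handles packed inputs.

```python
from itertools import accumulate
from typing import List


def cumulative_seq_lens(lengths: List[int]) -> List[int]:
    # Boundaries of each sample inside the packed sequence, starting at 0.
    return [0] + list(accumulate(lengths))


def same_sample_mask(lengths: List[int]) -> List[List[bool]]:
    # True where query position q and key position k belong to the same
    # packed sample, i.e. a block-diagonal attention mask.
    bounds = cumulative_seq_lens(lengths)
    total = bounds[-1]
    sample_id = [0] * total
    for idx, (start, end) in enumerate(zip(bounds, bounds[1:])):
        for pos in range(start, end):
            sample_id[pos] = idx
    return [[sample_id[q] == sample_id[k] for k in range(total)] for q in range(total)]


bounds = cumulative_seq_lens([3, 2, 4])
mask = same_sample_mask([2, 1])
```

Without boundaries like these, tokens from one packed sample can attend to tokens from another, which is the behavior a packing-aware attention path is meant to prevent.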

Updated aya_vision.md

# What does this PR do? As a part of this - https://github.com/huggingface/transformers/issues/36979 ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss...

1himan

Add Dia model

1

# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

buttercrab

Exception while inference Qwen2VL and Qwen2VL, assert module.weight.shape[1] == 1

3

### System Info transformers version: 4.52.3 Platform: Linux-5.10.0-1029-oem-x86_64-with-glibc2.31 GPU device: Quadro RTX 8000 Python version: 3.10 Huggingface_hub version: 0.32.2 Safetensors version: 0.5.3 Accelerate version: 0.34.2 PyTorch version (GPU?): 2.5.0+cu124 Using...

iglaweb

bug
