diffusers issues

Add Newbie Image support

# What does this PR do? Adds NewbieAI support to Diffusers. Adds `pooled_projection_dim` config to Lumina2Transformer2DModel and uses pooled projections from Newbie codebase if it is set to something other...

Disty0

Add NewbiePipeline and NextDiT_3B_GQA_patch2_Adaln_Refiner_WHIT_CLIP transformer

3

This PR introduces a new text-to-image pipeline named **NewbiePipeline**, as well as a new NextDiT-based transformer architecture, **NextDiT_3B_GQA_patch2_Adaln_Refiner_WHIT_CLIP**, fully implemented following Diffusers' pipeline and model design principles. ### 🚀 Main...

E-Anlia

[feat] LongSANA: a minute-length real-time video generation model

6

This PR supports LongSANA: a minute-length real-time video generation model ## Related links: project: https://nvlabs.github.io/Sana/Video code: https://github.com/NVlabs/Sana paper: https://arxiv.org/pdf/2509.24695 ## PR feature: LongSANA uses Causal Linear Attention KV Cache during...

lawrence-cj

`from_pipe` converts pipelines to float32 by default

10

### Describe the bug Pipelines passed to `from_pipe()` are converted to float32 unless `torch_dtype` is specified, leading to higher memory usage and slower inference. ### Reproduction ```python import torch from...

missionfloyd

bug

Add Wan2.2-S2V: Audio-Driven Cinematic Video Generation

28

This PR is fixing #12257. Comparison with the original repo When I put `with torch.amp.autocast('cuda', dtype=torch.bfloat16):` onto the transformer only and converted the initial noise's `dtype` into `torch.float32` from `torch.bfloat16`...

tolgacangoz

Fix Context Parallelism doc

1

# What does this PR do? Fix error in Context Parallelism doc ## Before submitting - [x] This PR fixes a typo or improves the docs (you can dismiss the...

a120092009

Fix meta tensor error with bitsandbytes quantization and device_map

## What does this PR do? Fixes #12719 This PR fixes a critical issue where using bitsandbytes quantization with `device_map='balanced'` (or other device_map strategies) on transformers models within diffusers pipelines...

arrdel

Complex numbers in Qwen/Z-Image Image pipeline incompatible with torch.compile

10

### Describe the bug Note: This might be something for the MVP program https://github.com/huggingface/diffusers/issues/12635 if there's anyone who already has a deep understanding of rotary embeddings and complex numbers. I...

dxqb

bug

LTX Video 0.9.8 long multi prompt

5

PR: Add LTXI2VLongMultiPromptPipeline (ComfyUI-parity long I2V with multi-prompt sliding windows) What does this PR do? - Introduces a new pipeline LTXI2VLongMultiPromptPipeline providing long-duration image-to-video generation using temporal sliding windows with...

yaoqih

diffusers-mvp

FlowMatch schedulers - closing the gap

11

As stated [here](https://github.com/huggingface/diffusers/issues/9490#issuecomment-2369756363), let's close the scheduler gap! Problem statement: - Most (if not all quality ones) new models are DiT based - Implementation of DiT based models in Diffusers...

vladmandic

wip

scheduler
