diffusers Add NewbiePipeline and NextDiT_3B_GQA_patch2_Adaln_Refiner_WHIT_CLIP transformer

This PR introduces a new text-to-image pipeline named NewbiePipeline, as well as a new NextDiT-based transformer architecture, NextDiT_3B_GQA_patch2_Adaln_Refiner_WHIT_CLIP, fully implemented following Diffusers' pipeline and model design principles.

🚀 Main additions

• New pipeline: adds NewbiePipeline under src/diffusers/pipelines/newbie/.
The pipeline follows the standard Diffusers structure (it subclasses DiffusionPipeline) and supports loading via from_pretrained.
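
A minimal sketch of what loading could look like with this PR installed, assuming the checkpoint linked later in this thread (NewBie-AI/NewBie-image-Exp0.1) is published in Diffusers layout. The helper name load_newbie_pipeline, the bfloat16 dtype and the "cuda" default are illustrative assumptions, not part of the PR.

```python
def load_newbie_pipeline(repo_id="NewBie-AI/NewBie-image-Exp0.1", device="cuda"):
    """Hypothetical helper: load the pipeline through the generic Diffusers entry point."""
    # Imports are deferred so that defining the helper does not require torch/diffusers.
    import torch
    from diffusers import DiffusionPipeline

    # DiffusionPipeline.from_pretrained reads model_index.json and resolves the
    # pipeline class named there; with this PR installed that would be
    # NewbiePipeline (assumption: the repo ships weights in Diffusers format).
    pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    return pipe.to(device)
```

Calling the returned pipeline would presumably follow the usual text-to-image pattern, but the exact call arguments are defined by the PR and are not shown here.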

• New transformer architecture: adds transformer_newbie.py, implementing:

NextDiT backbone with grouped-query attention (GQA)
Adaln-Refiner blocks
Patch-size 2 vision encoder
36 transformer layers
2304 hidden dims
WHIT CLIP–style text conditioning

The transformer inherits from ModelMixin, enabling standard save/load, weight serialization and integration with Diffusers utilities.
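
To make two items from the architecture list concrete, here is a small dependency-free sketch of how grouped-query attention shares key/value heads among query heads, and how a patch size of 2 determines the token count. The head counts (24 query heads, 8 KV heads) and the 128x128 latent size are hypothetical values chosen for illustration; the PR description states only the hidden size (2304) and the patch size (2).

```python
def kv_head_for_query_head(q_head, n_heads, n_kv_heads):
    """Return the index of the KV head shared by query head `q_head` under GQA."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be a multiple of n_kv_heads")
    group_size = n_heads // n_kv_heads  # query heads per shared KV head
    return q_head // group_size


def num_patch_tokens(height, width, patch_size=2):
    """Number of tokens after splitting an H x W latent into square patches."""
    if height % patch_size or width % patch_size:
        raise ValueError("latent size must be divisible by patch_size")
    return (height // patch_size) * (width // patch_size)


HIDDEN_DIM = 2304            # from the PR description
N_HEADS, N_KV_HEADS = 24, 8  # hypothetical, not stated in the PR

head_dim = HIDDEN_DIM // N_HEADS  # per-head width under the assumed head count
mapping = [kv_head_for_query_head(h, N_HEADS, N_KV_HEADS) for h in range(N_HEADS)]
tokens = num_patch_tokens(128, 128)  # hypothetical 128x128 latent
```

With these assumed values, every three consecutive query heads read from the same KV head, which is what reduces the size of the key/value projections relative to full multi-head attention.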

• RMSNorm implementation: adds RMSNorm to diffusers.models.components, using a pure PyTorch implementation by default and Apex's fused RMSNorm when Apex is installed.
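
For reference, RMSNorm scales each vector by the reciprocal of its root mean square and then applies a learned per-channel weight. The sketch below shows that arithmetic in plain Python so that it runs without PyTorch or Apex; it is not the code added by the PR, and the function name rms_norm and the eps default are assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm over one vector: x / sqrt(mean(x^2) + eps) * weight."""
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit weights and eps=0 the output has a root mean square of exactly 1.
normalized = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
```

A PyTorch version would compute the same quantity along the last dimension, with the fused Apex kernel substituted only when it imports successfully.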

• Scheduler compatibility: the pipeline works with FlowMatchEulerDiscreteScheduler and needs no custom scheduler code.
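
The update that a flow-matching Euler scheduler applies at each step is a plain Euler move along the predicted velocity: sample + (sigma_next - sigma) * model_output. The following is a simplified scalar illustration of that rule, not the scheduler's actual code; the function names, the linear sigma schedule and the toy velocity model are hypothetical.

```python
def euler_flow_step(sample, velocity, sigma, sigma_next):
    """One Euler step from noise level `sigma` to `sigma_next`."""
    return sample + (sigma_next - sigma) * velocity


def sample_with_euler(noise, velocity_fn, num_steps):
    """Integrate from sigma=1 (pure noise) down to sigma=0 (data)."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = noise
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = euler_flow_step(x, velocity_fn(x, sigma), sigma, sigma_next)
    return x


# Toy check: on the straight path x_sigma = (1 - sigma) * data + sigma * noise
# the exact velocity is (noise - data), so Euler integration recovers the data.
data, noise = 2.0, 5.0
result = sample_with_euler(noise, lambda x, sigma: noise - data, num_steps=4)
```

In the real pipeline the velocity would come from the transformer's prediction at each step, which is why no model-specific scheduler logic is needed.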

🧩 Motivation

This PR provides an implementation of a modern NextDiT-style text-to-image architecture with high-resolution capability and strong conditioning support.
The goal is to enable researchers and users to load, run, and fine-tune this model directly through Diffusers with minimal friction.

📁 Files added

src/diffusers/models/components.py
src/diffusers/models/transformers/transformer_newbie.py
src/diffusers/pipelines/newbie/pipeline_newbie.py
src/diffusers/pipelines/newbie/__init__.py


📁 Files modified

src/diffusers/__init__.py
src/diffusers/models/__init__.py
src/diffusers/models/transformers/__init__.py
src/diffusers/pipelines/__init__.py


✔ Notes

No external dependencies required
Apex is optional; PyTorch RMSNorm is the default path
The pipeline has been tested locally with from_pretrained and produces expected outputs
Follows the established structure of Diffusers pipelines & transformer modules

Fixes # (no issue linked)

Before submitting

[x] I have read the contributor guidelines
[x] This PR introduces a new pipeline and model
[x] All necessary registration points are updated
[x] The implementation is consistent with existing Diffusers conventions

Who can review?

Tagging pipeline & transformer reviewers:
@asomoza @yiyixuxu @sayakpaul

Dec 04 '25 05:12 E-Anlia

Can you link the original codebase, paper, and some results of this model?

Dec 04 '25 11:12 sayakpaul

https://huggingface.co/NewBie-AI/NewBie-image-Exp0.1
https://github.com/NewBieAI-Lab/NewBie-image-Exp0.1 (NewBie_image_Exp0 1_Training)
This model builds on improvements to the Lumina line of research and is based on NextDiT. Example:

Dec 05 '25 03:12 E-Anlia

Thanks for your work!

The PR https://github.com/huggingface/diffusers/pull/12803 is in a better place to be merged. Could you try to collaborate on that PR, instead?

Dec 08 '25 04:12 sayakpaul
