Deepstack Qwen3 Omni

Open eitanporat opened this issue 1 month ago • 0 comments

Description

Add DeepStack support, following this paper: https://arxiv.org/abs/2406.04334. The idea is to inject intermediate representations from the vision encoder into the intermediate layers of the LLM.
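For reviewers unfamiliar with the technique, the injection described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in this PR. The function names (`inject_visual_features`, `run_decoder_with_deepstack`), the choice to add features after each of the first few decoder layers, and the plain list-of-lists tensors are all assumptions made for the example.

```python
from typing import Callable, List

Vector = List[float]
Hidden = List[Vector]  # one vector per token position
Layer = Callable[[Hidden], Hidden]


def inject_visual_features(
    hidden: Hidden, visual_positions: List[int], features: List[Vector]
) -> Hidden:
    """Add one level of vision-encoder features at the visual token positions."""
    if len(visual_positions) != len(features):
        raise ValueError("need exactly one feature vector per visual token")
    out = [list(row) for row in hidden]  # copy so the caller's input is not mutated
    for pos, feat in zip(visual_positions, features):
        out[pos] = [h + f for h, f in zip(out[pos], feat)]
    return out


def run_decoder_with_deepstack(
    hidden: Hidden,
    layers: List[Layer],
    visual_positions: List[int],
    deepstack_features: List[List[Vector]],
) -> Hidden:
    """Run the decoder layers; after layer i, inject deepstack_features[i] if present.

    Assumption: level i of the vision encoder is paired with decoder layer i,
    and layers beyond the last level run without any injection.
    """
    for i, layer in enumerate(layers):
        hidden = layer(hidden)
        if i < len(deepstack_features):
            hidden = inject_visual_features(
                hidden, visual_positions, deepstack_features[i]
            )
    return hidden


# Toy usage: 4 tokens of width 2, tokens 1 and 2 are visual, identity "layers".
hidden = [[0.0, 0.0] for _ in range(4)]
layers = [lambda h: h] * 3
features = [
    [[1.0, 1.0], [2.0, 2.0]],      # level 0, injected after layer 0
    [[10.0, 10.0], [20.0, 20.0]],  # level 1, injected after layer 1
]
out = run_decoder_with_deepstack(hidden, layers, [1, 2], features)
print(out)  # [[0.0, 0.0], [11.0, 11.0], [22.0, 22.0], [0.0, 0.0]]
```

Only the visual token positions change, and each position accumulates one contribution per injected level. Text positions pass through untouched.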

Tests

I inspected the outputs at various steps and verified that the injection happens correctly.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • [X] I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • [X] I have necessary comments in my code, particularly in hard-to-understand areas.
  • [X] I have run end-to-end tests and provided workload links above if applicable.
  • [X] I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

eitanporat, Nov 20 '25 17:11