
Fix for when draft model hidden dimension is different from target model hidden dimension

Open · yilian49 opened this issue 4 months ago · 2 comments

Motivation

The current draft model files (e.g. llama3_eagle.py) support a draft model config with a target_hidden_dimension separate from hidden_dimension. However, the training scripts load the target model's embedding function by default, which doesn't work when the target and draft models have different hidden dimensions.
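To make the mismatch concrete, here is a small illustration. The hidden sizes (4096 vs 2048), vocabulary size, and the `draft_fc` layer are hypothetical, chosen only to show the EAGLE-style pattern of feeding a concatenation of embeddings and hidden states into a draft projection sized for the draft's own hidden dimension:

```python
# Illustration of the shape mismatch; all sizes and layer names here are
# made up for the example, not SpecForge's actual configuration.
import torch
import torch.nn as nn

target_hidden, draft_hidden = 4096, 2048

# Embedding loaded from the target model maps tokens to target_hidden.
target_embed = nn.Embedding(32000, target_hidden)
# A draft layer sized for the draft's own hidden dimension (EAGLE-style
# fc over concatenated [embedding, hidden_state]).
draft_fc = nn.Linear(2 * draft_hidden, draft_hidden)

tokens = torch.tensor([[1, 2, 3]])
embeds = target_embed(tokens)                  # shape (1, 3, 4096)
hidden_state = torch.zeros(1, 3, target_hidden)

try:
    # Concatenated input has 8192 features, but draft_fc expects 4096.
    draft_fc(torch.cat([embeds, hidden_state], dim=-1))
except RuntimeError as err:
    print("shape mismatch:", err)
```

With matching hidden dimensions the concatenation would have exactly `2 * draft_hidden` features and the projection would succeed; the mismatch only appears when the target embedding's output size differs from the draft's hidden size.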

Modifications

Add a check for whether the draft config contains a target_hidden_dimension and whether the two hidden dimensions differ. If the embedding function is not loaded from the target model, don't freeze it.
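A minimal sketch of that check, assuming a helper along these lines in the training setup. The names `setup_embeddings`, `hidden_dimension`, `target_hidden_dimension`, and `embed_tokens` are illustrative; SpecForge's actual config and attribute names may differ:

```python
# Hypothetical sketch of the proposed check; not SpecForge's actual API.
import torch.nn as nn


def setup_embeddings(draft_model, target_model, draft_config):
    target_hidden = getattr(draft_config, "target_hidden_dimension", None)
    same_dim = (
        target_hidden is None
        or target_hidden == draft_config.hidden_dimension
    )
    if same_dim:
        # Dimensions match: reuse the target model's embedding and freeze it.
        draft_model.embed_tokens = target_model.embed_tokens
        draft_model.embed_tokens.weight.requires_grad_(False)
    else:
        # Dimensions differ: keep the draft model's own embedding, and leave
        # it trainable since it is not loaded from the target model.
        draft_model.embed_tokens.weight.requires_grad_(True)
    return draft_model
```

The key design point is that the two decisions are coupled: the embedding is only frozen in the branch where it is shared with the target model, matching the issue's request.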

— yilian49, Aug 26 '25

This means the code in the inference engine (e.g. SGLang) also needs to change, right?

— zyksir, Sep 09 '25

Hi, I wonder whether, during inference, an EAGLE head with a different hidden dim can be run against a base model with a different hidden dim, or if they are fundamentally incompatible.

@yilian49

— b8zhong, Oct 08 '25