Request for Training Code and Feature Fusion Details in EAGLE-3
Hi, I have a few questions and requests related to EAGLE-3 training and the feature fusion implementation.
- Could you share the data generation code used to build the training dataset for EAGLE-3?
- Regarding feature fusion, the paper mentions using low-, mid-, and high-level hidden states from the decoder (see also the fusion sketch at the end of this post).
  - I assume the high-level feature refers to the final decoder layer (right before the LM head).
  - Could you clarify which specific decoder layers are used for the low- and mid-level features?
- It seems the current `train/main.py` does not include the updated loss function for EAGLE-3. Would it be possible to share the full training script or `main.py` used for training EAGLE-3?
Thanks,
```python
# Collect hidden states at three depths (low / mid / high) for feature fusion
for idx, decoder_layer in enumerate(self.layers):
    if idx == len(self.layers) - 3 or idx == len(self.layers) // 2 or idx == 2:
        all_hidden_states += (hidden_states,)
```

`EAGLE/eagle/model/modeling_llama_kv.py`, line 1138
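
In case it is useful for the discussion, here is a minimal sketch of how the three collected hidden states could be fused before being passed to the draft model: concatenation along the feature dimension followed by a learned linear projection back to the hidden size. The `FeatureFusion` name, the bias-free projection, and the shapes are my assumptions, not the actual EAGLE-3 implementation.

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Hypothetical fusion of low/mid/high decoder hidden states (assumed design)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Project the concatenated [low; mid; high] features back to hidden_size.
        self.proj = nn.Linear(3 * hidden_size, hidden_size, bias=False)

    def forward(self, low: torch.Tensor, mid: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Each input: (batch, seq_len, hidden_size)
        fused = torch.cat([low, mid, high], dim=-1)  # (batch, seq_len, 3 * hidden_size)
        return self.proj(fused)                      # (batch, seq_len, hidden_size)

if __name__ == "__main__":
    batch, seq_len, hidden = 2, 8, 4096
    low, mid, high = (torch.randn(batch, seq_len, hidden) for _ in range(3))
    fusion = FeatureFusion(hidden)
    print(fusion(low, mid, high).shape)  # torch.Size([2, 8, 4096])
```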