hongyinrong
The Vicuna-13B-v1.3 model has a smaller vocabulary, so its head overhead is relatively small. Models like LLaMA3.1-Instruct 8B, LLaMA3.3-Instruct 70B, and DeepSeek-R1-Distill-LLaMA 8B have larger vocabularies, resulting in greater head overhead.
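For a rough sense of scale, the LM-head cost grows with `hidden_size × vocab_size`. The sketch below uses the commonly published vocabulary sizes and hidden dimensions for these models (assumed standard config values, not figures taken from this thread):

```python
# Back-of-the-envelope estimate of LM-head cost per decoded token.
# The head matmul is roughly hidden_size x vocab_size multiply-adds per token.
# Configs are the standard published values (an assumption, not numbers
# from this discussion).
configs = {
    "Vicuna-13B-v1.3":              {"hidden": 5120, "vocab": 32_000},
    "LLaMA3.1-Instruct-8B":         {"hidden": 4096, "vocab": 128_256},
    "LLaMA3.3-Instruct-70B":        {"hidden": 8192, "vocab": 128_256},
    "DeepSeek-R1-Distill-LLaMA-8B": {"hidden": 4096, "vocab": 128_256},
}

for name, c in configs.items():
    head_params = c["hidden"] * c["vocab"]        # weight matrix entries
    flops_per_token = 2 * head_params             # one multiply + one add each
    print(f"{name:30s} head params = {head_params / 1e6:7.1f}M, "
          f"~{flops_per_token / 1e9:.2f} GFLOPs/token")
```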
```python
for idx, decoder_layer in enumerate(self.layers):
    if idx == len(self.layers) - 3 or idx == len(self.layers) // 2 or idx == 2:
        all_hidden_states += (hidden_states,)
```

(EAGLE/eagle/model/modeling_llama_kv.py, line 1138)
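In other words, this loop keeps the hidden states of a low, a middle, and a near-final decoder layer as the features fed to the EAGLE-3 draft head. A minimal standalone sketch of the same index selection (the helper name `select_feature_layers` is hypothetical, not part of the EAGLE codebase):

```python
def select_feature_layers(num_layers: int) -> set:
    """Return the decoder-layer indices whose hidden states are kept:
    a low layer, the middle layer, and a near-final layer."""
    return {2, num_layers // 2, num_layers - 3}

# Example: a 32-layer model (e.g. an 8B LLaMA) keeps layers {2, 16, 29}.
print(select_feature_layers(32))
```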
Theoretically, EAGLE-3 can accomplish all of these tasks, but its acceleration performance might not be optimal. We have conducted further work based on EAGLE-3. By employing mathematical modeling and considering...
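For reference, the usual starting point for this kind of modeling is the standard speculative-decoding speedup estimate (Leviathan et al., 2023). The sketch below is only that generic estimate, not necessarily the modeling used in the follow-up work mentioned above:

```python
# Generic speculative-decoding speedup model (Leviathan et al., 2023).
# alpha = per-token acceptance rate, gamma = draft length,
# c = cost of one draft step relative to one target-model step.
def expected_accepted(alpha: float, gamma: int) -> float:
    """Expected number of target tokens produced per verification step."""
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)

def speedup(alpha: float, gamma: int, c: float) -> float:
    """Walltime speedup over plain autoregressive decoding."""
    return expected_accepted(alpha, gamma) / (gamma * c + 1)

# Example: 80% acceptance, 5 draft tokens, draft step costing 5% of a target step.
print(round(speedup(alpha=0.8, gamma=5, c=0.05), 2))
```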
> Thank you for the response. If I were to experiment with the setups I mentioned, which method of doing so would you recommend?
>
> Also, the work you...