GUO-QING JIANG

Results 6 issues of GUO-QING JIANG

Maybe using pandas for the CSV tutorial would be better?

**When we tried to run the Qs in the appendix of the llama paper, we found that it was just repeating... Does anything need to be adjusted? top_p? temperature?** **Q1: The sun...
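For readers unfamiliar with the two knobs mentioned in this question, here is a minimal sketch of how temperature and top_p (nucleus) filtering shape the choice of the next token. It is plain Python over a toy list of logits; the function name `sample_next_token` and its defaults are hypothetical and are not taken from the llama code or any library.

```python
import math
import random


def sample_next_token(logits, temperature=0.8, top_p=0.95, rng=random):
    """Pick one token id from raw logits (hypothetical helper, not a library API)."""
    # A non-positive temperature is treated as greedy decoding (argmax).
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus filtering: keep the most likely tokens until their mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

With a very low temperature or a very small top_p, this collapses to always picking the most likely token, which is one setting in which repeated output is commonly observed. Whether that explains the behaviour reported here is not something the preview confirms.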

Small LLMs trained in FP8 on 32 GPUs can achieve a 20~30% speedup compared with bf16. However, scaling up to 1000+ GPUs achieves less than a 5% speedup (TP2...

Does the Medusa1 model generate token-wise the same output as the base model without the Medusa heads? I found that changing the medusa choices changes the output.

I checked TRT-LLM but found something confusing. Some features are not supported: 1. inference batch size == 1 (seems to have been solved recently) 2. in-flight batching is not supported, which will...

This setup cannot pass UT. Could you please check it?