Vincenzo di Cicco

5 comments of Vincenzo di Cicco

Sure, here are the steps to reproduce on a fresh lit-gpt clone:

```bash
# download & convert model
python scripts/download.py --repo_id TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T/
# prepare (tiny) train...
```

Disabling `use_custom_all_reduce` solved the issue, many thanks! Does this mean there is a bug in the custom all_reduce plugin, and if so, do you think it will be fixed? Furthermore,...
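For anyone who lands here later, this is roughly the rebuild that worked for me; a minimal sketch assuming the 0.10/0.11-era `trtllm-build` CLI (the checkpoint/engine paths are placeholders, and the flag may not exist in other releases):

```bash
# Rebuild the engine with the custom all-reduce plugin disabled,
# falling back to NCCL for the cross-GPU reductions.
# Paths are placeholders.
trtllm-build \
  --checkpoint_dir ./tllm_checkpoint_2gpu \
  --output_dir ./tllm_engine_2gpu \
  --use_custom_all_reduce disable
```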

Hi @nv-guomingz, disabling `use_custom_all_reduce` solved the reported issue with TRT-LLM 0.11.0, so this specific issue can be closed. As a last question: I saw that in recent versions of TRT-LLM...

@vonchenplus thanks for the answer. In my case I'm not using a _very_ low temperature; it is `0.1`.
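For context, the generation setup looks roughly like this; a sketch assuming the stock TRT-LLM example runner and its `--temperature` flag (engine/tokenizer paths and the prompt are placeholders):

```bash
# Generate with temperature 0.1: low, but not the near-zero/greedy
# setting that the linked discussion is about.
python examples/run.py \
  --engine_dir ./tllm_engine \
  --tokenizer_dir ./tokenizer \
  --input_text "Hello" \
  --temperature 0.1 \
  --max_output_len 128
```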

@PerkzZheng @Tracin thanks! I can confirm that disabling `use_fp8_context_fmha` solves the issue. If we run tests on Llama 7B, I will update here. Regarding the decrease in quality of...
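For reference, the workaround is a one-flag change at build time; a sketch assuming the same `trtllm-build` CLI as above (checkpoint/engine paths are placeholders):

```bash
# Build the FP8 engine with FP8 context FMHA disabled as a workaround;
# the context phase then falls back to the non-FP8 fused attention kernel.
trtllm-build \
  --checkpoint_dir ./tllm_checkpoint_fp8 \
  --output_dir ./tllm_engine_fp8 \
  --use_fp8_context_fmha disable
```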