1096125073

Results 8 comments of 1096125073

Hi, TeleChat2 is a language model developed independently by China Telecom AI Company. This PR was raised to make it easier for users to use the superior AWQ algorithm.

https://huggingface.co/Tele-AI/TeleChat2-7B-32K

Is there any way to ensure that the engine generated by each build is identical? This is important for engineering deployment.
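One way to check whether two builds produced the same engine is to compare file digests. The sketch below is only an illustration: the helper names `file_sha256` and `engines_identical` are hypothetical, and it assumes each engine is a single file on disk. It detects differences between builds; it does not make the build itself deterministic.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def engines_identical(path_a, path_b):
    """Hypothetical helper: True if both files have byte-identical contents."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Calling `engines_identical` on the outputs of two separate builds would show whether they match byte for byte.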

I have disabled custom_all_reduce when building the engine.

> Hi @1096125073, different batch sizes may lead to different kernels, so the results can be different. This is a known issue.

Thank you for your answer! I'm...

> @1096125073 Yes, I get your point: if you repeat the same input prompt 4 times and run it as a batch, the outputs are different from those at batch size 1. Unfortunately, it's...
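A likely reason that different kernels or batch sizes can change the output is that floating-point addition is not associative, so accumulating the same values in a different order can give a different result. The snippet below is a minimal standalone illustration in plain Python. It is not the engine's actual kernel code, and `sequential_sum` is a hypothetical helper.

```python
def sequential_sum(values):
    """Add the values strictly left to right in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers in two different orders.
in_order = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]

# At magnitude 1e16 the spacing between doubles is 2, so the first 1.0 is
# rounded away when it is added to 1e16. After reordering, the large values
# cancel first and both 1.0 terms survive.
print(sequential_sum(in_order))   # 1.0
print(sequential_sum(reordered))  # 2.0
```

Small differences of this kind can change which token is selected during generation, after which the outputs diverge.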

> @1096125073 Do you use multiple GPUs? If so, you can set NCCL_ALGO=Tree to ensure a stable reduce order. NCCL usually selects the Ring algorithm, which has an unstable reduce order,...
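The suggestion above amounts to setting an environment variable before the multi-GPU process starts. A minimal sketch follows, assuming the job is launched from Python. The helper name `env_with_tree_allreduce` and the launch command in the comment are hypothetical; only the `NCCL_ALGO=Tree` setting comes from the comment itself.

```python
import os
import subprocess


def env_with_tree_allreduce(base_env=None):
    """Return a copy of the environment with NCCL_ALGO set to Tree.

    The variable has to be present in the environment of the process that
    initializes NCCL, so set it before launching the workers.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_ALGO"] = "Tree"
    return env


def launch(command, base_env=None):
    """Run a command (a list of arguments) with NCCL_ALGO=Tree in its environment."""
    return subprocess.run(command, env=env_with_tree_allreduce(base_env), check=True)


# Hypothetical usage; replace the command with your own multi-GPU launch line:
# launch(["mpirun", "-n", "2", "python", "run.py"])
```

Copying the environment rather than mutating `os.environ` keeps the setting scoped to the launched process.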