Kaixiong Happy issues

Results: 3 issues of Kaixiong Happy

Update tensor_parallel.py

Resolves abnormal conversation output from the Baichuan large model by fixing a bug in the `norm_head` adaptation for Baichuan.

Fixes https://github.com/huggingface/text-generation-inference/issues/2780

Reference: https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/modeling_baichuan.py#:~:text=self.weight.data%20%3D%20nn.functional.normalize(self.weight)

![image](https://github.com/user-attachments/assets/76a821b6-e998-43d3-b0f6-ebc1f7614c00)

@OlivierDehaene OR @Narsil
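For context, the `modeling_baichuan.py` line linked above applies `nn.functional.normalize` to the head weight, which with default arguments L2-normalizes each row of a 2-D tensor. The following is only a minimal pure-Python sketch of that idea, assuming a weight laid out as one row per vocabulary entry. It is not the code from `tensor_parallel.py` or from Baichuan, and the names `normalize_rows` and `head_logits` are hypothetical.

```python
import math


def normalize_rows(weight, eps=1e-12):
    """L2-normalize each row of a 2-D weight (list of lists).

    Mirrors the default behaviour of nn.functional.normalize (p=2, dim=1):
    each row is divided by max(its L2 norm, eps).
    """
    normalized = []
    for row in weight:
        norm = math.sqrt(sum(x * x for x in row))
        denom = max(norm, eps)  # guards against division by zero
        normalized.append([x / denom for x in row])
    return normalized


def head_logits(hidden, weight):
    """Project one hidden vector onto every row of the head weight."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in weight]


# Toy values chosen for illustration only (assumption, not model data).
weight = [[3.0, 4.0], [0.0, 2.0]]
hidden = [1.0, 1.0]

raw = head_logits(hidden, weight)                     # head applied as-is
normed = head_logits(hidden, normalize_rows(weight))  # head with row-normalized weight
```

With these toy values the two heads rank the tokens with very different margins, which suggests why skipping the normalization step could plausibly degrade generation quality.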

I encountered the same issue while using `baichuan2-13B-chat`.

I encountered the same issue while using `baichuan2-13B-chat`. I extracted the chat parameters from Baichuan2's [generation_config.json](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/generation_config.json), and when I call the TGI interface, the result is as follows. ![image](https://github.com/huggingface/text-generation-inference/assets/46644537/7e7f561c-f31e-43a0-8696-7ecd65fba9c5) When...

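While using this project, I fixed the upload failure by following the issue below. Would you consider making a similar change so that a new release resolves the upload problem?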

https://github.com/houtianze/bypy/issues/741#issuecomment-3317038789