Kaixiong Happy

Results 3 issues of Kaixiong Happy

Resolve the issue of abnormal conversation performance in the Baichuan large model.

# Fix the bug in the norm_head adaptation for Baichuan.

Fixes https://github.com/huggingface/text-generation-inference/issues/2780

Reference: https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/modeling_baichuan.py#:~:text=self.weight.data%20%3D%20nn.functional.normalize(self.weight)

![image](https://github.com/user-attachments/assets/76a821b6-e998-43d3-b0f6-ebc1f7614c00)

@OlivierDehaene OR @Narsil
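For context, the Baichuan2 line highlighted in the link above (`self.weight.data = nn.functional.normalize(self.weight)`) rescales each row of the output-head weight to unit L2 norm before the logits are computed. Below is a minimal sketch of that idea in plain Python, without PyTorch. The helper names and toy values are hypothetical, and this is not the TGI patch itself; it only illustrates why skipping the normalization changes the logits.

```python
import math


def l2_normalize_rows(weight, eps=1e-12):
    """Scale each row to unit L2 norm; the norm is clamped below at eps."""
    normalized = []
    for row in weight:
        norm = math.sqrt(sum(x * x for x in row))
        denom = max(norm, eps)  # avoid dividing by zero for an all-zero row
        normalized.append([x / denom for x in row])
    return normalized


def norm_head_logits(hidden, weight):
    """Project one hidden-state vector onto row-normalized head weights."""
    unit_rows = l2_normalize_rows(weight)
    return [sum(h * w for h, w in zip(hidden, row)) for row in unit_rows]


# Toy head: 3 "vocabulary" rows, hidden size 2 (made-up values).
toy_weight = [[3.0, 4.0], [0.0, 2.0], [10.0, 0.0]]
toy_hidden = [1.0, 1.0]

# Without normalization, rows with a large norm dominate the logits.
plain_logits = [sum(h * w for h, w in zip(toy_hidden, row)) for row in toy_weight]

# With normalization, only the direction of each row matters.
normed_logits = norm_head_logits(toy_hidden, toy_weight)

print("plain :", plain_logits)
print("normed:", normed_logits)
```

In this toy case the highest-scoring row changes once the weights are normalized, which is the kind of mismatch that could produce degraded chat output if a serving stack loads the head weights without applying the same step.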

I encountered the same issue while using `baichuan2-13B-chat`. I extracted the chat parameters from Baichuan2's [generation_config.json](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/generation_config.json), and when I call the TGI interface, the result is as follows. ![image](https://github.com/huggingface/text-generation-inference/assets/46644537/7e7f561c-f31e-43a0-8696-7ecd65fba9c5) When...