text-generation-inference

Large Language Model Text Generation Inference
### System Info I tried to serve llama3.1-8b using TGI on an A10 (24G) with a 4k context length. Command: ``` docker run --gpus all -it --rm -p 8000:80 ghcr.io/huggingface/text-generation-inference:3.0.0 --model-id NousResearch/Meta-Llama-3.1-8B-Instruct...
# What does this PR do? This PR installs the `text-generation-server` Python requirements from an exported `requirements.txt`-like file generated out of the `poetry.lock`, to be able to reuse the generated...
Resolve abnormal conversation behavior in the Baichuan large model. # Fix the bug in the norm_head adaptation for Baichuan. Fixes https://github.com/huggingface/text-generation-inference/issues/2780 https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat/blob/main/modeling_baichuan.py#:~:text=self.weight.data%20%3D%20nn.functional.normalize(self.weight) @OlivierDehaene OR @Narsil
### System Info none ### Information - [ ] Docker - [ ] The CLI directly ### Tasks - [ ] An officially supported command - [ ] My own...
### System Info latest docker pull, --version says: `text-generation-launcher 3.0.0` model used: https://huggingface.co/AI-Safeguard/Ivy-VL-llava ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially...
# What does this PR do? Fixes #2641 ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if...
### System Info System: `Linux 4.18.0-553.22.1.el8_10.x86_64 #1 SMP Wed Sep 25 09:20:43 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux` `Rocky Linux 8.10` Model: `mistralai/Mistral-Nemo-Instruct-2407` Hardware: * GPU: `NVIDIA A100-SXM4-80GB` * CPU:...
### System Info **Platform:** Dell 760xa with 4x L40S GPUs **OS Description:** Ubuntu 22.04.5 LTS **GPU:** NVIDIA-SMI 550.90.07 Driver Version: 550.90.07 CUDA Version: 12.4 **Python:** 3.10.12 **Docker:** 26.1.5 **Model:** [Deploy...
### System Info docker version: ghcr.io/huggingface/text-generation-inference:sha-d2ed52f model: Qwen2.5-1.5B-Instruct (tested on Qwen2.5-32B-Instruct as well) ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially...