
Large Language Model Text Generation Inference

Results 639 text-generation-inference issues

### System Info I was trying to run CohereForAI/c4ai-command-r-v01 with these commands: model= CohereForAI/c4ai-command-r-v01 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run docker...

### System Info Docker image: TGI 1.4.5 Target: x86_64-unknown-linux-gnu Cargo version: 1.75.0 Commit sha: 4ee0a0c4010b6e000f176977648aa1749339e8cb Docker label: sha-4ee0a0c nvidia-smi: N/A ### Information - [X] Docker - [ ] The CLI...

I want to use it from Python, but it is very slow. I want it to run quickly.

### Feature request The FP6 quantization announced by DeepSpeed ### Motivation Lower latency, higher throughput, less memory ### Your contribution https://github.com/usyd-fsalab/fp6_llm/blob/main/tests/python/kernel_test.py Seems to be low-hanging fruit Can open...

Stale

### Feature request This is just a question (if no, then it is a request): Does the Exllama V2 support include continuous batching? That's the only thing I find missing in...

Stale

### System Info I noticed text-generation-inference is providing incorrect log-probabilities in the details whenever top-p sampling is used. I'm running text-generation-inference via docker, using this k8s pod definition on a...

Stale

### System Info version latest, sha-7dbaf9e ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own...

Stale

### System Info Docker Image: ghcr.io/huggingface/text-generation-inference:sha-1734540 Instance: AWS A10G via Hugging Face Inference Endpoint ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially...

I'm using TGI with Flan-T5 to process thousands of text extraction requests at a time, on a 4 x A6000 machine. My client class, which uses `AsyncInferenceClient`, can handle 900...