text-generation-inference

Large Language Model Text Generation Inference

Results 639 text-generation-inference issues

As the title indicates, I'd be interested in understanding whether this is just for text generation or whether it could also be used to expose the embedding function.

feature request

Hi, I'm really new at trying out this stuff, so perhaps I'm just missing something. I can't seem to get any models with .safetensor files to work; there's always an...

I'd like to put some effort into getting this to run on my RX6900XT, could you suggest what areas of this codebase I'd need to review/revise to get AMD /...

### System Info
text-generation-inference: 0.6.0
Vicuna 13B
Bottlerocket OS
A10 GPU
EKS
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported...

`Optional` is not a valid type on its own because it needs a type parameter.
# What does this PR do?
Fix typing error due to `Optional`.
## Before submitting
- [ ] This...
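
The typing point above can be illustrated with a minimal sketch. The function `truncate` and its parameter `max_length` are hypothetical names chosen for illustration and are not taken from the PR or the codebase. The sketch assumes the problem was a bare `Optional` annotation, which static type checkers generally reject because `Optional` expects exactly one type argument.

```python
from typing import Optional


def truncate(text: str, max_length: Optional[int] = None) -> str:
    """Return `text` cut to `max_length` characters, or unchanged if no limit is given.

    `Optional[int]` means "an int or None". Writing just `Optional`
    (without the bracketed type) leaves the annotation incomplete.
    """
    if max_length is None:
        return text
    return text[:max_length]
```

Calling `truncate("hello", 3)` applies the limit, while `truncate("hello")` leaves the string untouched.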

# What does this PR do?
This PR fixes a typing error at `def batch_type`.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you...

### System Info
Target: x86_64-unknown-linux-gnu
Cargo version: 1.69.0
Commit sha: 22c4fd07abe9d499cd8eda807f389084773124bd
Docker label: sha-22c4fd0
nvidia-smi: Mon May 15 09:02:10 2023
NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8...

### Feature request
I'd like to run this on CPU
### Motivation
Proof of concept
### Your contribution
not sure if I'm doing something wrong or if the codebase is...

Similar to `bad_words_ids` in transformers, it would be useful to be able to pass a set of tokens that are never sampled during generation.

feature request
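
The request above could be sketched roughly as follows. This is an illustrative assumption of how banned tokens might be excluded, not the text-generation-inference API or the transformers implementation: the names `mask_banned_tokens` and `greedy_pick` are hypothetical, and the sketch only handles single token ids, whereas `bad_words_ids` in transformers also accepts multi-token sequences.

```python
import math
from typing import List, Set


def mask_banned_tokens(logits: List[float], banned_token_ids: Set[int]) -> List[float]:
    """Return a copy of `logits` where every banned token id is set to -inf.

    A logit of -inf gives the token zero probability after softmax,
    so it can never be chosen by sampling or by greedy decoding.
    """
    return [
        -math.inf if token_id in banned_token_ids else logit
        for token_id, logit in enumerate(logits)
    ]


def greedy_pick(logits: List[float]) -> int:
    """Return the index of the largest logit (greedy decoding step)."""
    return max(range(len(logits)), key=lambda token_id: logits[token_id])


# Hypothetical vocabulary of three tokens; token 1 would normally win.
example_logits = [0.1, 2.5, 1.0]
masked = mask_banned_tokens(example_logits, banned_token_ids={1})
next_token = greedy_pick(masked)
```

With token 1 banned, the greedy step falls back to the best remaining token instead. Masking before sampling keeps the rest of the decoding loop unchanged, which is why this kind of filter is usually applied directly to the logits.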

### Feature request
Can this API support the Assisted Generation feature from https://huggingface.co/blog/assisted-generation?
### Motivation
Better latency.
### Your contribution
Yes..