text-generation-inference
How to serve local models with the Python package (not Docker)
System Info
`pip install text-generation` (version 0.6.0)
I need to use the Python package, not Docker.
Information
- [ ] Docker
- [ ] The CLI directly
Tasks
- [ ] An officially supported command
- [ ] My own modifications
Reproduction
```python
from text_generation import Client

# Initialize the client
client = Client("/path/to/model/locally")

# Generate text
response = client.generate("Your input text here")
```
error:

```
MissingSchema: Invalid URL '/path/to/model/locally': No scheme supplied. Perhaps you meant https:///path/to/model/locally?
```
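From what I can tell from the README, `Client` expects the base URL of a running TGI server rather than a local model path, so something like the sketch below works only if a server is already running (the URL and `max_new_tokens` value here are just placeholders):

```python
from text_generation import Client

# Point the client at a running TGI server, not a filesystem path
client = Client("http://127.0.0.1:8080")

# Generate text against that server
response = client.generate("Your input text here", max_new_tokens=20)
print(response.generated_text)
```

But that still doesn't cover my case, since I want to load the model locally without a server.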
I also tried the following with models from the Hugging Face Hub as well as local models, and it doesn't work either:
```python
from text_generation import InferenceAPIClient

client = InferenceAPIClient("NousResearch/Meta-Llama-3.1-8B-Instruct")

text = client.generate("Why is the sky blue?").generated_text
print(text)
# ' Rayleigh scattering'

# Token Streaming
text = ""
for response in client.generate_stream("Why is the sky blue?"):
    if not response.token.special:
        text += response.token.text
print(text)
```
error:

```
NotSupportedError: Model `NousResearch/Meta-Llama-3.1-8B-Instruct` is not available for inference with this client.
Use `huggingface_hub.inference_api.InferenceApi` instead.
```
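Following the error's suggestion, I can reach hosted models through `huggingface_hub` instead; a minimal sketch with the newer `InferenceClient` (assuming the model is actually served on the HF Inference API, and this still doesn't touch local weights):

```python
from huggingface_hub import InferenceClient

# Calls the hosted Inference API; does not load a local model
client = InferenceClient(model="NousResearch/Meta-Llama-3.1-8B-Instruct")
print(client.text_generation("Why is the sky blue?", max_new_tokens=64))
```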
Expected behavior
- I can load any model (local or from the HF Hub)
There is no pip-installable package for TGI at the moment. Use vLLM if that is what you need.
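A rough sketch of the equivalent with vLLM, which accepts either a Hub model ID or a local directory (untested here, and the model ID is just the one from your example):

```python
from vllm import LLM, SamplingParams

# "model" can be a Hugging Face Hub ID or a path to locally downloaded weights
llm = LLM(model="NousResearch/Meta-Llama-3.1-8B-Instruct")
params = SamplingParams(max_tokens=64)

outputs = llm.generate(["Why is the sky blue?"], params)
print(outputs[0].outputs[0].text)
```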
@nbroad1881 If I install TGI from source, can I then run it with local or any HF models?