text-generation-inference

How to serve local models with python package (not docker)

Open hahmad2008 opened this issue 1 year ago • 2 comments

System Info

pip install text-generation, version 0.6.0. I need to use the Python package, not Docker.

Information

  • [ ] Docker
  • [ ] The CLI directly

Tasks

  • [ ] An officially supported command
  • [ ] My own modifications

Reproduction

from text_generation import Client

# Initialize the client
client = Client("/path/to/model/locally")

# Generate text
response = client.generate("Your input text here")

error:

MissingSchema: Invalid URL '/path/to/model/locally': No scheme supplied. Perhaps you meant https:///path/to/model/locally?
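
For context: the Client class in the text-generation package is an HTTP client, so it expects the base URL of an already running TGI server (e.g. one started with text-generation-launcher --model-id /path/to/model/locally), not a filesystem path. A minimal sketch against a local server, assuming it listens on port 8080:

from text_generation import Client

# Point the client at a running TGI server, not at the model files themselves
client = Client("http://127.0.0.1:8080")

# Generate text from the served model
response = client.generate("Your input text here", max_new_tokens=50)
print(response.generated_text)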

I also tried the following with some models from the Hugging Face Hub as well as local models, and it doesn't work either!

from text_generation import InferenceAPIClient
client = InferenceAPIClient("NousResearch/Meta-Llama-3.1-8B-Instruct")
text = client.generate("Why is the sky blue?").generated_text
print(text)
# ' Rayleigh scattering'

# Token Streaming
text = ""
for response in client.generate_stream("Why is the sky blue?"):
    if not response.token.special:
        text += response.token.text

print(text)

error:

NotSupportedError: Model `NousResearch/Meta-Llama-3.1-8B-Instruct` is not available for inference with this client. 
Use `huggingface_hub.inference_api.InferenceApi` instead.
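
The error message points at the Inference API client in huggingface_hub. A hedged alternative sketch using huggingface_hub.InferenceClient (whether this particular model is actually served on the hosted Inference API is not guaranteed):

from huggingface_hub import InferenceClient

# Query the hosted Inference API directly instead of InferenceAPIClient
client = InferenceClient(model="NousResearch/Meta-Llama-3.1-8B-Instruct")
print(client.text_generation("Why is the sky blue?", max_new_tokens=50))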

Expected behavior

  • I can load any model (local or from the HF Hub)

hahmad2008 · Sep 20, 2024

There is no pip-installable package for TGI at the moment. Use vLLM if that is what you need.
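
For illustration, a minimal vLLM sketch that loads a model in-process from a local path or a Hub id (the path and prompt here are placeholders):

from vllm import LLM, SamplingParams

# Load a model from a local directory or a Hugging Face Hub id
llm = LLM(model="/path/to/model/locally")

# Generate a completion for a single prompt
outputs = llm.generate(["Why is the sky blue?"], SamplingParams(max_tokens=50))
print(outputs[0].outputs[0].text)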

nbroad1881 · Sep 23, 2024

@nbroad1881 If I install TGI from source, can I then run it with local or any HF models?

hahmad2008 · Sep 26, 2024