text-generation-inference
Weird behavior when using the Helsinki-NLP/opus-mt-en-ar model
System Info
Model being used: Helsinki-NLP/opus-mt-en-ar
Hardware used: A100
Deployment specificities: Deployed using TGI and queried with the InferenceClient class from huggingface_hub
Information
- [ ] Docker
- [ ] The CLI directly
Tasks
- [ ] An officially supported command
- [ ] My own modifications
Reproduction
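from huggingface_hub import InferenceClient

url = "localhost"  # placeholder: host running the TGI container
port = 1111        # host port mapped in the docker command below

client = InferenceClient(f"http://{url}:{port}")
results = client.text_generation(
    ">>ara<< We will be going tomorrow ",
)
print(results)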
Expected behavior
I deployed the model using TGI with the following command:
docker run --name en_ar_translation --gpus device=0 --shm-size 1g -p 1111:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.4 --model-id $model --trust-remote-code --max-input-length 20 --max-total-tokens 128
The Docker container started, but whenever I query the model, it returns a random word repeated over and over. When I deploy or test the same model using the Pipeline API, it performs correctly.
I know the architecture is not supported by TGI, but according to the documentation, that should only mean I cannot shard the model or use Flash Attention. Would it also cause such a severe drop in output quality?
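To make the "random word repeated over and over" symptom easier to report, here is a minimal sketch that compares the TGI output with a reference output for the same prompt and flags degenerate repetition. The helper names (`looks_degenerate`, `compare_outputs`), the 0.6 threshold, and the idea of passing both backends as plain callables are assumptions for illustration, not part of TGI or huggingface_hub.

```python
from collections import Counter
from typing import Callable


def looks_degenerate(text: str, threshold: float = 0.6, min_words: int = 4) -> bool:
    """Return True when one whitespace-separated word dominates the text.

    The threshold and minimum length are arbitrary illustrative values.
    """
    words = text.split()
    if len(words) < min_words:
        return False  # too short to judge
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) >= threshold


def compare_outputs(
    prompt: str,
    tgi_generate: Callable[[str], str],
    reference_generate: Callable[[str], str],
) -> dict:
    """Run the same prompt through both backends and flag repeated-word output."""
    tgi_text = tgi_generate(prompt)
    ref_text = reference_generate(prompt)
    return {
        "prompt": prompt,
        "tgi": tgi_text,
        "reference": ref_text,
        "tgi_degenerate": looks_degenerate(tgi_text),
        "reference_degenerate": looks_degenerate(ref_text),
    }
```

In practice, `tgi_generate` could wrap `client.text_generation` from the reproduction above, and `reference_generate` could wrap a local pipeline call, assuming the Pipeline API mentioned here is the one from transformers. A report where only `tgi_degenerate` is true would support the claim that the problem lies in the serving path rather than in the model weights.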
I just noticed that summarization tasks of any kind do not work with TGI. If you deploy a summarization model using TGI, the output is basically gibberish. The only workaround I found is the following:
client = InferenceClient(model="http://0.0.0.0:8000")
client.summarization(
    text="TEST",
    model="facebook/bart-large-cnn",
)
You need to pass the model that you want to use explicitly. This downloads the model locally, rendering the container that was deployed using TGI useless.
Is summarization not supported by TGI?
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.