
[Startup Plan]: Failed to launch GPU inference

Open Matthieu-Tinycoaching opened this issue 4 years ago • 3 comments

Hi community,

I have subscribed to a 7-day free trial of the Startup Plan and would like to test the GPU inference API on this model: https://huggingface.co/Matthieu/stsb-xlm-r-multilingual-custom

However, when I run the code below:

import json
import requests

API_URL = "https://api-inference.huggingface.co/models/Matthieu/stsb-xlm-r-multilingual-custom"
headers = {"Authorization": "Bearer API_ORG_TOKEN"}

def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

payload1 = {"inputs": "Navigateur Web : Ce logiciel permet d'accéder à des pages web depuis votre ordinateur. Il en existe plusieurs téléchargeables gratuitement comme Google Chrome ou Mozilla. Certains sont même déjà installés comme Safari sur Mac OS et Edge sur Microsoft.", "options": {"use_cache": False, "use_gpu": True}}

sentence_embeddings1 = query(payload1)
print(sentence_embeddings1)

I got the following error: {'error': 'Model Matthieu/stsb-xlm-r-multilingual-custom is currently loading', 'estimated_time': 44.490336920000004}

Do I have to wait some time until the model is loaded for GPU inference?

Thanks!

Matthieu-Tinycoaching avatar Jun 09 '21 11:06 Matthieu-Tinycoaching

Maybe of interest to @Narsil

LysandreJik avatar Jun 09 '21 15:06 LysandreJik

Hi @Matthieu-Tinycoaching This is linked to: huggingface/api-inference-community#26

Community images do not implement:

  • private models
  • GPU inference
  • Acceleration

So what you are seeing is normal and expected. If you don't mind, let's keep the discussion over there, as all three are related.

Narsil avatar Jun 09 '21 15:06 Narsil

Hi @Narsil thanks for the feedback.

However, I still don't understand how I can test the accelerated inference API (CPU+GPU) on my custom public model.

What can actually be tested with the accelerated inference API, and what benefit should I expect from the Startup Plan free trial?

Matthieu-Tinycoaching avatar Jun 09 '21 16:06 Matthieu-Tinycoaching
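
On the narrower question of the "currently loading" response: the error body in the original post carries an `estimated_time`, so a client can wait and retry instead of failing on the first call. Below is a minimal sketch of that pattern, not an official client. `query_with_retry` and `send` are hypothetical names, the 60-second cap is arbitrary, and the assumption that the loading message arrives with an HTTP error status (503) is mine. It uses `urllib` only to stay dependency-free; `requests`, as in the original post, would work the same way. Per the reply above, retrying does not enable GPU inference on community images; it only handles the model warm-up.

```python
import json
import time
import urllib.error
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/Matthieu/stsb-xlm-r-multilingual-custom"
HEADERS = {"Authorization": "Bearer API_ORG_TOKEN"}  # placeholder token, as in the original post


def send(payload):
    """POST the payload and return the decoded JSON body, including for HTTP error statuses."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={**HEADERS, "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Assumption: the "currently loading" message comes with an error status
        # (e.g. 503), so the JSON body has to be read from the exception.
        return json.loads(err.read().decode("utf-8"))


def query_with_retry(payload, send=send, sleep=time.sleep, max_retries=5):
    """Hypothetical helper: retry while the API reports that the model is loading."""
    result = None
    for _ in range(max_retries):
        result = send(payload)
        still_loading = (
            isinstance(result, dict)
            and "error" in result
            and "estimated_time" in result
        )
        if not still_loading:
            return result
        # Wait roughly as long as the API suggests, capped at 60 s per attempt.
        sleep(min(float(result["estimated_time"]), 60.0))
    # Still loading after max_retries attempts: return the last error body.
    return result
```

Calling `query_with_retry({"inputs": "some sentence", "options": {"use_cache": False}})` would then block until the model answers or the retry budget runs out. If I remember the documentation correctly, the API also accepted a `wait_for_model` option that asks the server to hold the request rather than return the loading error, which may be simpler when it is available.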