api-inference-community [Startup Plan]: Failed to launch GPU inference

Hi community,

I have subscribed to a 7-day free trial of the Startup Plan and would like to test the GPU inference API on this model: https://huggingface.co/Matthieu/stsb-xlm-r-multilingual-custom

However, when I run the code below:

import json
import requests

API_URL = "https://api-inference.huggingface.co/models/Matthieu/stsb-xlm-r-multilingual-custom"
headers = {"Authorization": "Bearer API_ORG_TOKEN"}

def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

payload1 = {"inputs": "Navigateur Web : Ce logiciel permet d'accéder à des pages web depuis votre ordinateur. Il en existe plusieurs téléchargeables gratuitement comme Google Chrome ou Mozilla. Certains sont même déjà installés comme Safari sur Mac OS et Edge sur Microsoft.", "options": {"use_cache": False, "use_gpu": True}}

sentence_embeddings1 = query(payload1)
print(sentence_embeddings1)

I got the following error: {'error': 'Model Matthieu/stsb-xlm-r-multilingual-custom is currently loading', 'estimated_time': 44.490336920000004}

Do I have to wait some time until the model is loaded for GPU inference?
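
For reference, here is a minimal sketch of how I could retry while the model is loading. It assumes the loading response keeps the shape shown in the error above (a dict with 'error' and 'estimated_time', which I take to be in seconds). The names query_with_retry, default_send and max_retries are my own and not part of the API, and this only handles the "currently loading" response; it says nothing about whether GPU inference is actually used.

```python
import time

API_URL = "https://api-inference.huggingface.co/models/Matthieu/stsb-xlm-r-multilingual-custom"
HEADERS = {"Authorization": "Bearer API_ORG_TOKEN"}


def default_send(payload):
    """POST the payload to the Inference API and return the decoded JSON body."""
    import requests  # imported lazily so the retry logic can be tested without it

    response = requests.post(API_URL, headers=HEADERS, json=payload)
    return response.json()


def query_with_retry(payload, send=default_send, sleep=time.sleep, max_retries=5):
    """Call the API, waiting and retrying while it reports the model is loading."""
    result = None
    for attempt in range(max_retries):
        result = send(payload)
        # Assumption: a loading response is a dict that carries "estimated_time".
        loading = isinstance(result, dict) and "estimated_time" in result
        if not loading:
            return result
        if attempt < max_retries - 1:
            sleep(result["estimated_time"])
    # Still loading after max_retries attempts: return the last response as-is.
    return result
```

Calling query_with_retry(payload1) in place of query(payload1) would then wait for the estimated time between attempts instead of returning the loading error straight away.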

Thanks!

Jun 09 '21 11:06 Matthieu-Tinycoaching

Maybe of interest to @Narsil

Jun 09 '21 15:06 LysandreJik

Hi @Matthieu-Tinycoaching, this is linked to huggingface/api-inference-community#26.

Community images do not implement:

private models
GPU inference
Acceleration

So what you are seeing is expected. If you don't mind, let's keep the discussion over there, as all three are related.

Jun 09 '21 15:06 Narsil

Hi @Narsil, thanks for the feedback.

However, I don't understand how I can test the accelerated inference API (CPU+GPU) on my custom public model.

What can actually be tested with the accelerated inference API, and what do I get from the free trial of the Startup Plan?

Jun 09 '21 16:06 Matthieu-Tinycoaching