[WIP] InferenceClient.post is deprecated, but Sentence Ranking tasks are not implemented

Open Copilot opened this issue 9 months ago • 0 comments

Thanks for assigning this issue to me. I'm starting to work on it and will keep this PR's description up to date as I form a plan and make progress.

Original issue description:

Describe the bug

InferenceClient still has no task methods for many of the tasks that can be hosted on Inference Endpoints (sentence ranking among them), yet calling .post to work around the gap now raises a deprecation warning.

Reproduction

from huggingface_hub import InferenceClient, get_inference_endpoint
import json

# Get endpoint and create client
MODEL_NAME = "YOUR_MODEL_NAME_OR_ENDPOINT_NAME"
NAMESPACE = "YOUR_NAMESPACE"
endpoint = get_inference_endpoint(MODEL_NAME, namespace=NAMESPACE)
client = InferenceClient(endpoint.url, timeout=10)

# Test data
query = "What is the capital of France?"
document = "Paris is the capital of France."
sentence_ranking_style_inputs = [[query, document]]
text_classification_style_inputs = [{"text": query, "text_pair": document}]

# 1. Using post method
response_bytes = client.post(json={"inputs": sentence_ranking_style_inputs})
print(json.loads(response_bytes))
# Problem: this works, but post() emits a deprecation warning

# 2. Using text_classification task
try:
    # There is no way to pass the input format that would work for reranking through this task method.
    result = client.text_classification(text_classification_style_inputs)
    print(result)
except Exception as e:
    print(f"text_classification error: {e}")
# Problem: text_classification doesn't properly support text pairs format needed for reranking/cross-encoding

# 3. Using sentence_similarity task
try:
    # There is no direct way to pass the input format that would work for reranking through this task method.
    result = client.sentence_similarity(sentence_ranking_style_inputs)
    print(result)
except Exception as e:
    print(f"sentence_similarity error: {e}")
# Problem: No direct support for sentence ranking despite endpoint supporting this task

Ideally, InferenceClient would support the sentence_ranking task directly, so that reranking does not depend on the deprecated .post method.
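Until such a task method exists, one possible stopgap is to send the same payload to the endpoint URL directly, bypassing InferenceClient.post. The sketch below is only an illustration, not part of huggingface_hub: make_http_sender and rank_documents are hypothetical names, the Bearer token header is an assumption about how the endpoint is secured, and the response is assumed to be a flat JSON list with one score per [query, document] pair, which may not match what a given endpoint returns.

```python
import json
import urllib.request
from typing import Callable, List, Sequence, Tuple

# Hypothetical helpers for illustration; none of these names exist in huggingface_hub.
Sender = Callable[[dict], bytes]


def make_http_sender(url: str, token: str, timeout: float = 10.0) -> Sender:
    """Return a function that POSTs a JSON payload to `url` and returns the raw body."""

    def send(payload: dict) -> bytes:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                # ASSUMPTION: the endpoint expects a Bearer token.
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()

    return send


def rank_documents(
    query: str, documents: Sequence[str], send: Sender
) -> List[Tuple[str, float]]:
    """Score each document against the query and return them best-first."""
    # Same [[query, document], ...] shape as sentence_ranking_style_inputs above.
    payload = {"inputs": [[query, doc] for doc in documents]}
    # ASSUMPTION: the response is a flat JSON list of scores, one per pair, in input order.
    scores = json.loads(send(payload))
    if len(scores) != len(documents):
        raise ValueError(
            f"expected {len(documents)} scores, got {len(scores)}"
        )
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked]
```

With the names from the reproduction, a call might look like rank_documents(query, [document], make_http_sender(endpoint.url, token)), where token is a placeholder for whatever credential the endpoint requires. Passing the sender in as a function keeps the payload and ranking logic testable without a live endpoint.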

Logs

System info

- huggingface_hub version: 0.30.2
- Platform: macOS-15.4-arm64-arm-64bit
- Python version: 3.12.10
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: redacted
- Has saved token ?: False
- Configured git credential helpers: redacted
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.7.0
- Jinja2: 3.1.6
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 11.2.1
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.11.4
- aiohttp: 3.11.18
- hf_xet: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: redacted
- HF_ASSETS_CACHE: redacted
- HF_TOKEN_PATH: redacted
- HF_STORED_TOKENS_PATH: redacted
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

Fixes #3055.


May 23 '25 17:05 Copilot