huggingface_hub
[WIP] InferenceClient.post is deprecated, but Sentence Ranking tasks are not implemented
Thanks for assigning this issue to me. I'm starting to work on it and will keep this PR's description up to date as I form a plan and make progress.
Original issue description:
Describe the bug
The InferenceClient API still does not support many of the tasks that can be hosted on Inference Endpoints, but it emits a deprecation warning when .post is used to get around this.
Reproduction
    from huggingface_hub import InferenceClient, get_inference_endpoint
    import json

    # Get endpoint and create client
    MODEL_NAME = "YOUR_MODEL_NAME_OR_ENDPOINT_NAME"
    NAMESPACE = "YOUR_NAMESPACE"
    endpoint = get_inference_endpoint(MODEL_NAME, namespace=NAMESPACE)
    client = InferenceClient(endpoint.url, timeout=10)

    # Test data
    query = "What is the capital of France?"
    document = "Paris is the capital of France."
    sentence_ranking_style_inputs = [[query, document]]
    text_classification_style_inputs = [{"text": query, "text_pair": document}]

    # 1. Using post method
    response_bytes = client.post(json={"inputs": sentence_ranking_style_inputs})
    print(json.loads(response_bytes))
    # Problem: post method has deprecation warning

    # 2. Using text_classification task
    try:
        result = client.text_classification(text_classification_style_inputs)
        # There's no way to inject the inputs format that would have worked on this task for reranking
        print(result)
    except Exception as e:
        print(f"text_classification error: {e}")
    # Problem: text_classification doesn't properly support text pairs format needed for reranking/cross-encoding

    # 3. Using sentence_similarity task
    try:
        result = client.sentence_similarity(sentence_ranking_style_inputs)
        # There's no direct way to inject the inputs format that would have worked on this task for reranking
        print(result)
    except Exception as e:
        print(f"sentence_similarity error: {e}")
    # Problem: No direct support for sentence ranking despite endpoint supporting this task

Ideally, the sentence_ranking task is supported.
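Until a dedicated task method exists, one possible interim workaround is to send the same [[query, document]] payload to the endpoint URL directly, without going through InferenceClient.post. The sketch below is only an illustration under stated assumptions: the helper names (build_rerank_payload, make_http_sender, rank_documents) are hypothetical and not part of huggingface_hub, and it assumes the endpoint returns a flat JSON list with one score per input pair, which may not match a given deployment.

```python
import json
import urllib.request
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical type alias: any callable that sends a payload and returns scores.
Sender = Callable[[dict], List[float]]


def build_rerank_payload(query: str, documents: Sequence[str]) -> dict:
    """Pair the query with each document, mirroring the [[query, document]]
    input shape used in the reproduction."""
    return {"inputs": [[query, doc] for doc in documents]}


def make_http_sender(url: str, token: Optional[str] = None, timeout: float = 10.0) -> Sender:
    """Return a function that POSTs a JSON payload to `url` using only the stdlib."""

    def send(payload: dict) -> List[float]:
        headers = {"Content-Type": "application/json"}
        if token:
            headers["Authorization"] = f"Bearer {token}"
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers=headers,
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # ASSUMPTION: the response body is a JSON list of scores.
            return json.loads(response.read().decode("utf-8"))

    return send


def rank_documents(query: str, documents: Sequence[str], send: Sender) -> List[Tuple[str, float]]:
    """Score each (query, document) pair and return documents sorted by score, highest first."""
    scores = send(build_rerank_payload(query, documents))
    # ASSUMPTION: exactly one score is returned per input pair, in input order.
    if len(scores) != len(documents):
        raise ValueError(f"expected {len(documents)} scores, got {len(scores)}")
    pairs = [(doc, float(score)) for doc, score in zip(documents, scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

With the reproduction's variables, this could be called as rank_documents(query, [document], make_http_sender(endpoint.url, token="YOUR_TOKEN")). Passing the sender as a parameter keeps the payload and ranking logic testable without network access.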
Logs
System info
- huggingface_hub version: 0.30.2
- Platform: macOS-15.4-arm64-arm-64bit
- Python version: 3.12.10
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: redacted
- Has saved token ?: False
- Configured git credential helpers: redacted
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.7.0
- Jinja2: 3.1.6
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 11.2.1
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.11.4
- aiohttp: 3.11.18
- hf_xet: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: redacted
- HF_ASSETS_CACHE: redacted
- HF_TOKEN_PATH: redacted
- HF_STORED_TOKENS_PATH: redacted
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
Fixes #3055.