huggingface_hub [InferenceClient - Automatic speech recognition] Infer content type header based on file extension

Follow-up to https://github.com/huggingface/huggingface_hub/issues/2706, whose initial intent has still not been addressed. For now, a Content-Type header can be sent explicitly like this:

from huggingface_hub import InferenceClient

# `url` is the URL of the inference endpoint
client = InferenceClient(url, headers={"Content-Type": "audio/mpeg"})
response = client.automatic_speech_recognition("audio.mp3")

It would be even better to infer the Content-Type header automatically from the file extension. This is only possible when the input is passed as a file path, not as raw bytes.

Note: this is not specific to automatic_speech_recognition; it applies to any "binary-only" task (AST, audio-to-audio, image classification, image-to-xxx, etc.). It is only useful when the binary is sent alone as the request body, not when it is base64-encoded and sent alongside other parameters.
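
To illustrate the idea, here is a minimal sketch of how the content type could be inferred from a file extension using Python's standard `mimetypes` module. The helper name `guess_content_type` is hypothetical and is not part of huggingface_hub; the actual implementation may differ.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Union


def guess_content_type(data: Union[str, Path, bytes]) -> Optional[str]:
    """Hypothetical helper: infer a MIME type from a file path.

    Returns None for raw bytes (no extension to inspect) and for
    extensions that `mimetypes` does not recognize.
    """
    if isinstance(data, (bytes, bytearray)):
        # Raw bytes carry no filename, so nothing can be inferred.
        return None
    mime_type, _encoding = mimetypes.guess_type(str(data))
    return mime_type


# Build the headers only when a content type could be inferred.
content_type = guess_content_type("audio.mp3")
headers = {"Content-Type": content_type} if content_type else {}
```

With this approach, callers who pass raw bytes would still need to set the header explicitly, as in the snippet above.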

Feb 03 '25 16:02 Wauplin

Hi @Wauplin🤗, I would love to contribute to this. Looking forward to your response!

Best regards

Feb 07 '25 17:02 WizKnight

Done as part of https://github.com/huggingface/huggingface_hub/pull/3321

Oct 01 '25 09:10 Wauplin