PDF support in Azure Translator synchronous document translation API

Open micander opened this issue 9 months ago • 1 comments

Hello, I am seeking to use the synchronous document translation API to perform PDF translation. I am unable to complete my API requests because the API always responds with {"code":"InvalidFormat","message":"The format parameter is not valid."}. I have tried various source and destination languages and various PDF files, but have had no success. I can translate other formats successfully such as DOCX.

The API documentation doesn't explicitly confirm that the synchronous API supports PDF translation. Is the feature unimplemented, or just currently broken? I have seen another person asking about the same issue on the community support forum a month ago. Thank you.

Example API call: curl -i -X POST "https://eelec-translation-uswest.cognitiveservices.azure.com/translator/document:translate?sourceLanguage=en&targetLanguage=hi&api-version=2023-11-01-preview" -H "Ocp-Apim-Subscription-Key:<omitted>" -H "X-ClientTraceId:pdf-translation-issue-b4e3ae21" -F "document=@<omitted>.pdf;type=application/pdf" -o "out.pdf"

Response:

```
content-type: application/json; charset=utf-8
access-control-expose-headers: X-RequestId,x-ms-request-id,X-RequestId-Forwarded,X-Metered-Usage
x-requestid: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-ms-request-id: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-ms-error-code: InvalidRequest
strict-transport-security: max-age=31536000; includeSubDomains; preload
apim-request-id: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-content-type-options: nosniff
x-ms-region: West US
date: Mon, 29 Apr 2024 04:38:09 GMT
{"error":{"code":"InvalidRequest","message":"The format parameter is not valid.","target":"ContentType","innerError":{"code":"InvalidFormat","message":"The format parameter is not valid."}}}
```
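For anyone scripting against this endpoint, the nested error envelope in the response body can be flattened so that the inner code and the `target` field are easy to log. This is only a minimal sketch based on the single response shown in this issue; the helper name `parse_translator_error` is hypothetical, and other error responses may not have the same shape.

```python
import json


def parse_translator_error(body: str) -> dict:
    """Flatten the error envelope seen in this issue (shape assumed from one sample)."""
    error = json.loads(body).get("error", {})
    inner = error.get("innerError") or {}
    return {
        "code": error.get("code"),          # outer code, e.g. "InvalidRequest"
        "inner_code": inner.get("code"),    # more specific code, e.g. "InvalidFormat"
        "target": error.get("target"),      # which part of the request was rejected
        "message": inner.get("message") or error.get("message"),
    }


# Response body copied from the failing request in this issue.
body = (
    '{"error":{"code":"InvalidRequest","message":"The format parameter is not valid.",'
    '"target":"ContentType","innerError":{"code":"InvalidFormat",'
    '"message":"The format parameter is not valid."}}}'
)
info = parse_translator_error(body)
print(info)
```

Here the `target` of `ContentType` together with the inner `InvalidFormat` code suggests the service rejected the uploaded document's type rather than a query parameter.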

micander • Apr 29 '24 21:04

It appears that PDF is not a supported format in the synchronous API after all: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/overview#synchronous-supported-document-formats

So this ticket becomes a report that the link to the list of supported formats does not go to the correct page. See here: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/reference/synchronous-rest-api-guide#request-body

The "supported document formats" link there points to https://learn.microsoft.com/en-us/azure/ai-services/translator/language-support, but it should point to the synchronous supported document formats section linked above. I was not able to find that section on my own despite searching. This incorrect link should be fixed.

Also, is it possible to know if PDF support is a planned feature for the synchronous API?
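Until then, a client-side guard could avoid sending unsupported files to the synchronous endpoint in the first place. The following is a rough sketch, not an official client: the function names and the `example` endpoint are hypothetical, and the extension set contains only `.docx` because that is the only format confirmed to work in this issue. It would need to be filled in from the supported formats list linked above.

```python
from pathlib import Path
from urllib.parse import urlencode

# Assumption: only .docx is confirmed in this issue; extend from the official list.
SYNC_SUPPORTED_EXTENSIONS = {".docx"}


def is_sync_supported(filename: str) -> bool:
    """Return True if the file extension is in the locally maintained allow-list."""
    return Path(filename).suffix.lower() in SYNC_SUPPORTED_EXTENSIONS


def build_sync_url(endpoint: str, source: str, target: str,
                   api_version: str = "2023-11-01-preview") -> str:
    """Build the same request URL as the curl call in this issue."""
    query = urlencode({
        "sourceLanguage": source,
        "targetLanguage": target,
        "api-version": api_version,
    })
    return f"{endpoint.rstrip('/')}/translator/document:translate?{query}"


# Placeholder endpoint, not a real resource.
url = build_sync_url("https://example.cognitiveservices.azure.com", "en", "hi")
print(url)
print(is_sync_supported("report.pdf"))
```

Checking the extension locally does not replace the service-side validation; it only surfaces the problem before a request is made.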

Thanks

micander • Apr 29 '24 22:04

@micander Thanks for your feedback! We will investigate and update as appropriate.

PesalaPavan • Apr 30 '24 04:04

@micander Thank you for bringing this to our attention. I've delegated this to content author @laujan, who will review it and offer their insightful opinions.

Naveenommi-MSFT • May 02 '24 10:05