deepl-python

Translation error with markup text

Open yichidev opened this issue 2 years ago • 3 comments

When translating text that contains markup via the DeepL API, the newlines (\\n) are moved inside the preceding <c> element, i.e. they end up before the closing </c> tag instead of after it. Please see the following example, where DE is the source and RU is the target :)

DE: Liste der Eigenschaften:\\n* <c id=\"f48c4591-9f64-4ac8-af6a-72228cd50793\">Kamera</c>\\n* <c id=\"882df0f7-39a5-4f99-ae8a-e23a485419ee\">Akku</c>\\n
RU: Список свойств:\\n* <c id=\"f48c4591-9f64-4ac8-af6a-72228cd50793\">Камера\\n*</c> <c id=\"882df0f7-39a5-4f99-ae8a-e23a485419ee\">Аккумулятор\\n</c>

The expectation is that the 2nd and 3rd newlines stay outside the </c> tags after translation.
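The expectation above can be turned into a small check. This is only a sketch: the helper name `newlines_inside_c_tags` is hypothetical, and it assumes the `\\n` sequences in the example stand for real newline characters.

```python
import re
from typing import List

# Captures the body of each <c ...>...</c> element (non-greedy, may span newlines).
C_ELEMENT = re.compile(r"<c\b[^>]*>(.*?)</c>", re.DOTALL)


def newlines_inside_c_tags(text: str) -> List[str]:
    """Return the bodies of all <c> elements that contain a newline."""
    return [body for body in C_ELEMENT.findall(text) if "\n" in body]


# Source and translated strings from the report, with "\n" assumed to be a real newline.
source = (
    'Liste der Eigenschaften:\n'
    '* <c id="f48c4591-9f64-4ac8-af6a-72228cd50793">Kamera</c>\n'
    '* <c id="882df0f7-39a5-4f99-ae8a-e23a485419ee">Akku</c>\n'
)
translated = (
    'Список свойств:\n'
    '* <c id="f48c4591-9f64-4ac8-af6a-72228cd50793">Камера\n*</c> '
    '<c id="882df0f7-39a5-4f99-ae8a-e23a485419ee">Аккумулятор\n</c>'
)

print(newlines_inside_c_tags(source))      # no element body contains a newline
print(newlines_inside_c_tags(translated))  # both element bodies contain a newline
```

An empty list for the source and a non-empty list for the translation is the behaviour described in this report.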

yichidev, Mar 14 '23 10:03

We have raised your issue with the relevant team. Thank you for reporting! We'll keep you posted if we find anything.

seekuehe, Apr 18 '23 09:04

Hey @ChingYi-AX, could you give us the exact request you sent, including the parameters?

DeeJayTC, Apr 19 '23 11:04

@seekuehe @DeeJayTC Thank you very much for your help and sorry for the late reply! Here is the exact request with the parameters:

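import requests

# Translation options shared by every request.
_BASE_PARAMS = {
    "split_sentences": "nonewlines",
    "tag_handling": "xml",
    "non_splitting_tags": "b,c",
}
# OUR_DEEPL_API_KEY is not defined in this snippet; the DeepL API expects
# the header value in the form "DeepL-Auth-Key <key>".
auth_header = {"Authorization": OUR_DEEPL_API_KEY}
texts = ["Liste der Eigenschaften:\\n* <c id=\"f48c4591-9f64-4ac8-af6a-72228cd50793\">Kamera</c>\\n* <c id=\"882df0f7-39a5-4f99-ae8a-e23a485419ee\">Akku</c>\\n"]

data = {"source_lang": "DE", "target_lang": "RU", **_BASE_PARAMS, "text": texts}

# Sent as a form-encoded POST body; the list under "text" becomes repeated text= fields.
response = requests.post(
    "https://api.deepl.com/v2/translate",
    timeout=20,
    headers=auth_header,
    data=data,
)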
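
When comparing requests, it may help to look at what actually goes over the wire. The sketch below uses only the standard library to form-encode a `text` field. It assumes this matches how a form-encoded POST body is built, and the helper name `form_body` is hypothetical. It shows that a Python literal written as `"\\n"` is sent as a backslash followed by the letter n, whereas `"\n"` is sent as a single newline character.

```python
from urllib.parse import urlencode


def form_body(text: str) -> str:
    """Form-encode one `text` value, passed as a list like in the request above."""
    return urlencode({"text": [text]}, doseq=True)


real_newline = "Kamera</c>\n* "          # contains one newline character
literal_backslash_n = "Kamera</c>\\n* "  # contains a backslash and the letter n

print(form_body(real_newline))           # newline is percent-encoded as %0A
print(form_body(literal_backslash_n))    # backslash is percent-encoded as %5C, followed by n
```

Which of the two forms the original request contained cannot be told from the snippet alone, so this is worth checking when reproducing the issue.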

yichidev, May 11 '23 09:05