
Can we try to save \n while translating text?

Open CaramelHeaven opened this issue 3 years ago • 2 comments

For example, I want to translate

Lalalal

lalalal


Using deepl I get -> Lalalal lalalal, with the \n characters stripped out.

CaramelHeaven, Jan 19 '22 17:01

Currently, this package simply passes the text as is to the deepl API. It is the deepl API itself that drops the newline characters from the input:

from deepl.api import split_into_sentences

text = """Lalalal

lalalal"""

sentences = split_into_sentences(text)
print(sentences)

Output:

['Lalalal\n\nlalalal']

from deepl.api import generate_translation_request_data

sentences = split_into_sentences(text)
data = generate_translation_request_data(
    source_language="DE", target_language="EN", sentences=sentences
)

data["params"]["jobs"][0]["raw_en_sentence"]

Output:

'Lalalal\n\nlalalal'

import json
import requests

from deepl.api import headers
from deepl.settings import API_URL

response = requests.post(API_URL, data=json.dumps(data), headers=headers)
json_response = response.json()
json_response["result"]["translations"][0]["beams"][0]["postprocessed_sentence"]

Output:

'Lalalal lalalal'  # no newline characters

I'm not yet sure how the web version of deepl handles newlines in the text (I assume some JavaScript preprocessing). To support this, I would have to preprocess the text before requesting the translation, which could lead to unexpected or corrupted results. A change like this should be thoroughly tested, and I don't have time for that at the moment.

If anyone can point me to a safe solution, I'd be happy to look into it.
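One possible shape for such preprocessing, as a minimal sketch rather than a tested fix: split the text on runs of "\n", translate only the text pieces, and put the original newline runs back afterwards. The function name translate_preserving_newlines and the translate callable are hypothetical and not part of this package; translate stands in for whatever function performs the actual request. The sketch assumes plain "\n" line endings.

```python
import re
from typing import Callable


def translate_preserving_newlines(text: str, translate: Callable[[str], str]) -> str:
    """Translate text piece by piece, keeping every run of newlines as is.

    `translate` is a hypothetical callable (str -> str) supplied by the caller.
    """
    # The capture group makes re.split keep the newline runs in the result,
    # so the list alternates between text pieces and separators.
    parts = re.split(r"(\n+)", text)
    out = []
    for part in parts:
        if part == "" or part.startswith("\n"):
            # Separators and empty pieces pass through untouched.
            out.append(part)
        else:
            out.append(translate(part))
    return "".join(out)
```

Because each piece is translated in isolation, sentences that span a line break lose their shared context, which is one way this approach could degrade translation quality.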

ptrstn, Jan 20 '22 11:01

I've seen something like an ignore_tag argument in the deepl documentation; is it a viable solution? A workaround is to translate the text line by line in a for loop, using threading to speed things up while keeping the line structure intact.
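A rough sketch of that line-by-line workaround, using a thread pool from the standard library. The names translate_lines_threaded and translate are hypothetical; translate stands for any function that takes one line and returns its translation, and it is assumed to be safe to call from several threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_lines_threaded(
    text: str, translate: Callable[[str], str], max_workers: int = 4
) -> str:
    """Translate each non-blank line in a worker thread, keeping the line layout.

    `translate` is a hypothetical callable (str -> str) supplied by the caller.
    """
    lines = text.split("\n")
    # Only non-blank lines are sent for translation; blank lines stay in place.
    jobs = [i for i, line in enumerate(lines) if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as its inputs.
        results = list(pool.map(translate, [lines[i] for i in jobs]))
    for i, translated in zip(jobs, results):
        lines[i] = translated
    return "\n".join(lines)
```

Keeping max_workers small limits the number of simultaneous requests, which seems prudent when the backend is a remote service.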

OtwakO, Feb 20 '22 13:02