
"I don't know." as translation of empty lines. CS–EN

Open stranak opened this issue 1 year ago • 3 comments

See the screenshot. I haven't tested other directions, models, etc.

Screenshot 2023-09-29 at 19 46 53

stranak avatar Sep 29 '23 17:09 stranak

I confirm this bug. In the future, I plan to train the models so that an empty line is translated as an empty line (i.e. include a few such examples in the training data), but that won't affect the already trained models. So we also need to change the frontend and/or API server so that it does not query the backend with empty lines.

martinpopel avatar Sep 29 '23 20:09 martinpopel

@stranak @martinpopel Huh? What kind of newlines are those? \n (but probably only those) should be handled by the backend already. I would be surprised if the old FE did some sort of normalization, but maybe it did...

image

kosarko avatar Oct 04 '23 14:10 kosarko

It is the special kind, where there are some spaces on those lines :-) Sorry for not catching it myself. Those lines were not empty, each one had a single space on it.

I can now confirm that Martin's models do this for every direction I tried: they always generate something for "empty" lines (paragraphs) that contain only spaces. The "something" seems to be model-specific.

Screenshot 2023-10-04 at 16 21 36

stranak avatar Oct 04 '23 14:10 stranak
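
For illustration, the workaround discussed in this thread (not sending empty or whitespace-only lines to the backend) could look roughly like the sketch below. This is not the project's actual code: `translate_skipping_blank_lines` and the `translate_batch` callable are hypothetical names, and the sketch assumes the backend accepts a list of lines and returns one translation per line.

```python
from typing import Callable, List


def translate_skipping_blank_lines(
    text: str,
    translate_batch: Callable[[List[str]], List[str]],
) -> str:
    """Translate `text` line by line, never sending blank lines to the model.

    `translate_batch` is a hypothetical stand-in for the real backend call;
    it is assumed to return exactly one translation per input line.
    """
    lines = text.split("\n")

    # Indices of lines with real content; str.strip() also catches lines
    # that contain only spaces, which is the case reported in this issue.
    content_idx = [i for i, line in enumerate(lines) if line.strip()]

    translated = (
        translate_batch([lines[i] for i in content_idx]) if content_idx else []
    )
    if len(translated) != len(content_idx):
        raise ValueError("backend returned a different number of lines")

    # Blank and whitespace-only lines are passed through unchanged.
    result = list(lines)
    for i, translation in zip(content_idx, translated):
        result[i] = translation
    return "\n".join(result)
```

Passing whitespace-only lines through unchanged keeps the paragraph layout of the input intact; replacing them with truly empty lines would be an equally reasonable choice.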