Vertical Japanese doesn't translate well

Open Meerkov opened this issue 3 years ago • 3 comments

Though uncommon, some games use top-to-bottom (and right-to-left) written Japanese. I found that the system struggled to identify a block of text that was only 1 character wide.

And let's say there was a block of text that was 3 characters wide by 10 characters tall. It would then be translated as if it were 10 lines of 3 characters each, resulting in a garbled mess.

Furthermore, the translation would then try to fit this garbled mess of English into a vertical format that it doesn't really fit in...

I'm not sure how this should be fixed, as it's likely a failure on the Cloud side... but maybe there is a way to throw in a hack whenever a block of Japanese text is much taller than it is wide? Probably between the OCR step and the translate step?

Meerkov avatar Apr 18 '21 22:04 Meerkov
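
The "much taller than wide" check suggested above could be sketched roughly as follows. This is only an illustration, not UGT code: the `Box` type, the `looks_vertical` name, and the 2.0 ratio threshold are all assumptions made up for the example, and a real threshold would need tuning against actual game screenshots.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box of an OCR text block, in pixels."""
    x: float
    y: float
    w: float
    h: float


def looks_vertical(block: Box, min_ratio: float = 2.0) -> bool:
    """Return True when the block is at least min_ratio times taller than wide.

    min_ratio is an assumed tuning value, not something taken from UGT.
    """
    if block.w <= 0:
        return False
    return block.h / block.w >= min_ratio


# A block 3 characters wide by 10 tall (30x100 px with 10 px glyphs) is flagged;
# the same text laid out horizontally is not.
tall_block = Box(0, 0, 30, 100)
wide_block = Box(0, 0, 100, 30)
```

A check like this would sit between the OCR step and the translate step, so that only flagged blocks get special handling.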

Yeah, Google's stuff can't handle this yet.

It should be possible to manually piece characters together and send that for translation (we do have the position of each character on the screen, not just each "word" or whatever) but I currently don't have plans to add this.

SethRobinson avatar Apr 20 '21 21:04 SethRobinson
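
Piecing characters together by position, as described above, might look something like this. It is a minimal sketch under stated assumptions: the `Glyph` type and `vertical_reading_order` function are hypothetical names, glyphs are assumed to have axis-aligned boxes, and columns are grouped by comparing horizontal centers, which will not cope with skewed or unevenly spaced text.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical per-character OCR result: the text plus its box in pixels."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float
    h: float


def vertical_reading_order(glyphs):
    """Join glyphs column by column: columns right to left, each top to bottom."""
    columns = []  # each entry is a list of glyphs that share roughly the same x center
    # Visit glyphs from the rightmost center to the leftmost.
    for g in sorted(glyphs, key=lambda g: -(g.x + g.w / 2)):
        cx = g.x + g.w / 2
        if columns:
            ref = columns[-1][0]
            ref_cx = ref.x + ref.w / 2
            # Same column if the centers are within half a glyph width.
            if abs(cx - ref_cx) <= ref.w / 2:
                columns[-1].append(g)
                continue
        columns.append([g])
    return "".join(
        g.text for col in columns for g in sorted(col, key=lambda g: g.y)
    )


# Two columns of three characters, listed in the row-major order a
# horizontal reader would produce ("えあおいかう").
glyphs = [
    Glyph("え", 0, 0, 10, 10), Glyph("あ", 20, 0, 10, 10),
    Glyph("お", 0, 10, 10, 10), Glyph("い", 20, 10, 10, 10),
    Glyph("か", 0, 20, 10, 10), Glyph("う", 20, 20, 10, 10),
]
reordered = vertical_reading_order(glyphs)
```

The reordered string could then be sent for translation as a single line instead of as many short rows.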

Note, this has changed! Google does properly handle vertical text now. When testing with examples on https://w3c.github.io/i18n-drafts/articles/vertical-text/index.en, the OCR does fine.

And while it's possible to click and hear the correct Japanese (or English translation) being spoken, it's formatted horizontally (by UGT) so it's difficult to read.

Will have to give that some thought on formatting, but good to know the Google side can do it now.

SethRobinson avatar Jul 21 '21 10:07 SethRobinson

I found out you can apparently also set the model to "builtin/latest", which gives the newest features. According to a blog post I saw, vertical text detection has been available in that model for 2 years. It might be worth trying that setting to see if it makes a difference in the quality of the detection.

Meerkov avatar Jul 22 '21 21:07 Meerkov
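
For reference, a request body that selects the model might be built along these lines. This is a hedged sketch, not how UGT actually constructs its request: the `build_annotate_request` helper is hypothetical, and the choice of `DOCUMENT_TEXT_DETECTION` as the feature type is an assumption. The `model` field on a feature and the "builtin/latest" value are the setting mentioned above.

```python
import json


def build_annotate_request(image_b64: str, model: str = "builtin/latest") -> dict:
    """Build a Cloud Vision images:annotate request body as a plain dict.

    image_b64 is the base64-encoded screenshot. The feature type used here
    is an assumption made for this example.
    """
    return {
        "requests": [
            {
                "image": {"content": image_b64},
                "features": [
                    {"type": "DOCUMENT_TEXT_DETECTION", "model": model}
                ],
            }
        ]
    }


# Placeholder payload; a real call would pass actual base64 image data.
body = build_annotate_request("BASE64_IMAGE_DATA")
payload = json.dumps(body)
```

Comparing the OCR output for the same screenshot with and without the `model` field would show whether the setting changes anything for vertical text.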