
character encoding of non-English characters seems to be faulty

Open mariokoehler opened this issue 2 years ago • 3 comments

while using the "translator" mode to translate some English into German, i noticed that German special characters (ö, ä, ü, ß) show up "garbled" in the UI output.

for example, translating the following short text taken from a cnn.com article:

The State Emergency Service for the Kyiv region has told CNN that the number of people killed in a Russian drone strike that hit a residential building in the town of Rzhyshchiv Tuesday night has risen to seven.

results in this garbled output:

Der Staatliche Notfalldienst für die Region Kiew hat CNN mitgeteilt, dass der Anzahl an Menschen getötet durch einen russischen Drohnenangriff auf ein Wohngebäude in dem Ort Rzhyshchiw gestern Abend auf sieben angestiegen ist.

the correct display would be:

Der Staatliche Notfalldienst für die Region Kiew hat CNN mitgeteilt, dass der Anzahl an Menschen getötet durch einen russischen Drohnenangriff auf ein Wohngebäude in dem Ort Rzhyshchiw gestern Abend auf sieben angestiegen ist.

it's probably a character-encoding issue. the "garbled" presentation looks to me like a UTF-8 encoded string that is decoded not as UTF-8 but with some other encoding, maybe ANSI (i'm on a German Windows machine, and maybe the UI falls back to some system default if it is not explicitly told what encoding to use?)
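To illustrate the suspected mechanism, here is a minimal Python sketch. It assumes the "other encoding" is Windows-1252 (`cp1252`), the usual ANSI code page on a German Windows machine; that choice is a guess, not something confirmed for Alpaca-Turbo. The helper names `garble` and `repair` are hypothetical and exist only for this illustration.

```python
def garble(text: str, wrong_encoding: str = "cp1252") -> str:
    """Simulate the bug: UTF-8 bytes decoded with the wrong code page."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo the damage by reversing the wrong decode, then decoding as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    original = "für die Region Kiew, Wohngebäude, getötet"
    broken = garble(original)
    print(broken)  # each umlaut becomes two characters, e.g. "ü" turns into "Ã¼"
    print(repair(broken) == original)
```

If the output in the UI shows the same two-character pattern for every umlaut, that would support the theory that UTF-8 bytes are being read with a single-byte system code page.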

mariokoehler avatar Mar 22 '23 14:03 mariokoehler

From what I've seen on other discussion boards, especially for dalai, the characters come from the language package you have installed on your computer. If you look at the pattern in the translation, all the errors occur where there are special characters.

I had a similar issue, but mine occurs with the default English setup, which also outputs strange characters. It's not closely related to this topic.

I'm sure you already have the German language pack installed on your computer. You can try to make sure it's also enabled in the browser, if that's even an option. I'm not too sure how character selection works, but I'd assume those packages would come from the browser or the system.

Lolagatorade avatar Mar 23 '23 21:03 Lolagatorade

Added Unicode support. Can you please verify whether that fixes it?

ViperX7 avatar Mar 24 '23 13:03 ViperX7

Cyrillic symbols are okay now, but generation seems to get stuck on the first word. CPU load shows that win.exe is working, but python webui.py does not show any more symbols. [image]

It happens about 90% of the time when I use Cyrillic.
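One possible cause (an assumption, not confirmed from the Alpaca-Turbo source) is that the UI decodes the model's output chunk by chunk. Every Cyrillic letter takes two bytes in UTF-8, so a chunk boundary can easily fall in the middle of a character, and a plain `bytes.decode` then fails. The sketch below uses Python's standard incremental decoder, which buffers incomplete characters; `decode_stream` is a hypothetical helper name.

```python
import codecs
from typing import Iterable, Iterator


def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 byte chunks, tolerating characters split across chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)  # incomplete trailing bytes are buffered
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # raises if bytes are left dangling
    if tail:
        yield tail


if __name__ == "__main__":
    data = "Привет".encode("utf-8")
    chunks = [data[:3], data[3:]]  # the split lands inside the second letter
    print("".join(decode_stream(chunks)))
    try:
        chunks[0].decode("utf-8")  # naive per-chunk decoding
    except UnicodeDecodeError as exc:
        print("naive decode failed:", exc.reason)
```

If the reader thread dies on such an exception while the model process keeps running, the symptom would match what is described here: CPU load continues but no further text appears.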

DaveScream avatar Mar 26 '23 16:03 DaveScream