argos-translate
Doesn't understand Chinese
Hello.
How are you?
I'm too/very tired
Yes, you are right. idiot
Yes, the Chinese translations aren't very good. I think the root cause is that there isn't very much data available for Chinese.
Looks like there's a lot of data for Chinese: https://github.com/Helsinki-NLP/Tatoeba-Challenge/blob/master/data/README-v2021-08-07.md
Can someone train an argos package with it, please? I really need good Chinese to English translation.
Maybe you can help us train a better Chinese model, @rafael3382. See https://github.com/argosopentech/argos-train
The Chinese model was updated recently; hopefully the new one is better.
https://community.libretranslate.com/t/improving-chinese-translations/364/
If we can find more data we could retrain again too.
Still bad. How many GPUs would I need if I wanted to train it myself?
https://huggingface.co/Helsinki-NLP/opus-mt-zh-en does a pretty good job. I wonder if we can use that.
```python
# pip install transformers
# pip install torch
# pip install sentencepiece
# pip install sacremoses
from transformers import MarianMTModel, MarianTokenizer


def chinese_to_english(text):
    # Load the pretrained model and its tokenizer (downloaded on first use)
    model_name = 'Helsinki-NLP/opus-mt-zh-en'
    model = MarianMTModel.from_pretrained(model_name)
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    # Tokenize the text
    tokenized_text = tokenizer.encode(text, return_tensors="pt")
    # Translate the tokenized text
    translated_tokens = model.generate(tokenized_text)
    # Decode the translated tokens to a string
    translated_text = tokenizer.decode(translated_tokens[0], skip_special_tokens=True)
    return translated_text


if __name__ == "__main__":
    chinese_text = input("Enter Chinese text: ")
    translated_text = chinese_to_english(chinese_text)
    print(f"Translated Text: {translated_text}")
```
New Chinese simplified/traditional models (from OPUS-MT) are up:
https://libretranslate.com/?source=zh&target=en&q=%E4%BD%A0%E5%A5%BD
How do they score?
Link to models thread: https://community.libretranslate.com/t/opus-mt-language-models-port-thread/757/2
Thanks @pierotofy! After reinstalling, not only zh but also pl is now working great :)
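For anyone else who wants to pick up the new models, here is a rough sketch of reinstalling a language pair through the argostranslate Python API. The helper names `pick_package` and `reinstall_and_translate` are made up for illustration. I'm also assuming that installing on top of an existing package replaces it; depending on your version you may need to uninstall the old package first.

```python
def pick_package(packages, from_code, to_code):
    """Return the first package matching the language pair, or None.

    Hypothetical helper: works on any objects exposing `from_code` and
    `to_code` attributes, as argostranslate package objects do.
    """
    return next(
        (p for p in packages if p.from_code == from_code and p.to_code == to_code),
        None,
    )


def reinstall_and_translate(text, from_code="zh", to_code="en"):
    """Download and install the package for a language pair, then translate."""
    # Imported lazily so pick_package can be used without argostranslate installed.
    import argostranslate.package
    import argostranslate.translate

    # Refresh the remote package index so the newest models are listed.
    argostranslate.package.update_package_index()
    available = argostranslate.package.get_available_packages()

    package = pick_package(available, from_code, to_code)
    if package is None:
        raise LookupError(f"No package found for {from_code} -> {to_code}")

    # download() fetches the model file and returns its local path.
    argostranslate.package.install_from_path(package.download())
    return argostranslate.translate.translate(text, from_code, to_code)


# Example call (needs network access and `pip install argostranslate`):
# print(reinstall_and_translate("你好"))
```

The same call with `from_code="pl"` should cover the Polish model mentioned above.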