Potential for Controversy in Generation

Open HaShaWB opened this issue 2 months ago • 2 comments

It appears that Llama 3 may not sufficiently understand East Asian cultures. Notably, when a prompt mentions 'Korean', the model occasionally responds with Japanese or Chinese greetings. Furthermore, when asked to generate responses in Korean, the outputs sometimes mix in Chinese or Japanese text, which could lead to controversy.
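
One way to make this kind of report reproducible is to count which scripts appear in a generated reply. The snippet below is only a rough sketch: the function names `script_of`, `script_counts`, and `has_non_hangul_cjk` are hypothetical helpers written for illustration, not part of any Llama tooling, and the check relies solely on Unicode code point ranges. Note that Han characters (Hanja) do occasionally appear in legitimate Korean text, so a hit is a signal worth inspecting rather than proof of an error.

```python
# Rough sketch: classify characters by Unicode block to spot script mixing.
# All names here are hypothetical helpers, not an official API.

def script_of(ch):
    """Return a coarse script label for one character, or None if not CJK-related."""
    cp = ord(ch)
    # Hangul Jamo, Hangul Compatibility Jamo, Hangul Syllables
    if 0x1100 <= cp <= 0x11FF or 0x3130 <= cp <= 0x318F or 0xAC00 <= cp <= 0xD7A3:
        return "hangul"
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    # CJK Unified Ideographs (basic block only; extensions are ignored here)
    if 0x4E00 <= cp <= 0x9FFF:
        return "han"
    return None


def script_counts(text):
    """Count characters per script label, skipping Latin, digits, punctuation, etc."""
    counts = {}
    for ch in text:
        label = script_of(ch)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts


def has_non_hangul_cjk(text):
    """True if the text contains kana or Han characters alongside or instead of Hangul."""
    counts = script_counts(text)
    return any(label in counts for label in ("hiragana", "katakana", "han"))


if __name__ == "__main__":
    for reply in ["안녕하세요", "안녕 こんにちは", "你好"]:
        print(reply, script_counts(reply), has_non_hangul_cjk(reply))
```

Running a check like this over a batch of replies to Korean prompts would give a count of how often the mixing occurs, which is easier to act on than individual examples.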

HaShaWB avatar Apr 22 '24 16:04 HaShaWB

Exactly why we have to pretrain and fine-tune again!

thusinh1969 avatar Apr 23 '24 06:04 thusinh1969

Thank you for pointing this out! Although the tokenizer has a multilingual vocabulary, Llama 3 does not currently support multilingual inference; the models are officially supported for inference in English. As @thusinh1969 mentions, fine-tuning is an option here. We have an example using Llama 2 here: https://github.com/meta-llama/llama-recipes/tree/main/recipes/multilingual

subramen avatar Apr 24 '24 17:04 subramen