simple-llm-finetuner

Finetuning in unsupported language

Open jumasheff opened this issue 1 year ago • 2 comments

My language is not among the 20 languages the original model was trained on. Is it possible to finetune LLaMA with a dataset in a language that was not included in the base model?

jumasheff avatar Mar 23 '23 08:03 jumasheff

I tried finetuning GPT-2, which apparently was trained only on English, with German input text, and the results were bad. I guess it takes a lot more training to get a model to generate a new language.

In short: I think you probably need another model.

hmrc87 avatar Mar 23 '23 09:03 hmrc87

You can absolutely do this. Tell me how it goes!

lxe avatar Mar 25 '23 05:03 lxe