Languages
Can you mention which languages are covered in this dataset? According to arXiv:2302.13971v1, LLaMA only covers these languages: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk. Is it possible to add some low-resource languages, such as Indonesian? Thanks.
Same question.
That's correct: we cover the same set of languages, and they come from the Wikipedia slice of the dataset. We will add support for more languages in the future, including low-resource ones.
Does the dataset currently contain Chinese resources?
No, currently we only have the following languages in the dataset: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk. We are planning to add support for more languages in the future.
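For anyone who wants to check a language code against this list programmatically, here is a minimal sketch. The names `COVERED_LANGS` and `is_covered` are hypothetical and not part of the dataset's tooling; the codes are simply copied from the list above.

```python
# Language codes copied from the list above.
# COVERED_LANGS and is_covered are illustrative names, not a dataset API.
COVERED_LANGS = frozenset({
    "bg", "ca", "cs", "da", "de", "en", "es", "fr", "hr", "hu",
    "it", "nl", "pl", "pt", "ro", "ru", "sl", "sr", "sv", "uk",
})


def is_covered(lang_code: str) -> bool:
    """Return True if the two-letter code appears in the list above."""
    return lang_code.strip().lower() in COVERED_LANGS


if __name__ == "__main__":
    # "id" is Indonesian and "zh" is Chinese, the two languages asked about.
    for code in ("en", "id", "zh"):
        print(code, is_covered(code))
```

Under this list, both Indonesian (`id`) and Chinese (`zh`) come back as not covered, which matches the answers in this thread.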