Resources for Kurdish

Open ZJaume opened this issue 1 year ago • 2 comments

1M tokens / 156k sentences in several varieties of Central Kurdish: https://github.com/sinaahmadi/CORDI

and Kurdish LID models and datasets: https://github.com/sinaahmadi/KurdishLID

ZJaume, Nov 18 '24 10:11

More: https://github.com/sinaahmadi/awesome-kurdish

The KTC corpus seems useful, and maybe others listed there are too.

ZJaume, Feb 21 '25 17:02

Cheers Jaume, I'll look to fold it into the next release.

laurieburchell, Feb 23 '25 09:02