TamilNLP

Add Kaggle and Transformers model

Open • mapmeld opened this issue 5 years ago • 0 comments

I found multiple Tamil classification datasets at https://www.kaggle.com/sudalairajkumar/tamil-nlp

I trained a small Transformers model on CommonCrawl and the latest (1 July) Tamil Wikipedia. On the model page (https://huggingface.co/monsoon-nlp/tamillion) I link to a notebook showing examples where it matches or outperforms Multilingual BERT. It could be applied to other tasks with SimpleTransformers, but few datasets exist for training them (for example, question answering).
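
For anyone who wants to try the model on one of the Kaggle classification datasets, here is a minimal sketch of loading it with the Hugging Face `transformers` Auto classes. It is an illustration, not the code from the linked notebook: the helper names `load_classifier` and `build_label_map` are hypothetical, and it assumes the `monsoon-nlp/tamillion` checkpoint loads through `AutoModelForSequenceClassification` (the classification head would be freshly initialized and still needs fine-tuning).

```python
# Model ID taken from the Hugging Face URL in this issue.
MODEL_ID = "monsoon-nlp/tamillion"


def build_label_map(labels):
    """Map each distinct label string to an integer ID (sorted for stability)."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def load_classifier(num_labels, model_id=MODEL_ID):
    """Download the tokenizer and model with an untrained classification head.

    Hypothetical helper; requires the `transformers` package and network access.
    """
    # Imported lazily so the label helper works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model
```

Usage would look like `label_map = build_label_map(train_labels)` followed by `tokenizer, model = load_classifier(len(label_map))`, where `train_labels` stands in for the label column of whichever dataset is used.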

mapmeld, Jul 30 '20 13:07