Gaurav Arora comments

Results: 35 comments by Gaurav Arora

identify languages doesn't work with Telugu in v0.9

@nitkannen Yes sure, this is still unresolved and it'll be great if you can contribute!

needs inltk support for Gujarati NER in Windows

The plan is to provide NER for all the languages in iNLTK, but that work is still in its very early stages. I'm still working on collecting a dataset. So, it's very difficult...

Urdu, Kashmiri and Maithili Support

Thanks for the initiative, shout out if you need any help!

Urdu, Kashmiri and Maithili Support

It's advisable not to go beyond a vocab length of 30k. Are you talking about GPU memory? I have a GTX 1080 Ti with 11 GB memory on which I trained all...

Urdu, Kashmiri and Maithili Support

Good to know that it worked. Yes, 350 articles seems too few. Try to get data from somewhere else, e.g. news articles, government websites, etc.

Urdu, Kashmiri and Maithili Support

@anuragshas Thanks for the contribution! Would you like to raise a PR to add your model to iNLTK? (I can help you with the process.)

Urdu, Kashmiri and Maithili Support

@anuragshas Don't worry about all_languages_identifying_model. I will be fine-tuning it to add Tamil to iNLTK, and I will tune it for Urdu as well. As far as LM...

Urdu, Kashmiri and Maithili Support

@anuragshas You've been working on an LM for Maithili as well, right? Can you share the Wikipedia dataset you prepared for it? Because tuning the language-classifier model again for Maithili will...

Urdu, Kashmiri and Maithili Support

@anuragshas No issues! Good luck :).

Urdu, Kashmiri and Maithili Support

@anuragshas Once you've imported the Tokenizer, you need to load the pretrained model that you saved last time, and then export it. That is, just to be very...