Gaurav Arora

35 comments by Gaurav Arora

@nitkannen Yes sure, this is still unresolved and it'll be great if you can contribute!

The plan is to provide NER for all the languages in iNLTK, but that work is still in its very early stages. I'm still working on collecting the dataset. So, it's very difficult...

Thanks for the initiative, shout out if you need any help!

It's advisable not to go beyond a vocab size of 30k. Are you talking about GPU memory? I have a GTX 1080 Ti with 11 GB of memory, on which I trained all...

Good to know that it worked. Yes, 350 articles seem too few. See if you can get data from somewhere else: news articles, government websites, etc.

@anuragshas Thanks for the contribution! Would you like to raise a PR to add your model to iNLTK? (I can help you with the process.)

@anuragshas Don't worry about all_languages_identifying_model. I will be fine-tuning it to add the Tamil language to iNLTK, and I will tune it for Urdu as well. As far as LM...

@anuragshas You've been working on an LM for Maithili as well, right? Can you share the Wikipedia dataset you prepared for it? Because tuning the language-classifier model again for Maithili will...

@anuragshas No issues! Good luck :).

@anuragshas Once you've imported the Tokenizer, you need to load the pretrained model that you saved last time, and then export it. That is, just to be very...