sanskrit
Interest in implementing your models
Dear @OliverHellwig , I am one of the maintainers of the CLTK (https://github.com/cltk/cltk), which is an NLP framework for dead languages. Recently @sainimohit23 alerted us to your sandhi splitting project -- well done, this is an important advancement for Sanskrit!
There is no open source license in this repo -- would you consider sharing your code and models with the CLTK? If you would like to implement it yourself, or to mentor a student in implementing it, you are most welcome to join us!
Please reach out by email if you want to talk 1:1 (my address is on my GH homepage). Thank you!
@kylepjohnson @OliverHellwig I have started working on it. Right now I'm experimenting and doing some testing on the code.
I would also like to ask @OliverHellwig to help me out if I ever get stuck. It would be a great learning experience for a student like me who is interested in NLP.
@sainimohit23 any luck so far?
@gasyoun I implemented the whole pipeline the very next day after I commented on this issue. Here's the code: https://github.com/sainimohit23/cltk/tree/master/cltk/stem/sanskrit/code
@gasyoun @sainimohit23 Just reaching out to let you know that a CLTK person has alerted us to the CoNLL treebanks here. I don't recall them being available when I last checked, which according to this ticket was two years ago!
We are thinking of using these to make models with spaCy. If you or anyone you know would be interested in helping, you're welcome to join us. I have only done a small amount of Sanskrit, and we'll need knowledgeable people to evaluate the models.