Wannaphong Phatthiyaphaibun
## Description
I tried to tokenize the text `"ทดสอบตัดคำภาษาไทยจอก์น"`.

## Expected results
`['ทดสอบ', 'ตัด', 'คำ', 'ภาษาไทย', 'จอก์', 'น']`

## Current results
`['ทดสอบ', 'ตัด', 'คำ', 'ภาษาไทย', 'จอก', '์น']`

## Steps to...
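The difference between the expected and current results above is that one token begins with a combining mark (thanthakhat, U+0E4C) that belongs to the previous consonant. The following is a minimal sketch of a check for that symptom, using only the standard library. It is not part of PyThaiNLP; the function names `starts_with_combining_mark` and `misplaced_tokens` are hypothetical, and the check assumes that a token starting with a Unicode nonspacing mark (category `Mn`) has been split in the wrong place.

```python
import unicodedata


def starts_with_combining_mark(token: str) -> bool:
    """Return True if the first character is a nonspacing mark (category Mn)."""
    return bool(token) and unicodedata.category(token[0]) == "Mn"


def misplaced_tokens(tokens):
    """Return the tokens that begin with a combining mark (hypothetical helper)."""
    return [t for t in tokens if starts_with_combining_mark(t)]


# Token lists copied from the issue text above.
current = ['ทดสอบ', 'ตัด', 'คำ', 'ภาษาไทย', 'จอก', '์น']
expected = ['ทดสอบ', 'ตัด', 'คำ', 'ภาษาไทย', 'จอก์', 'น']

print(misplaced_tokens(current))   # the token that starts with U+0E4C
print(misplaced_tokens(expected))  # no tokens flagged
```

A check like this could serve as a regression test for a tokenizer's output, independent of which tokenization engine produced it.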
In the future, I think we should port PyThaiNLP models to ONNX. ONNX is a standard model format that other frameworks can read, and it can use OS framework to...
`python-crfsuite` has fixed the Python 3.10 problem, but a new version has not been released to PyPI. https://github.com/scrapinghub/python-crfsuite/issues/139 I think we should change all python-crfsuite models to PyTorch models.
## Trie
- [ ] Add trie for OOV words - @korakot

## Dependency parsing
- [x] Add dependency parsing to PyThaiNLP #606 [WIP]
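For the trie item in the list above, here is a minimal sketch of a dictionary trie to which out-of-vocabulary (OOV) words can be added at runtime. It only illustrates the data structure and is not PyThaiNLP's implementation; the class name `Trie`, its methods, and the `_END` marker are assumptions made for this example.

```python
_END = ""  # hypothetical end-of-word marker; cannot collide with a one-character key


class Trie:
    """A minimal character trie built from nested dictionaries."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word: str) -> None:
        """Insert a word, for example an OOV word supplied by the user."""
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True

    def __contains__(self, word: str) -> bool:
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return _END in node

    def prefixes(self, text: str):
        """Return every stored word that is a prefix of text, shortest first."""
        found = []
        node = self.root
        for i, ch in enumerate(text):
            if ch not in node:
                break
            node = node[ch]
            if _END in node:
                found.append(text[: i + 1])
        return found


trie = Trie(["ทดสอบ", "ตัด", "คำ"])
trie.add("ภาษาไทย")  # add a word that was not in the initial vocabulary
print("ภาษาไทย" in trie)
print(trie.prefixes("ตัดคำ"))
```

The `prefixes` method is the operation a dictionary-based tokenizer typically needs: at each position in the text, it lists the candidate words that could start there.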
Now that NECTEC has released the Blackboard Treebank, we want to add dependency parsing based on the Blackboard Treebank to PyThaiNLP 3.0. Facebook: https://web.facebook.com/dancearmy/posts/10158423653343284 Bitbucket: https://bitbucket.org/kaamanita/blackboard-treebank/
## Detailed description
At [Thai NLP Meetup #7](https://web.facebook.com/AIResearch.in.th/videos/1474022956330608), I got feedback from users. They want builder tools in PyThaiNLP so that they can use their own models in PyThaiNLP.

## Context
benefit:...
We want to write a new IPA implementation for `pythainlp.transliterate.transliterate`.
- This is to replace [epitran](https://github.com/dmort27/epitran/) and remove the dependency on [marisa-trie](https://github.com/pytries/marisa-trie).
- In some situations, marisa-trie has a portability issue with...
**Describe the bug** Text normalization does not work in some cases: `'เค้้้าเดินไปสนามหญา้หนา้บา้น'` outputs `'เค้้้าเดินไปสนามหญ้าหน้าบ้าน'`, and `'พ ุ่มดอกไม้ในสนามหญา้หนา้บา้น'` outputs `'พ ุ่มดอกไม้ในสนามหญ้าหน้าบ้าน'`. **To Reproduce** Steps to reproduce the behavior: 1. `from...
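In the first example above, the repeated tone mark in the first word survives normalization. The sketch below shows one way such repeats could be collapsed with a regular expression. It is an illustration only, not PyThaiNLP's normalization code; the function name `collapse_repeated_tone_marks` is hypothetical, and the sketch assumes that only the four Thai tone marks (U+0E48 to U+0E4B) need deduplicating. It does not address the stray space in the second example.

```python
import re

# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa (U+0E48..U+0E4B).
# The backreference \1 matches only repeats of the *same* tone mark.
_REPEATED_TONE = re.compile(r"([\u0e48-\u0e4b])\1+")


def collapse_repeated_tone_marks(text: str) -> str:
    """Replace a run of identical tone marks with a single mark."""
    return _REPEATED_TONE.sub(r"\1", text)


# First word of the first example: three mai tho (U+0E49) in a row.
word = "\u0e40\u0e04\u0e49\u0e49\u0e49\u0e32"
print(collapse_repeated_tone_marks(word))
```

Collapsing only identical marks is the conservative choice here; deciding what to do with a run of different tone marks would need a separate rule.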
If you are interested in supporting Thai Natural Language Processing (ThaiNLP), we have a backlog.
- Build Open Source Text to Speech: You can add the Thai language to open...