Thomas Roten
Hello @glowinthedark! So, [`hanzi.to_pinyin()`](https://dragonmapper.readthedocs.io/en/latest/api.html?highlight=to_pinyin#dragonmapper.hanzi.to_pinyin)'s `delimiter` argument is referring to the Chinese character source string. It's used to partition the string by words rather than characters, allowing for a more accurate...
@ogmartins Great point! Feel free to open a pull request to add this transcription to the CSV file. Thanks!
@TTWNO This is a neat idea! Thanks for the pull request. I'm at a conference right now, but I'll test/review it as soon as I can. Thanks!
@TTWNO Looking now :)
```
$ python
>>> import pynlpir
>>> pynlpir.open()
>>> s = u'欢迎科研人员、技术工程师、企事业单位与个人参与NLPIR平台的建设工作。'
>>> tokens = pynlpir.segment(s)
>>> for word, pos in tokens:
...     print(word)
欢迎
科研
人员
、
技术
工程师...
```
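[NLPIR says](https://github.com/NLPIR-team/NLPIR/blob/master/NLPIR%20SDK/NLPIR-ICTCLAS/doc/NLPIR-ICTCLAS%E5%88%86%E8%AF%8D%E7%B3%BB%E7%BB%9F%E5%BC%80%E5%8F%91%E6%89%8B%E5%86%8C2017%E7%89%88%20.pdf):
```
Additional notes on the user dictionary:
1. If a user word contains a space, enclose it in []; for example: [Bill Clinton] nrf
2. If a user word should be output as a keyword of the text, its part-of-speech tag must be key; for example: 科学发展观 key
3. If a word is a person's name and should also be output as a keyword, tag it key_nr; for example: 钟南山 key_nr
4. If a word is a place name and should also be output as a keyword, tag it key_ns; for example: 钓鱼岛 key_ns
5. If a word is an organization name and should also be output as a keyword, tag it key_nr; for example: 国安 委 key_nt
```
Example: -...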
Hello! It is probably related to `sudo`. If you used `sudo` to install pynlpir, then you might need to use `sudo python` when running your script. Or, you could update...
Hello @stickjitb! Unfortunately, NLPIR (the library we use to segment text) uses `/` as the separator between tokens that it segments. Here is a typical example that NLPIR returns for...
We could get fancy in how we process the tokens from NLPIR by using look-ahead assertions in a regular expression (like only splitting on `/` if it has `[a-z]` immediately...
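The look-ahead idea could be sketched roughly as follows. This is only an illustration, not pynlpir's actual parsing code: the pattern, the `split_token` helper, and the sample strings are all hypothetical, and it assumes each token arrives as a `word/tag` string whose tag starts with a lowercase letter.

```python
import re

# Hypothetical pattern: match a "/" only when everything after it looks like
# a part-of-speech tag (a lowercase letter, then letters, digits, or
# underscores) running to the end of the token. The "/" itself is consumed;
# the look-ahead only inspects what follows.
TAG_SEPARATOR = re.compile(r"/(?=[a-z][a-z0-9_]*$)")


def split_token(token):
    """Split a 'word/tag' string into (word, tag).

    Returns (token, None) when no tag-like suffix is found.
    """
    parts = TAG_SEPARATOR.split(token, maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return token, None


# Made-up sample tokens, not real NLPIR output.
print(split_token("欢迎/v"))   # a plain word with a tag
print(split_token("1/2/m"))    # the word itself contains a "/"
print(split_token("no-tag"))   # nothing to split on
```

One limitation of this sketch: a word that itself ends in `/` plus lowercase letters and carries no tag is indistinguishable from a tagged word, so it would be split incorrectly.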
@linshizhen, you don't need to download the latest nlpir.user. To update:
```
$ pip install pynlpir
$ pynlpir update
$ python3
>>> import pynlpir
>>> pynlpir.open()
>>> pynlpir.segment('这是一个句子', pos_english=False)
[('这', '代词'), ('是', '动词'), ('一个', '数词'), ('句子', '名词')]...
```