lemonhu comments

Results: 38 comments of lemonhu

Id 'xxx' is defined more than once in group 'global id space'

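Thanks for the feedback. Duplicate names are already handled: see [build_csv.py#L34](https://github.com/lemonhu/stock-knowledge-graph/blob/master/build_csv.py#L34), which takes the MD5 of the director's `name`, `gender`, and `age` together as the unique ID, so that different people with the same name get different IDs.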
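A minimal sketch of that idea, for readers who want to try it in isolation. The function name `person_id`, the `|` separator, and the sample values are assumptions made for illustration; the exact way `build_csv.py` joins the three fields may differ.

```python
import hashlib


def person_id(name: str, gender: str, age: str) -> str:
    """Hypothetical helper: derive a stable ID from name, gender, and age.

    The "|" separator is an assumption; it only keeps the fields from
    running together before hashing.
    """
    key = f"{name}|{gender}|{age}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Same name, different age: two distinct IDs (made-up sample values).
id_a = person_id("姚波", "男", "46")
id_b = person_id("姚波", "男", "52")

# Identical name, gender, and age: the same ID, so the rows collide.
id_c = person_id("姚波", "男", "46")
```

Two rows that agree on all three fields hash to the same ID, which is the situation in which the importer reports an ID defined more than once.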

Id 'xxx' is defined more than once in group 'global id space'

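Under the MD5-based rule for determining entity uniqueness, the two `姚波` entries here should be the same person and should not produce duplicate IDs (in practice, a duplicate would have no effect anyway).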

Id 'xxx' is defined more than once in group 'global id space'

As for the second problem you mentioned, the program logic should not allow it to occur. It can be traced back to the `code` column of the `executive_prep.csv` file; see [#L23](https://github.com/lemonhu/stock-knowledge-graph/blob/master/extract.py#L23) and [#L24](https://github.com/lemonhu/stock-knowledge-graph/blob/master/extract.py#L24) in `extract.py`. I also did not see this bug when testing locally.

Could you consider integrating a third-party comment system?

Thanks for the reply. I will try again when I have time.

Comment issue on line 55 of open-entity-relation-extraction/code/core/entity_combine.py

OK, thanks for your suggestion.

Why can't triples be extracted from most of the simple sentences I provide, such as the example in the image?

Currently only the 7 DSNF paradigms are guaranteed to work, and even that assumes the dependency parse of the sentence is correct.

Why can't triples be extracted from most of the simple sentences I provide, such as the example in the image?

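Adding a user dictionary only helps with the word segmentation step. The set of candidate entity tags used to build relations is {'ns', 'ni', 'nh', 'nz', 'j'}; see [entity_combine.py](https://github.com/lemonhu/open-entity-relation-extraction/blob/master/code/core/entity_combine.py#L84).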
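To see what that restriction means in practice, here is a small sketch that keeps only the words whose part-of-speech tag is in the candidate set. The function `candidate_entities` and the hand-tagged sample sentence are hypothetical and are not taken from `entity_combine.py`; real tags would come from the LTP part-of-speech tagger.

```python
# Tags copied from the comment above; in the LTP tagset these are
# place names (ns), organizations (ni), person names (nh),
# other proper nouns (nz), and abbreviations (j).
CANDIDATE_POSTAGS = {"ns", "ni", "nh", "nz", "j"}


def candidate_entities(words, postags):
    """Hypothetical helper: keep words whose POS tag marks a candidate entity."""
    return [word for word, tag in zip(words, postags) if tag in CANDIDATE_POSTAGS]


# Hand-tagged sample (assumed tags, for illustration only).
words = ["马云", "创办", "了", "阿里巴巴"]
postags = ["nh", "v", "u", "ni"]
entities = candidate_entities(words, postags)

# The same sentence with the names tagged as common nouns yields nothing.
no_entities = candidate_entities(words, ["n", "v", "u", "n"])
```

A word that the segmenter splits correctly but the tagger labels as a plain noun (`n`) never enters the candidate set, so no triple can be built around it. That is why a user dictionary alone may not be enough.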

Why can't triples be extracted from most of the simple sentences I provide, such as the example in the image?

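The contribution of this work is a set of extraction paradigms defined on top of dependency parsing. I think it can be improved in two ways:

1. Define more extraction paradigms. This requires fairly deep linguistic knowledge, and the rules will probably become more complex as a result.
2. In practical tests, dependency parsing of long sentences is still difficult, so one could work on the inaccurate parsing of long sentences.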

asking

The problem is that the LTP models are not loaded. They can be downloaded from http://ltp.ai/download.html; select `ltp_data_v3.4.0.zip`.

asking

@Max1110 Thank you for your interest. The question is too general. Could you post the code snippet that has the problem?