DC-Match

Hello, I would like to ask how the keywords are obtained by introducing an external knowledge base. How exactly is this implemented?

Open zyx214 opened this issue 2 years ago • 5 comments

zyx214 avatar Sep 09 '22 13:09 zyx214

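The procedure is described in Section 4.2 of the paper.

  1. First, use nltk to tag all verbs, nouns, adjectives, and other words or phrases that may carry real meaning (matching the longest word or phrase).
  2. Then, match these words and phrases against the concepts in the knowledge base; any that appear in the knowledge base are treated as keywords.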
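
The two steps above can be sketched roughly as follows. This is not the authors' preprocessing code, only a minimal illustration under stated assumptions: the function name `extract_keywords`, the `max_len` limit, the choice of Penn Treebank tag prefixes, and the toy `kb` set standing in for the knowledge base are all hypothetical. The input is a list of `(word, tag)` pairs of the kind returned by `nltk.pos_tag(nltk.word_tokenize(text))`; the example is tagged by hand here so the snippet runs without downloading NLTK data.

```python
# Hypothetical sketch, not the DC-Match authors' code.
# Assumption: content words are those whose Penn Treebank tag starts with
# NN (noun), VB (verb), or JJ (adjective).
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ")


def extract_keywords(tagged_tokens, knowledge_base, max_len=4):
    """Return knowledge-base phrases found by greedy longest match.

    tagged_tokens: list of (word, pos_tag) pairs, e.g. the output of
        nltk.pos_tag(nltk.word_tokenize(text)).
    knowledge_base: set of lowercase concept strings (a toy stand-in here).
    max_len: longest phrase length, in tokens, to try (an assumed limit).
    """
    keywords = []
    i = 0
    n = len(tagged_tokens)
    while i < n:
        matched = 0
        # Try the longest candidate span first, then shrink it.
        for length in range(min(max_len, n - i), 0, -1):
            span = tagged_tokens[i:i + length]
            # Every token in the span must be a content word.
            if not all(tag.startswith(CONTENT_TAG_PREFIXES) for _, tag in span):
                continue
            phrase = " ".join(word.lower() for word, _ in span)
            if phrase in knowledge_base:
                keywords.append(phrase)
                matched = length
                break
        # Skip past a matched phrase, otherwise advance by one token.
        i += matched if matched else 1
    return keywords


# Hand-tagged example sentence: "How do I learn machine learning quickly?"
tagged = [
    ("How", "WRB"), ("do", "VBP"), ("I", "PRP"), ("learn", "VB"),
    ("machine", "NN"), ("learning", "NN"), ("quickly", "RB"), ("?", "."),
]
kb = {"machine learning", "machine", "learn"}  # toy knowledge base
keywords = extract_keywords(tagged, kb)
print(keywords)
```

Because the longest span is tried first, "machine learning" is matched as one phrase instead of the shorter concept "machine". Any concept list, such as one exported from a real knowledge base, could be substituted for `kb`.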

RowitZou avatar Sep 11 '22 05:09 RowitZou

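Hello, I have the same question. Which part of the code implements the classification of keywords using the external knowledge base? I read through all of the .py files and could not find it. Perhaps I missed something; could you please point me in the right direction?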

wjzqbb1234 avatar Oct 06 '22 03:10 wjzqbb1234

Hello everyone! Thanks, @RowitZou, for your amazing work and the well-organized code published in this repository. Please forgive me, though: as of the current version (2022/10/8), I cannot find the code that disentangles keywords. I also searched the repository for "nltk", as mentioned above, but found nothing. It would be really nice if you could release the corresponding code.

BruceStayHumble avatar Oct 08 '22 15:10 BruceStayHumble

Hi there. I am sorry, but we can no longer find the keyword processing code. The processing was a rule-based approach, run as a preprocessing step before training. The processing results are included in our released datasets, as you can see. Any other way of obtaining the keywords, e.g., human annotation or automatic extraction, would also work for this application.

RowitZou avatar Nov 17 '22 10:11 RowitZou

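The procedure is described in Section 4.2 of the paper.

  1. First, use nltk to tag all verbs, nouns, adjectives, and other words or phrases that may carry real meaning (matching the longest word or phrase).
  2. Then, match these words and phrases against the concepts in the knowledge base; any that appear in the knowledge base are treated as keywords.

May I ask where this knowledge base can be found? Is it custom-built?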

Tylerjoe avatar Mar 06 '23 03:03 Tylerjoe