HGAT
How are the edge weights between topic nodes and document nodes constructed?
Are you talking about edge features? I don't think edge features are used here.
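But doesn't graph convolution need both a node feature matrix and an edge weight matrix? The code computes word-word edge weights and document-word edge weights, but I couldn't find any edge weights between topic nodes and document nodes. There is only an LDA step and the most relevant topics.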
By edge features, do you mean the "similarity score" for entity-entity edges, and "rho" and "link probability" for entity-document edges? If so, they are only used to decide whether an edge exists between two nodes.
if sim < 0.5 or lp <= 0.75 or rho <= 0.3: edge = False
else: edge = True
(from build_data.py, around line 150)
For document-topic edges, it's something like:
edge = True if topic in document else False
(from build_data.py, around line 135)
So I think there are no edge features involved, only 0 and 1.
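To make the 0/1 reading concrete, here is a minimal Python sketch of the three rules as described in this thread. It is not copied from build_data.py: the function names are hypothetical, the thresholds are the ones quoted above, and splitting the combined condition by edge type (sim for entity-entity, rho and lp for entity-document) is my reading of the comment.

```python
def has_entity_entity_edge(sim):
    """Entity-entity: keep the edge unless the similarity score is below 0.5."""
    return not (sim < 0.5)


def has_entity_document_edge(rho, lp):
    """Entity-document: keep the edge unless lp <= 0.75 or rho <= 0.3."""
    return not (lp <= 0.75 or rho <= 0.3)


def has_document_topic_edge(topic, document_topics):
    """Document-topic: the edge exists iff the topic was assigned to the document."""
    return topic in document_topics


# The scores only gate the edge; the stored value is a plain boolean.
print(has_entity_entity_edge(0.62))           # score above threshold
print(has_entity_document_edge(0.25, 0.90))   # rho too low
print(has_document_topic_edge(3, {1, 3}))     # topic 3 is assigned
```

In each case the score is discarded once the decision is made, which is why nothing resembling a weight reaches the output.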
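Yes, that is the part I was looking at too, and I find it confusing. Don't the document-topic edges need a weight value? Are they really represented with just 0 and 1?
I mean something like the "similarity score" for entity-entity edges and "rho" and "link probability" for entity-document edges. Surely those are not just 0 and 1?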
I mean 0 and 1 for entity-entity and entity-document too.
We can confirm this by looking at the output files, since only the output files are used by the model later. The main files are:
1: dataset.cites
2: dataset.content.entity
3: dataset.content.text
4: dataset.content.topic
Files 2, 3, and 4 contain node features, which have nothing to do with edge features. File 1 contains the edges as node pairs. So if a pair exists, there is an edge; otherwise there isn't. There is no information about edge features.
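As a rough illustration of why a pair list carries no weights, here is a small Python sketch that turns node pairs into a 0/1 adjacency matrix. The file layout (two whitespace-separated node ids per line), the sample ids, and the symmetric treatment of edges are assumptions for the example, not something I checked against the actual dataset.cites file.

```python
import io


def read_edge_pairs(lines):
    """Collect undirected node pairs; each line is assumed to hold two node ids."""
    edges = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        a, b = parts[0], parts[1]
        edges.add((a, b))
        edges.add((b, a))  # assumption: edges are treated as symmetric
    return edges


def build_adjacency(edges):
    """Return the sorted node list and a dense 0/1 adjacency matrix."""
    nodes = sorted({n for pair in edges for n in pair})
    index = {n: i for i, n in enumerate(nodes)}
    adj = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        adj[index[a]][index[b]] = 1
    return nodes, adj


# Hypothetical sample standing in for a .cites-style file.
sample = io.StringIO("d1\tt3\nd1\te7\nd2\tt3\n")
nodes, adj = build_adjacency(read_edge_pairs(sample))
for name, row in zip(nodes, adj):
    print(name, row)
```

Every entry ends up as 0 or 1 because a pair is either listed or not; there is no third column from which a weight could be read.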
I made a flow chart that might also be useful. Please let me know if there are any errors.
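In that case, are topics not brought in before the graph is built? I have an idea: assign the top-K topics to each document. Could each topic's probability then be used as the edge weight?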
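The top-K idea could be sketched roughly as below. This is only an illustration of the proposal, not something the repository does: the function name, the toy document-topic matrix, and the choice of K are made up, and the model would still need to be changed to accept a weighted adjacency matrix instead of 0/1 entries.

```python
import numpy as np


def topk_topic_edges(doc_topic, k=2):
    """Return (doc_id, topic_id, weight) triples for each document's top-k topics.

    doc_topic is a (num_docs, num_topics) array of topic probabilities,
    for example the per-document topic distribution produced by LDA.
    """
    doc_topic = np.asarray(doc_topic, dtype=float)
    # Sort topics by descending probability and keep the first k per document.
    top = np.argsort(-doc_topic, axis=1)[:, :k]
    edges = []
    for d, topics in enumerate(top):
        for t in topics:
            edges.append((d, int(t), float(doc_topic[d, t])))
    return edges


# Toy distribution: 2 documents, 3 topics (hypothetical values).
doc_topic = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.3, 0.6]])
edges = topk_topic_edges(doc_topic, k=2)
for edge in edges:
    print(edge)
```

Whether these weights help would have to be tested; the weights of a document's top-K topics no longer sum to 1, so some renormalization per document may be worth trying.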