WordGCN

About using own text data for SynGCN and SemGCN

Open · 40347015S opened this issue 4 years ago • 1 comment

Your WordGCN paper is very exciting and well written, so I would like to use your code in my current work, and I have a question. Suppose I want to train SynGCN and SemGCN on other text data, such as transcripts from a speech recognition benchmark corpus (AMI), instead of the Wikipedia corpus, in order to obtain AMI-based SynGCN and SemGCN word embeddings. What is the first step I need to take, and how should I process my own text data? Thanks!

Shih-Hsuan

40347015S, Apr 03 '21 08:04

Hi Shih-Hsuan, the corpus needs to be arranged in the data.txt format described in the readme. You'll have to run a dependency parser on your corpus so that you get a dependency parse tree for each sentence.
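
To make the preprocessing step more concrete, here is a rough sketch of turning one parsed sentence into an ID-encoded line. It assumes the parser writes standard CoNLL-U output (most dependency parsers can). The output layout used here (token count, edge count, token IDs, then `head|dependent|relation_id` triples) and the helper names `read_conllu_sentence`, `get_id`, and `encode_sentence` are illustrative assumptions, not the repository's actual code; please check the readme for the exact data.txt layout before relying on it.

```python
def read_conllu_sentence(block):
    """Return (tokens, edges) for one CoNLL-U sentence block.

    edges are (head_index, dependent_index, relation) with 0-based indices.
    """
    tokens, edges = [], []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        idx, form, head, rel = int(cols[0]), cols[1], int(cols[6]), cols[7]
        tokens.append(form.lower())
        if head > 0:  # head 0 marks the root, which has no incoming edge
            edges.append((head - 1, idx - 1, rel))
    return tokens, edges


def get_id(table, key):
    """Look up key in table, assigning the next free integer ID if unseen."""
    return table.setdefault(key, len(table))


def encode_sentence(tokens, edges, voc2id, rel2id):
    """Serialize one sentence into a single line (assumed layout, see above)."""
    tok_ids = [str(get_id(voc2id, t)) for t in tokens]
    deps = ["{}|{}|{}".format(h, d, get_id(rel2id, r)) for h, d, r in edges]
    return " ".join([str(len(tok_ids)), str(len(deps))] + tok_ids + deps)


# A tiny hand-written parse of "The meeting starts" in CoNLL-U column order:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
rows = [
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "meeting", "meeting", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "starts", "start", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "\n".join("\t".join(r) for r in rows)

voc2id, rel2id = {}, {}
tokens, edges = read_conllu_sentence(sample)
line = encode_sentence(tokens, edges, voc2id, rel2id)
print(line)
```

For a full corpus such as AMI, you would run this over every parsed sentence while sharing the same `voc2id` and `rel2id` tables, then write the vocabulary and relation mappings to disk alongside the encoded sentences.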

svjan5, Apr 06 '21 04:04