socialsent
On the use of cooccurgen.py
Dear William, I’d like to use your code to induce a sentiment lexicon from a new corpus. In your answer to issue #8, you wrote that the first step is to “Use representations/cooccurgen.py to process a corpus and construct co-occurrence matrices.” Looking at cooccurgen.py, it seems to take as input a corpus in the COHA word_lemma_pos format, and it also needs a file called index.pkl.
- Do I have to transform my corpus into a tabular format like the COHA format?
- How is the index.pkl file created?
- Is there any way to use the script starting from a raw corpus?
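To make the questions concrete, here is a minimal sketch of what I imagine a raw-corpus pipeline might look like. It is only my guess, not taken from the repository: I am assuming that index.pkl is a pickled dict mapping each word to an integer id, and the helper names (`build_index`, `count_cooccurrences`, `save_index`), the whitespace tokenization, and the window size are all hypothetical.

```python
import os
import pickle
import tempfile
from collections import Counter


def build_index(sentences):
    """Map each distinct token to an integer id, in order of first appearance."""
    index = {}
    for tokens in sentences:
        for tok in tokens:
            if tok not in index:
                index[tok] = len(index)
    return index


def count_cooccurrences(sentences, index, window=2):
    """Count symmetric co-occurrences of word ids within `window` tokens."""
    counts = Counter()
    for tokens in sentences:
        ids = [index[t] for t in tokens]
        for i, word in enumerate(ids):
            # Pair the current word with up to `window` preceding words.
            for context in ids[max(0, i - window):i]:
                counts[(word, context)] += 1
                counts[(context, word)] += 1
    return counts


def save_index(index, path):
    """Pickle the word-to-id dict (assumed, not confirmed, layout of index.pkl)."""
    with open(path, "wb") as f:
        pickle.dump(index, f)


# Toy raw corpus: one sentence per line, lowercased and split on whitespace.
raw = ["The movie was great", "The movie was awful"]
sentences = [line.lower().split() for line in raw]

index = build_index(sentences)
counts = count_cooccurrences(sentences, index, window=2)

# Written to a temporary directory so the example has no side effects.
path = os.path.join(tempfile.mkdtemp(), "index.pkl")
save_index(index, path)
```

If the script expects something different (for example lemmas or POS tags from the word_lemma_pos columns, or another structure inside index.pkl), I would be glad to know which parts of this need to change.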
Thanks a lot in advance! Best, Rachele