Input Data

Open hiyamgh opened this issue 4 years ago • 1 comments

Hello, thanks for the repository!

I followed the instructions in the README file but cannot obtain the vectors as described here.

Basically these files:

filenames_sgns = [folder + 'vectors_sgns{}.txt'.format(x) for x in range(1910, 2000, 10)]
filenames_svd = [folder + 'vectors_svd{}.txt'.format(x) for x in range(1910, 2000, 10)]
filenames_nyt = [folder + 'vectors{}-{}.txt'.format(x, x+5) for x in range(1987, 2000, 1)]
filenames_coha = [folder + 'vectorscoha{}-{}.txt'.format(x, x+20) for x in range(1910, 2000, 10)]

Can you please let us know how to generate them? For example, I downloaded the SGNS vectors from here, but the download contains only -vocab.pkl and -w.npy files and no .txt files.

hiyamgh, Jul 18 '21 10:07

Check out the dataset_utilities/handle_cohagbooks_vectors.py file, which handles the conversion of -vocab.pkl and -w.npy files to .txt files.

agonzalezreyes, Aug 11 '22 20:08
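
For readers who want to see roughly what such a conversion involves, here is a minimal sketch. It is not the repository's script. It assumes that each `-vocab.pkl` file is a pickled list of words, that the matching `-w.npy` file is a 2-D NumPy array with one row per word in the same order, and that the target `.txt` format is one `word v1 v2 ...` line per word. The function names, the per-decade file prefix (for example `1910-vocab.pkl`), and the `latin1` pickle encoding are all assumptions for illustration.

```python
import pickle


def load_vocab_and_matrix(prefix):
    """Load `<prefix>-vocab.pkl` and `<prefix>-w.npy` (assumed file layout)."""
    import numpy as np  # imported here so the writer below works without NumPy

    with open(f"{prefix}-vocab.pkl", "rb") as f:
        # Assumption: the pickle may have been written by Python 2,
        # in which case Python 3 needs an explicit encoding to read it.
        vocab = pickle.load(f, encoding="latin1")
    matrix = np.load(f"{prefix}-w.npy")
    return vocab, matrix


def write_vectors_txt(vocab, matrix, out_path):
    """Write one 'word v1 v2 ...' line per vocabulary entry."""
    if len(vocab) != len(matrix):
        raise ValueError("vocab and matrix have different lengths")
    with open(out_path, "w", encoding="utf-8") as out:
        for word, row in zip(vocab, matrix):
            values = " ".join(repr(float(v)) for v in row)
            out.write(f"{word} {values}\n")


def convert_decades(folder, start=1910, stop=2000, step=10):
    """Hypothetical driver: convert each decade to vectors_sgns<year>.txt."""
    for year in range(start, stop, step):
        vocab, matrix = load_vocab_and_matrix(f"{folder}{year}")
        write_vectors_txt(vocab, matrix, f"{folder}vectors_sgns{year}.txt")
```

Calling `convert_decades(folder)` with the same `folder` string used in the filename lists above would produce files matching the `vectors_sgns{}.txt` pattern, provided the downloaded files follow the assumed naming. Check the output against what the repository's loader expects (separator, header line, rounding) before relying on it.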