Structured-Self-Attentive-Sentence-Embedding
About the GloVe model
Recently I used torchtext to build a GloVe model. From it I got a dictionary mapping each word to an index, plus the embedding matrix (a torch.FloatTensor of shape word_count x dim). To create the file that train.py expects, I wrote my code like this:
t = (dictionary, embedding_matrix, dim)
torch.save(t, 'mypath/glove.pt')
Is the file glove.pt in the format that your program expects?
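As a sanity check on that format, the saved tuple can be loaded back and its structure verified. The snippet below is a self-contained sketch: it uses a toy vocabulary and random vectors in place of the real GloVe matrix, and the file name glove_test.pt is just an example.

```python
import torch

# Toy stand-ins for the real objects: a word -> index dictionary
# and a (word_count x dim) FloatTensor of embedding vectors.
dictionary = {'<unk>': 0, '<pad>': 1, 'hello': 2, 'world': 3}
dim = 300
vectors = torch.randn(len(dictionary), dim)

# Save the same (dict, FloatTensor, int) tuple described above.
torch.save((dictionary, vectors, dim), 'glove_test.pt')

# Load it back and verify the shapes line up.
d, v, k = torch.load('glove_test.pt')
assert isinstance(d, dict)
assert v.size() == (len(d), k)
print(v.size())  # torch.Size([4, 300])
```

If the loaded tuple unpacks into a dictionary, a FloatTensor whose first dimension matches the vocabulary size, and the embedding dimension, the file structure should be consistent.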
This is how I created the GloVe model:
import torch
from torchtext import data

TEXT = data.Field(sequential=True)
LABEL = data.Field(sequential=False)

train, val, test = data.TabularDataset.splits(
    path='./', train='train.json',
    validation='val.json', test='test.json', format='json',
    fields={'text': ('text', TEXT),
            'label': ('label', LABEL)})

# Build the vocabulary and attach the pre-trained GloVe vectors.
TEXT.build_vocab(train, vectors="glove.42B.300d")

dictionary = TEXT.vocab.stoi        # word -> index mapping
vectors = TEXT.vocab.vectors        # (word_count, dim) FloatTensor
dim = TEXT.vocab.vectors.size()[1]  # 300 in this case

torch.save((dictionary, vectors, dim), './GloVe/glove.42B.300d.pt')
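For completeness, here is a sketch of how the saved tuple could be consumed on the training side: rebuild an embedding layer from the stored vectors and look up tokenized text. The file name toy_glove.pt and the toy vocabulary are assumptions so the snippet runs on its own; the real ./GloVe/glove.42B.300d.pt would be used the same way.

```python
import torch
import torch.nn as nn

# Create a toy saved file standing in for ./GloVe/glove.42B.300d.pt.
dictionary = {'<unk>': 0, 'the': 1, 'cat': 2, 'sat': 3}
dim = 5
torch.save((dictionary, torch.randn(len(dictionary), dim), dim), 'toy_glove.pt')

# Loading side: unpack the tuple and build an embedding layer.
stoi, vectors, dim = torch.load('toy_glove.pt')
embed = nn.Embedding.from_pretrained(vectors)

# TEXT.vocab.stoi is a defaultdict that maps unknown words to 0;
# a plain dict needs an explicit fallback, as here.
tokens = ['the', 'cat', 'sat', 'meow']
idx = torch.tensor([stoi.get(t, 0) for t in tokens])
out = embed(idx)  # shape: (len(tokens), dim)
print(out.shape)
```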
Took inspiration from:
- http://anie.me/On-Torchtext/
- Lines 24-31 of https://github.com/pytorch/examples/blob/master/snli/train.py