multiNLI
GloVe - uneven vector size.
I downloaded GloVe from http://nlp.stanford.edu/data/glove.840B.300d.zip and ran:
path = '../path_to_glove/file.txt'
with open(path, 'r', encoding="utf-8") as read_file:
    for i, el in enumerate(read_file):
        if len(el.split()) > 301:
            print(i)
            print(len(el.split()))
It printed:
52343
303
I was expecting every line to hold one token plus 300 values (301 fields), but line 52343 has 303 fields. I am on Windows 7 Enterprise with Python 3.5. This also causes an error in the loadEmbedding_rand function.
See this commit for the fix: https://github.com/nyu-mll/multiNLI/pull/12/commits/1763f56208f45980f204972616cbcced676598b5
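For anyone parsing the file themselves, one likely cause is that a few tokens in this GloVe file contain spaces, so a plain split() yields more than 301 fields. The sketch below is one possible workaround and is not necessarily what the linked commit does. It assumes 300-dimensional vectors and single-space separators; parse_glove_line is a hypothetical helper name, not part of multiNLI.

```python
EMBEDDING_DIM = 300  # assumption: glove.840B.300d vectors


def parse_glove_line(line, dim=EMBEDDING_DIM):
    """Split one GloVe text line into (token, vector).

    The last `dim` space-separated fields are treated as the vector;
    everything before them is the token, which may contain spaces.
    """
    # rsplit from the right so spaces inside the token are left alone.
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError("expected a token plus %d values" % dim)
    token = parts[0]
    vector = [float(x) for x in parts[1:]]
    return token, vector


# Usage: a synthetic line whose token is ". . ." (three fields under split()).
sample = ". . . " + " ".join(["0.5"] * EMBEDDING_DIM) + "\n"
token, vector = parse_glove_line(sample)
print(repr(token), len(vector))
```

Splitting from the right keeps the vector length fixed regardless of what the token looks like, so downstream code that stacks vectors into an array sees a consistent shape.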