multiNLI

GloVe - uneven vector size.

Open • BrazilForever11 opened this issue 6 years ago • 1 comment

I downloaded GloVe from http://nlp.stanford.edu/data/glove.840B.300d.zip and ran:

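path = '../path_to_glove/file.txt'
# Report every line that splits into more than 301 fields (1 token + 300 values).
with open(path, 'r', encoding="utf-8") as read_file:
    for i, el in enumerate(read_file):
        fields = el.split()
        if len(fields) > 301:
            print(i)
            print(len(fields))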

It printed:

52343
303

I expected every line to hold a vector of the same size. I am on Windows 7 Enterprise with Python 3.5. The same problem also causes an error in the loadEmbedding_rand function.

BrazilForever11, Apr 06 '18 20:04

This commit fixes the bug: https://github.com/nyu-mll/multiNLI/pull/12/commits/1763f56208f45980f204972616cbcced676598b5
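For readers who cannot apply the commit, here is a minimal sketch of one common workaround. It assumes the extra fields come from tokens that themselves contain spaces, which this thread does not confirm, and it is not necessarily what the linked commit does. The helper name parse_glove_line and its dim parameter are hypothetical, introduced only for illustration.

```python
def parse_glove_line(line, dim=300):
    """Split one GloVe text line into (token, vector).

    Hypothetical helper: assumes fields are separated by single spaces and
    that the last `dim` fields are always the vector components.
    """
    # Split from the right at most `dim` times, so whatever is left over on
    # the left is kept as the token, even if it contains spaces.
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError("expected a token followed by %d values" % dim)
    token = parts[0]
    vector = [float(x) for x in parts[1:]]
    return token, vector
```

Called on each line of the file instead of el.split(), this returns exactly dim values per line and keeps the token intact, or raises ValueError when a line is too short.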

santhoshrajamanickam, Jan 24 '19 12:01