
GloveEmbedder fails when no header row present in embeddings file

Open oserikov opened this issue 4 years ago • 0 comments


DeepPavlov version (you can look it up by running pip show deeppavlov): 0.13

Python version: 3.8

Operating system (ubuntu linux, windows, ...): ubuntu 18.04

Issue: Embeddings stored in .txt files may have no header row giving the vocabulary size and embedding width. In such cases, GloveEmbedder fails to read the vectors from the file.

Steps to reproduce: Try to read an embeddings file that has no header row, e.g. the glove_ru_300_wordpunct file. To preview its first three lines, run: curl -s -XGET files.deeppavlov.ai/embeddings/glove_300_ru_wiki_lenta_nltk_wordpunct_tokenize/glove_300_ru_wiki_lenta_nltk_wordpunct_tokenize.txt | head -n 3 | cut -d' ' -f-4
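To illustrate the expected behaviour, here is a minimal sketch of a reader that accepts text embeddings with or without a header row. It is not DeepPavlov's actual GloveEmbedder code: the names `looks_like_header` and `read_text_embeddings` are hypothetical, and it assumes that a header is a first line consisting of exactly two integers (vocabulary size and embedding width) and that fields are separated by single spaces.

```python
from typing import Dict, Iterable, List, Tuple


def looks_like_header(line: str) -> bool:
    """Heuristic (assumption): a header is exactly two integer fields.

    Note: a one-dimensional vector for a numeric token, e.g. "5 3",
    would be misread as a header; real embeddings are rarely 1-D.
    """
    parts = line.split()
    return len(parts) == 2 and all(p.isdigit() for p in parts)


def read_text_embeddings(lines: Iterable[str]) -> Tuple[Dict[str, List[float]], int]:
    """Read "word v1 v2 ... vN" lines, skipping an optional header row.

    Returns the word-to-vector mapping and the embedding width.
    """
    vectors: Dict[str, List[float]] = {}
    dim = 0
    for i, raw in enumerate(lines):
        line = raw.rstrip()
        if not line:
            continue  # ignore blank lines
        if i == 0 and looks_like_header(line):
            continue  # header present: skip it instead of parsing it as a vector
        parts = line.split(" ")
        word, values = parts[0], [float(x) for x in parts[1:]]
        if dim == 0:
            dim = len(values)  # infer the width from the first vector
        elif len(values) != dim:
            raise ValueError(f"line {i + 1}: expected {dim} values, got {len(values)}")
        vectors[word] = values
    return vectors, dim


# The same vectors are returned whether or not the header row is present.
with_header = ["2 3\n", "cat 0.1 0.2 0.3\n", "dog 0.4 0.5 0.6\n"]
without_header = with_header[1:]
vecs_a, dim_a = read_text_embeddings(with_header)
vecs_b, dim_b = read_text_embeddings(without_header)
```

Inferring the width from the first vector, rather than requiring it in a header, is what lets one code path handle both file layouts.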

oserikov, Dec 09 '20 06:12