DeepPavlov
GloveEmbedder fails when no header row present in embeddings file
DeepPavlov version (you can look it up by running pip show deeppavlov): 0.13
Python version: 3.8
Operating system (ubuntu linux, windows, ...): ubuntu 18.04
Issue: embeddings stored in a .txt file may have no header row with the vocabulary size and embedding width. In such cases, GloveEmbedder fails to read the vectors from the file.
Steps to reproduce: try to read, e.g., the glove_ru_300_wordpunct file (run curl -s -XGET files.deeppavlov.ai/embeddings/glove_300_ru_wiki_lenta_nltk_wordpunct_tokenize/glove_300_ru_wiki_lenta_nltk_wordpunct_tokenize.txt | head -n 3 | cut -d' ' -f-4 to preview its first lines).
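
For illustration, here is a minimal sketch of a reader that accepts text embeddings with or without a header row. It is not DeepPavlov's implementation: the function name read_text_embeddings is hypothetical, and the header check (first line consists of exactly two integer fields) is an assumed heuristic that could misfire on a one-dimensional embedding whose first token is a number.

```python
from typing import Dict, Iterable, List


def read_text_embeddings(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse "<token> <v1> <v2> ..." lines into a dict of vectors.

    Hypothetical helper, not part of DeepPavlov. An optional first line
    of the form "<vocab_size> <dim>" is treated as a header and skipped.
    """
    vectors: Dict[str, List[float]] = {}
    for index, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        # Assumed heuristic: a header has exactly two integer fields.
        if index == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        # Skip blank or malformed lines that carry no vector values.
        if len(parts) < 2:
            continue
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Usage: the same vectors come back whether or not a header is present.
with_header = ["2 3", "cat 0.1 0.2 0.3", "dog 0.4 0.5 0.6"]
without_header = with_header[1:]
assert read_text_embeddings(with_header) == read_text_embeddings(without_header)
```

To use this on a real file, pass an open file handle (for example, open(path, encoding="utf8")) as lines.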