gsoc2018-spacy
Sentence splitter not working properly affecting part of speech tagger
Problem
I tried to run the sentence splitter submodule (sentence_splitter.py), but it did not work for Greek text. I tried loading both el_core_news_sm and el_core_news_md, and I also tried passing the input text encoded as UTF-8. In every case it does not recognize separate sentences and treats the whole input as a single sentence. This also affects the part-of-speech tagger. Do you have any idea what the problem might be?
Thanks in advance.
Environment
spaCy version: 2.1.4
Location: /home/dimitris/.local/lib/python3.6/site-packages/spacy
Platform: Linux-4.18.0-17-generic-x86_64-with-Ubuntu-18.04-bionic
Python version: 3.6.7
Models: el, en
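A minimal sketch for checking the symptom described above, independent of sentence_splitter.py. It assumes el_core_news_sm is installed; the helper names (sentence_texts, report) and the sample Greek text are hypothetical and only for illustration. It relies only on spacy.load, nlp.pipe_names, doc.sents and token.pos_.

```python
def sentence_texts(doc):
    """Return the stripped text of every sentence found in a parsed doc."""
    return [sent.text.strip() for sent in doc.sents]


def report(model_name="el_core_news_sm"):
    """Load a model and print its pipeline, sentences and POS tags."""
    # Imported here so sentence_texts stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    # Sentence boundaries come from the parser, so it should be listed here.
    print("pipeline:", nlp.pipe_names)

    # Hypothetical sample input: two short Greek sentences.
    text = "Αυτή είναι η πρώτη πρόταση. Αυτή είναι η δεύτερη πρόταση."
    doc = nlp(text)

    # If everything ends up in one sentence, the symptom is reproduced.
    for number, sentence in enumerate(sentence_texts(doc), start=1):
        print(number, sentence)

    # Coarse part-of-speech tag for each token.
    print([(token.text, token.pos_) for token in doc])
```

Calling report("el_core_news_sm") and then report("el_core_news_md") shows whether both models behave the same way; if the printed pipeline does not contain a parser, doc.sents cannot produce boundaries from the model.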
Hello, thanks for reporting this!
Could you please tell me where you downloaded the models from?
Are you using spacy-nightly?
No, I downloaded the models from https://spacy.io/models/el. Should I use spacy-nightly?
Could you try uninstalling spaCy, installing spacy-nightly, downloading the models through nightly, and then checking again? Sorry for the trouble, I need to check whether it is a version problem.
I tried, but I faced the same problem.
To install the models through spacy-nightly I used: python3 -m spacy install el_core_news_md
Is that correct? Any other suggestions about what I may have done wrong?
I am facing the same problem after trying both.