ner
Error on sentences without "normal" tokens
When you pass an empty string, or a string without normal tokens (for example, some ASCII emoticons), an exception is raised:
>>> import ner
>>> extractor = ner.Extractor()
>>> list(extractor(':)'))  # works fine, an empty list is returned
>>> list(extractor(''))  # fails with ValueError
>>> list(extractor('|*!*|'))  # fails with ValueError
Maybe these cases should return an empty generator too? Trace:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/universome/pyvenvs/zoo/lib/python3.6/site-packages/ner/extractor/extractor.py", line 54, in __call__
    tags = self.network.predict_for_token_batch([tokens_lemmas])[0]
  File "/home/universome/pyvenvs/zoo/lib/python3.6/site-packages/ner/network.py", line 379, in predict_for_token_batch
    batch_x, _ = self.corpus.tokens_batch_to_numpy_batch(tokens_batch)
  File "/home/universome/pyvenvs/zoo/lib/python3.6/site-packages/ner/corpus.py", line 205, in tokens_batch_to_numpy_batch
    max_token_len = max([len(token) for utt in batch_x for token in utt])
ValueError: max() arg is an empty sequence
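The traceback suggests the failure comes from calling max() on an empty list when the batch contains no tokens. Below is a rough sketch of two possible ways around it, not the library's actual code: `max_token_len` and `safe_extract` are hypothetical helper names made up for illustration. The first uses the `default` argument of the built-in max() (available since Python 3.4); the second is a caller-side workaround that treats the ValueError as "no entities found".

```python
def max_token_len(tokens_batch):
    """Length of the longest token in a batch of tokenized utterances.

    Hypothetical helper: returns 0 instead of raising when the batch
    contains no tokens at all (e.g. [[]] for an empty input string).
    """
    return max((len(token) for utt in tokens_batch for token in utt), default=0)


def safe_extract(extractor, text):
    """Caller-side workaround (assumption: a ValueError here only ever
    means 'no tokens'). Returns an empty list instead of raising."""
    try:
        return list(extractor(text))
    except ValueError:
        return []


if __name__ == "__main__":
    print(max_token_len([[]]))                      # batch with no tokens
    print(max_token_len([["ab", "cde"], ["f"]]))    # longest token is "cde"
```

Note that `safe_extract` would also hide unrelated ValueErrors, so a guard inside the library (returning early when there are no tokens) is probably the cleaner fix.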