Mišo Belica
Well, I would avoid changing sumy unless it is really needed. You can rather implement your own tokenizer like this:

```py
class Tokenizer:
    language = 'en???'

    def to_sentences(self, paragraph):
        return...
```
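For completeness, here is a minimal self-contained sketch of such a drop-in tokenizer, assuming the only interface sumy relies on is `to_sentences()` and `to_words()`. The splitting rules, the class name and the sample text below are placeholders, not a recommendation:

```py
import re

from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.text_rank import TextRankSummarizer


class SimpleTokenizer:
    """Drop-in tokenizer; sumy only calls to_sentences() and to_words()."""

    language = "english"  # set this to the language you actually work with

    def to_sentences(self, paragraph):
        # naive split on sentence-final punctuation followed by whitespace
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

    def to_words(self, sentence):
        # naive split into alphanumeric tokens
        return re.findall(r"\w+", sentence, re.UNICODE)


text = "Sumy is a summarizer. The summarizer extracts sentences. Sentences are ranked by the summarizer."
parser = PlaintextParser.from_string(text, SimpleTokenizer())
for sentence in TextRankSummarizer()(parser.document, sentences_count=1):
    print(sentence)
```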
Hi Vladimir, I think you know the code better than I do because [TextRank was not contributed](https://github.com/miso-belica/sumy/pull/100) by me. At least not the current implementation. But I will try to check...
1 - It's not completely true. Sumy uses `nltk.word_tokenize` and the regex is only used to filter some words out. You are right that it should not filter some words with...
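To make the split of responsibilities concrete, here is an illustration (the pattern below is made up; the one sumy actually uses may differ): NLTK does the tokenizing, the regex only decides which of the resulting tokens to keep, and a too-strict pattern silently drops legitimate words.

```py
import re

from nltk import word_tokenize  # needs the NLTK "punkt" tokenizer data

# Made-up filter for illustration only.
LOOKS_LIKE_WORD = re.compile(r"^\w+$", re.UNICODE)

def to_words(sentence):
    tokens = word_tokenize(sentence)  # NLTK does the actual tokenizing
    return [t for t in tokens if LOOKS_LIKE_WORD.match(t)]  # the regex only filters

# A pattern like the one above would drop hyphenated words such as "state-of-the-art".
print(to_words("A state-of-the-art summarizer keeps hyphenated words."))
```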
Hi @dorianve, can you attach a simple test to reproduce this? Or maybe create a PR with the test and a fix? You can't update the repository, but you are...
@seven-linglx Can you share your solution with us? Can you add the code snippet here?
Thank you all. I think this is trickier. I tried to [find a solution](https://chinese.stackexchange.com/questions/10753/capitalization-in-chinese) but it seems I should introduce a new parser. Maybe a `MarkdownParser`, and let `PlaintextParser` really...
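Until a dedicated parser exists, one rough workaround is to strip the Markdown markup yourself and feed the result to `PlaintextParser`. The helper below is purely hypothetical (nothing like it ships with sumy) and its regexes are deliberately crude:

```py
import re

from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser


def parse_markdown(markdown_text, language="english"):
    """Hypothetical helper, not part of sumy: strip Markdown, reuse PlaintextParser."""
    text = re.sub(r"`{3}.*?`{3}", " ", markdown_text, flags=re.DOTALL)  # drop fenced code blocks
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)  # keep link text, drop the URL
    text = re.sub(r"^[#>\-*+]+\s*", "", text, flags=re.MULTILINE)  # heading/quote/list markers
    text = re.sub(r"[*_`]+", "", text)  # emphasis and inline-code markers
    return PlaintextParser.from_string(text, Tokenizer(language))


parser = parse_markdown("# Title\n\nSome **bold** text with a [link](https://example.com).")
```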
Hi, can you share the corpus and let me know the exact command that is slow?
Hi, I don't completely understand you, but I guess you just want to know the format of the reference summaries for summary-level ROUGE-L? Because it's the same for all summaries. Its...
@IsakZhang Hi, it's a tough question for me. I really don't remember why or whether I diverged from the original paper. But I usually did such things because I was inspired somewhere...
Hi, I suppose some format of "plain text". But I'm not sure I understand you. Can you give an example of the text? And what does "it doesn't do...