Adrien Barbaresi

Results: 412 comments by Adrien Barbaresi

Thanks, I agree that moving most of the docs into a separate folder would be better. Regarding hosting, I'm going to have a look at the links you provided. So...

We could use links to the relevant sections.

The way I see it, there would be a small README file in the future: a reduced version of the current one containing links to additional documentation hosted somewhere else,...

I'd be in favor of Sphinx because the [autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) extension will prove useful to automatically reflect changes made to the functions or classes. I'm open to trying mkdocs but I...
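For illustration, a minimal sketch of a Sphinx `conf.py` that enables autodoc could look like the following. The project name and the chosen default options are assumptions made for this example, not the configuration actually used in the repository.

```python
# conf.py -- minimal Sphinx configuration sketch (hypothetical values).

# Placeholder project name, assumed for this example only.
project = "example_project"

# Enable the autodoc extension, which pulls documentation from docstrings.
extensions = ["sphinx.ext.autodoc"]

# Assumed defaults: document all members, including those without docstrings.
autodoc_default_options = {
    "members": True,
    "undoc-members": True,
}
```

With such a setup, a directive like `.. automodule:: some_module` in a `.rst` file would render the docstrings of that module, so the docs follow changes to functions and classes.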

OK, then I think I'll just merge the PR as it is and try to get familiar with the process.

Thanks for the feedback! The tokenizer does something slightly different from what is usually expected: it clusters characters together while segmenting the input. Since the output only consists of lemmata, the idea...

Yes, it's faster and simpler. Otherwise you would have to tokenize punctuation accurately (which is a different task) and run the lemmatizer on it (which is useless in the current...
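As a rough illustration of the trade-off, the sketch below keeps only runs of word characters, so punctuation never reaches a lemmatizer. The function name `simple_tokenize` and the regular expression are assumptions for this example; this is not the library's actual tokenizer.

```python
import re

# Hypothetical pattern: one or more word characters (letters, digits, underscore).
WORD_PATTERN = re.compile(r"\w+")


def simple_tokenize(text):
    """Return word-like tokens only; punctuation and whitespace are skipped."""
    return WORD_PATTERN.findall(text)


tokens = simple_tokenize("Hello, world!")
print(tokens)
```

Because punctuation is dropped during segmentation, there is no need to identify it precisely or to pass it through a lemmatization step.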

Yes, here are some ideas:
- we could switch to `mypy --strict`
- `flake8` is used to detect obvious mistakes without starting the whole pipeline, do you have another configuration...

Hi @dysby, good catch! My guess would be that the results are cached internally, which affects the results of `text_lemmatizer()`. In any case it is worth looking further into the...
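To show in general terms how an internal cache can make results depend on earlier calls, here is a generic sketch using `functools.lru_cache`. The `settings` dictionary and the `lookup` function are hypothetical and unrelated to the library's real code; they only demonstrate the kind of effect guessed at above.

```python
from functools import lru_cache

# Hypothetical global option that changes how tokens are processed.
settings = {"lowercase": False}


@lru_cache(maxsize=None)
def lookup(token):
    """Return the token, lowercased only if the option is set."""
    return token.lower() if settings["lowercase"] else token


first = lookup("Cats")         # computed with lowercase=False
settings["lowercase"] = True
second = lookup("Cats")        # served from the cache, option change is ignored
lookup.cache_clear()
third = lookup("Cats")         # recomputed with lowercase=True
```

Here `second` still reflects the old setting because the cache key is only the token, which is one way stale results can appear until the cache is cleared.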