alvations

Results 153 comments of alvations

This PR is so much needed, esp. with #150

@ankit--agrawal "attention", nice pun =)

Yes, type-hints are nice. If you would like to, pick a module or a couple of files to which you'd like to add type-hints and create a PR. Someone will review...

Maybe you can try `*_score.py` files from https://github.com/nltk/nltk/tree/develop/nltk/translate. They are quite isolated and self-contained. Otherwise, a really useful place to have type-hints would be any files/functions/classes in https://github.com/nltk/nltk/tree/develop/nltk/tag
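To give an idea of what such a PR might look like, here is a minimal sketch of a type-hinted scoring function. The function `unigram_precision` is hypothetical and is not taken from `nltk.translate`; it only illustrates the annotation style for a small, self-contained metric.

```python
from collections import Counter
from typing import Sequence


def unigram_precision(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Fraction of hypothesis tokens that also occur in the reference.

    Hypothetical example, not an NLTK function. Counts are clipped so a
    token repeated in the hypothesis is credited at most as often as it
    appears in the reference.
    """
    if not hypothesis:
        return 0.0
    ref_counts = Counter(reference)
    hyp_counts = Counter(hypothesis)
    # Clip each hypothesis count by the reference count of the same token.
    matched = sum(min(count, ref_counts[token]) for token, count in hyp_counts.items())
    return matched / len(hypothesis)


score = unigram_precision("the cat sat".split(), "the cat ran".split())
```

With annotations like these, a checker such as mypy can flag a call that passes, say, a single string where a sequence of tokens is expected.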

Thanks @f0lie, the type-hints look good as a start! Our sincere thanks to the people who are helping at PyCon too. I've not personally tried mypy, but I guess...

@mmmm1998 Thank you for raising the issue. The patch for `ru-rnc-new` was there to hot-plug the new mappings without messing with the existing data in `nltk_data`. I also...

The `chomsky_normal_form()` in NLTK is a tree-binarization function. I think it can't be directly applied to grammars (see https://github.com/nltk/nltk/blob/develop/nltk/treetransforms.py). Grammar transformation to CNF is rather complex and hasn't yet been...
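To make "tree binarization" concrete, here is a rough sketch on plain nested tuples. It is not NLTK's implementation: the tuple representation, the helper names, and the artificial-node label format are all assumptions made for illustration. It only right-factors nodes with more than two children and does not collapse unary chains, so it is not a full CNF conversion.

```python
from typing import List, Tuple, Union

# Assumed representation: a leaf is a str, a node is (label, child, child, ...).
Tree = Union[str, Tuple]


def _label(tree: Tree) -> str:
    """Return the label of a node, or the token itself for a leaf."""
    return tree if isinstance(tree, str) else tree[0]


def _factor(parent: str, label: str, children: List[Tree]) -> Tree:
    """Right-factor already-binarized children under a node called `label`."""
    if len(children) <= 2:
        return (label, *children)
    rest = children[1:]
    # Artificial node named after the original parent and the remaining children.
    rest_label = parent + "|<" + "-".join(_label(c) for c in rest) + ">"
    return (label, children[0], _factor(parent, rest_label, rest))


def binarize(tree: Tree) -> Tree:
    """Return a copy of `tree` in which no node has more than two children."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return _factor(label, label, [binarize(c) for c in children])


np_tree = ("NP", ("DT", "the"), ("JJ", "big"), ("JJ", "red"), ("NN", "ball"))
binary_tree = binarize(np_tree)
```

Note that this operates on a single parse tree. Converting a whole grammar to CNF also has to deal with things like unit and empty productions, which is a different problem.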

I think `\s` represents more than just a space. To capture a single space, it would be `[^\S\t\n\r\f\v]` or simply space ` ` =) We could simply use `(?:\.(?:...
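A quick way to see the difference, using only Python's standard `re` module. The sample string and variable names are made up for illustration. One caveat worth hedging: on `str` patterns `\s` is Unicode-aware by default, so the negated class also lets through other Unicode whitespace such as a no-break space unless `re.ASCII` is set.

```python
import re

# Made-up sample: a space, a tab, a newline and a no-break space (U+00A0).
text = "a b\tc\nd\u00a0e"

# \s matches every whitespace character in the sample.
all_whitespace = re.findall(r"\s", text)

# "Whitespace that is not tab, newline, carriage return, form feed or vertical tab".
space_like = re.findall(r"[^\S\t\n\r\f\v]", text)

# With re.ASCII, \S and \s only consider ASCII whitespace, leaving just the space.
ascii_space = re.findall(r"[^\S\t\n\r\f\v]", text, flags=re.ASCII)

# A literal space is the simplest option when only U+0020 is wanted.
literal_space = re.findall(r" ", text)
```

Here `all_whitespace` has four matches, `space_like` keeps the space and the no-break space, and both `ascii_space` and `literal_space` keep only the plain space.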

If we make the following changes to `word_tokenize` at https://github.com/nltk/nltk/blob/develop/nltk/tokenize/__init__.py, it would achieve behavior similar to that of Stanford CoreNLP:

```python
import re
from nltk.tokenize.treebank import TreebankWordTokenizer

# Standard word tokenizer....
```

This issue is about the opening quotes, and the clitic fix for that can be done easily, which will make `word_tokenize` behave like Stanford's. IMHO, I think it's a...