alvations
P/S: I'm thinking about how to put this feature in. It's not hard, but I have to think a little about the user's usage logic =) I'm a little busy...
Sorry about that, I think it was caused by a mistake in a previous version, which was patched in #36. Could you try the latest version with `pip install -U sacremoses`?...
Hmmm, seems like the apostrophe for French isn't working as expected though: From the original Moses: ``` $ echo "L'amitié nous a fait forts d'esprit" | ~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l fr | ~/mosesdecoder/scripts/tokenizer/detokenizer.perl...
@mlforcada Sorry for the delay! The latest version should now have the French apostrophes patched. ```python from sacremoses.tokenize import MosesTokenizer, MosesDetokenizer mt = MosesTokenizer(lang='fr') md = MosesDetokenizer(lang='fr') md.detokenize(mt.tokenize("L'amitié nous a...
Regarding the Spanish escaping of the ampersand, I'm not able to reproduce it; it shouldn't be a problem with version `>=0.0.13`. The latest French patch would be in `>=0.0.14`. Which version of...
Thanks @mlforcada! Let me see how I could convert the rules above =)
Actually, that `sent_tokenize` is a can of worms, hence the reluctance to complete the code =) I'm a little packed these couple of days, but let me see if I...
This is actually quite trivial in Python and on command line, not sure whether adding a lowercase script would be beneficial. In Python: ```python s = "abc" s.lower() ``` On...
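A minimal sketch of the Python route applied line by line to a text stream, which is roughly what a lowercase script would do. The helper name `lowercase_stream` and the sample text are hypothetical and chosen only for illustration; this is not part of `sacremoses`. It relies only on the built-in `str.lower()`, which is Unicode-aware, so non-ASCII text such as Greek is lowercased as well.

```python
import io


def lowercase_stream(lines):
    """Yield each input line lowercased with the built-in, Unicode-aware str.lower()."""
    for line in lines:
        yield line.lower()


# Hypothetical sample input; a real script would iterate over sys.stdin instead.
sample = io.StringIO("Έζησε στη Μόσχα.\nAbc DEF\n")
result = list(lowercase_stream(sample))
print("".join(result), end="")
```

Swapping `sample` for `sys.stdin` would turn this into a pipe-friendly filter.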
Interesting. Hmmm, so is that feature in the `sacremoses` CLI worth implementing? @noe's pointer to https://stackoverflow.com/questions/13381746/tr-upper-lower-with-cyrillic-text/13383175#13383175 is right, on Ubuntu ```shell $ echo "Έζησε στη Μόσχα." | tr [:upper:] [:lower:]...
@mayhewsw @noe @mjpost No promises, but lowercasing is low-hanging fruit. Let's see how far I get by the end of this week's sprint =) ---- @mjpost good...