Mikel L. Forcada
Thanks a million, ato!
Completely agree. lttoolbox finite-state processors can deal with tags anywhere, as far as I know. Which modules in particular create trouble? Transfer, I assume, but I'm not sure.
I can't get the command-line detokenizer to work properly. I have tried this: ``` $ echo "L'amitié nous a fait forts d'esprit" | sacremoses tokenize -l fr | sacremoses detokenize...
Wow, that was fast. Yes, apostrophes don't look good when detokenized (they are separated with spaces).
I'll be grateful if you let me know of any progress.
(1) Have you had a chance to solve the problem with spaces when detokenizing, @alvations ? (2) Also, apparently, there is a way to specify the language when creating the...
Thanks a million, @alvations! I updated, and Sacremoses now reports version 0.0.19. Detokenization for French works like a breeze now! Cheers! Catalan rules for apostrophes and hyphens with pronouns,...
Aggh, the last one is wrong. It should be
```
VERB-(me|te|se|li|)'(m|t|s|l|ns|ls) → VERB -(me|te|se|li) '(m|t|s|l|ns|ls)
```
Sorry about that!
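For anyone who wants to try the corrected rule quickly, here is a rough sketch with Python's standard `re` module. It is not how Sacremoses implements anything: the function name `split_clitics` is made up for illustration, "VERB" is approximated as any run of word characters (so it will also fire on non-verbs), and the empty alternative in the left-hand group is not modelled.

```python
import re

# Assumption: "VERB" is approximated by \w+; a real tokenizer would need
# a lexicon or tagger to know the token is actually a verb.
CLITIC_RULE = re.compile(r"(\w+)-(me|te|se|li)'(m|t|s|l|ns|ls)\b")


def split_clitics(text: str) -> str:
    """Split VERB-pron'pron into 'VERB -pron 'pron' (hypothetical helper)."""
    # \1 = verb, \2 = hyphenated pronoun, \3 = apostrophized pronoun
    return CLITIC_RULE.sub(r"\1 -\2 '\3", text)


if __name__ == "__main__":
    print(split_clitics("dóna-me'l"))
    print(split_clitics("porta-te'ls ara"))
```

The trailing `\b` makes the regex backtrack from `l` to `ls` when another letter follows, so the two-letter forms are not cut short.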