Panos Kanavos
Hi @howardyclo, if I understand your needs correctly, you can still do that with hooks, as @jsenellart suggested -- instead of training an OpenNMT model for this you just...
Hello, unfortunately this change completely breaks the `extractAll()` method.
Sure, here it is:
```
zipper::Unzipper unzipper("my_zipfile","my_password");
if (!unzipper.extract("destination_dir")) //
```
@Lecrapouille, thanks for testing. I'm using zip files created with external tools (7zip), so I can extract them just fine with those tools. I also tried without a password...
I'm also trying to make sentencepiece work with `case_markup`. I got it working somehow by adding the Tokenizer's case placeholders as `user_defined_symbols` in sentencepiece. I still get a few `<unk>`s...
OK, this actually seems to work. I lowercased, created a sentencepiece model and vocab with `onmt-build-vocab` with the case placeholders as `user_defined_symbols` and trained with the raw training files a...
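For anyone trying to reproduce this, here is a minimal sketch of how the SentencePiece training flags could be assembled with the case placeholders reserved as `user_defined_symbols`. It is not the exact `onmt-build-vocab` invocation used above: the helper `build_spm_train_args`, the file name, the model prefix, and the vocabulary size are all hypothetical, and the placeholder strings are what I understand the Tokenizer to emit with `case_markup`, so check them against your own tokenized output.

```python
# Case placeholders as the OpenNMT Tokenizer is understood to emit them with
# case_markup enabled (assumption: verify against your tokenized data).
CASE_PLACEHOLDERS = [
    "｟mrk_case_modifier_C｠",
    "｟mrk_begin_case_region_U｠",
    "｟mrk_end_case_region_U｠",
]


def build_spm_train_args(input_path, model_prefix, vocab_size):
    """Hypothetical helper: build a SentencePiece training flag string."""
    flags = {
        "input": input_path,              # lowercased training text
        "model_prefix": model_prefix,     # output name for the .model/.vocab files
        "vocab_size": str(vocab_size),
        # Reserve the placeholders so they are never split into pieces.
        "user_defined_symbols": ",".join(CASE_PLACEHOLDERS),
    }
    return " ".join(f"--{name}={value}" for name, value in flags.items())


# Illustrative values only; none of these come from the setup described above.
args = build_spm_train_args("train.lower.txt", "spm_case", 32000)
print(args)
```

The resulting string can be passed to `spm_train` on the command line or to `sentencepiece.SentencePieceTrainer.train(args)` if the Python package is installed.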
> Actually this is not possible with `onmt-build-vocab` from OpenNMT-tf. It always applies a `none` tokenization before training the SentencePiece model. Looks like we need to add some errors when...
That's fantastic, thanks!
I thought I should leave some feedback on this:
- I get lots of `unk`s, all of them after punctuation marks (parentheses, quotes, etc.). I inspected a bit and I...
Yes, the vocab is built with `onmt-build-vocab`. I just noticed the related PR in the OpenNMT-tf repo, thanks!