Jaume Zaragoza
Oh, I see. So I guess the issue can be closed, or just kept in the backlog in case we need to explore things that reduce the gap for a certain...
Maybe we can check how much the number of sentences differs between the marian log message "Shuffled 24.432.432 sentences" and the corpus files, because they might already be omitted at...
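A rough sketch of that check, in case it helps. The regex only mirrors the message as quoted above, so the real log wording may differ; `shuffled_count`, `count_lines` and `omitted` are hypothetical helper names, not part of marian.

```python
import re

# Assumption: the log line looks like the message quoted above,
# with "." or "," used as thousands separators.
SHUFFLED_RE = re.compile(r"Shuffled\s+([\d.,]+)\s+sentences")


def shuffled_count(log_text):
    """Return the last sentence count reported in the log text, or None."""
    matches = SHUFFLED_RE.findall(log_text)
    if not matches:
        return None
    # Drop the thousands separators before converting to int.
    return int(re.sub(r"[.,]", "", matches[-1]))


def count_lines(path):
    """Count the lines (one sentence per line) in a corpus file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def omitted(corpus_lines, log_count):
    """Number of sentences present in the corpus but not reported in the log."""
    return corpus_lines - log_count
```

Usage would be something like `omitted(count_lines("corpus.tgt"), shuffled_count(open("train.log").read()))`, where both file names are placeholders.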
I'm running experiments to see if single-side (target) deduplication improves quality. I could run this pair to see if it helps.
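For reference, a minimal sketch of what target-side deduplication could look like. This is one possible reading (keep the first pair for each distinct target sentence), not necessarily what the experiments do; `dedup_target_side` is a hypothetical name.

```python
def dedup_target_side(pairs):
    """Keep only the first (source, target) pair for each distinct target.

    Assumption: targets are compared after stripping surrounding whitespace;
    no other normalization (casing, punctuation) is applied.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        key = tgt.strip()
        if key in seen:
            continue  # this target sentence was already kept once
        seen.add(key)
        kept.append((src, tgt))
    return kept
```

Holding every distinct target in a set is fine for small corpora; for very large ones, hashing each target first would reduce memory use.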
Well, for me the value in diceware is that there are a lot of lists for other languages. The Catalan list is pretty decent. I also noticed another thing when you...