Linxiao ZENG
Get the random sentence for the next sentence prediction task from `random_file` rather than from the original file that is being iterated.
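A minimal sketch of the idea, assuming a plain-text corpus with one sentence per line. The class `SentencePairCorpus` and its methods are hypothetical names for illustration, not the project's actual code. The point is that random draws go through a second handle (`random_file`), so the position of the handle used for sequential iteration is never disturbed.

```python
import random


class SentencePairCorpus:
    """Hypothetical reader: one handle for iteration, one for random draws."""

    def __init__(self, path, seed=None):
        self.rng = random.Random(seed)
        # Count lines once so a random line index can be drawn later.
        with open(path, encoding="utf-8") as f:
            self.num_lines = sum(1 for _ in f)
        if self.num_lines == 0:
            raise ValueError("corpus file is empty")
        # Two independent handles over the same file.
        self.file = open(path, encoding="utf-8")         # sequential iteration
        self.random_file = open(path, encoding="utf-8")  # random sampling only

    def next_line(self):
        """Return the next sentence in order, wrapping around at end of file."""
        line = self.file.readline()
        if not line:
            self.file.seek(0)
            line = self.file.readline()
        return line.rstrip("\n")

    def random_line(self):
        """Return a random sentence without touching the iteration handle."""
        self.random_file.seek(0)
        # Skip a random number of lines, then read the one that follows.
        for _ in range(self.rng.randrange(self.num_lines)):
            self.random_file.readline()
        return self.random_file.readline().rstrip("\n")

    def close(self):
        self.file.close()
        self.random_file.close()
```

If `random_line` read from `self.file` instead, every random draw would move the iteration cursor and sentences would be skipped or repeated during the sequential pass.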
When using `case_markup` in `space`/`none` mode, unexpected behavior happens:

```python
>>> pyonmttok.Tokenizer("none", case_markup=True).tokenize("你好世界,这是一个Test。")
... (['⦅mrk_case_modifier_C⦆', '你好世界,这是一个test。'], None)
>>> pyonmttok.Tokenizer("none", case_markup=True).detokenize(['⦅mrk_case_modifier_C⦆', '你好世界,这是一个test。'])
... '你好世界,这是一个test。'
```

As you can see, `.detokenize` can...
Add BERT into OpenNMT-py.
Hello there,

Thanks for your efforts in open-sourcing the code; it's vital for those of us trying to reproduce the result presented in the paper.

### Problem

But I've come across a...
Currently, the following error might arise when a trained EEND-vector model is used for inference on an audio recording with only a single speaker. ``` ValueError: Found array with...