Pengzhi Gao

13 comments by Pengzhi Gao

Sorry for the late reply. We will take a look at this issue.

Sorry for the late reply. We will take a look at it.

This task includes the `ELMoEmbedder`, adapted from `allennlp`, and the corresponding `ELMoTokenizer`, adapted from `allennlp` and [MosesTokenizer](https://github.com/alvations/sacremoses).

For the tokenizer part, see: https://github.com/allenai/allennlp/issues/1933
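
For reference, a minimal sketch of how the upstream components are used on their own (this is the `sacremoses`/`allennlp` API that the adapted classes wrap, not this project's interface):

```python
# Sketch of the upstream APIs only: tokenize with MosesTokenizer,
# then embed with allennlp's ElmoEmbedder.
from sacremoses import MosesTokenizer
from allennlp.commands.elmo import ElmoEmbedder

mt = MosesTokenizer(lang="en")
tokens = mt.tokenize("ELMo produces contextual word representations.")

elmo = ElmoEmbedder()  # downloads the default pre-trained ELMo weights
vectors = elmo.embed_sentence(tokens)
print(vectors.shape)  # (3, num_tokens, 1024): one vector per token per biLM layer
```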

See Zecong's comments in #208

Yes, I think we should support this feature. Since pre-trained tokenizers already take care of the corresponding vocabulary files and the special tokens, it is unnecessary to require a vocabulary file...
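
To illustrate the point with a different library (the Hugging Face `transformers` API, used here only as an analogy, not this project's interface): a pre-trained tokenizer ships with its own vocabulary and inserts the special tokens itself, so the data module would not need a separate vocab file.

```python
# Analogy with the transformers library (not this project's API):
# the pre-trained tokenizer bundles its vocabulary and adds special tokens itself.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # vocab fetched by name
ids = tokenizer.encode("no extra vocab file is needed")
print(tokenizer.convert_ids_to_tokens(ids))  # ['[CLS]', ..., '[SEP]'] added automatically
```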

We deleted the `sentencepiece` vocab file because the `sentencepiece` model file is purely self-contained, and the vocab file is never used in the tokenizer. To the best of my knowledge, the vocab file...
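
As a quick illustration of why the model file alone is enough, here is a sketch with the `sentencepiece` Python API (the file name is a placeholder):

```python
# A SentencePiece model file is self-contained: the full vocabulary can be
# recovered from it, so a separate vocab file is redundant.
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.Load("spiece.model")  # placeholder path to the trained .model file

print(sp.GetPieceSize())                 # vocabulary size, read from the model file
print(sp.IdToPiece(0))                   # any piece can be looked up by id
print(sp.EncodeAsPieces("hello world"))  # tokenization needs only the model file
```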

Could you write down how you integrate `tokenizer` with `pairedtextdata`? There is another related issue, #256. I think we should provide an interface to use `tokenizer` instead of `vocab`. Do...
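
For concreteness, a purely hypothetical sketch of what such an interface might look like; the `tokenizer` hparam below is a placeholder, not the current API:

```python
# Hypothetical sketch only: let the data module accept a tokenizer spec
# instead of a vocab file. The "tokenizer" hparam is not the current API.
hparams = {
    "source_dataset": {
        "files": "src.txt",
        "tokenizer": "bert-base-uncased",  # hypothetical replacement for "vocab_file"
    },
    "target_dataset": {
        "files": "tgt.txt",
        "tokenizer": "bert-base-uncased",
    },
}
# data = PairedTextData(hparams=hparams)  # the module would derive its vocab from the tokenizer
```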

Please merge `master` into your branch to pass the `codecov` test.