YouTokenToMe
Unsupervised text tokenizer focused on computational efficiency
Hi, in the original BPE paper, as well as in the BPE-dropout paper, the authors apply word-based tokenization (namely, the Moses tokenizer, as well as some others) before the...
Hi! I'm trying to install this module with the following command: ``` pip install youtokentome ``` ...but I got an error saying that the specified path does not exist. ```...
Hello, I took a look at the [benchmarks page](https://github.com/VKCOM/YouTokenToMe/blob/master/benchmark.md). I wanted to know how YouTokenToMe's speed compares to [subword-nmt](https://github.com/rsennrich/subword-nmt), and whether there is a reason why it was left out of the...
I want to use this yttm model. However, I want to add a [MASK] token to the vocabulary. In this case, how can I predefine special tokens?
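As far as I can tell, the training API only exposes the four built-in special tokens (pad, unk, bos, eos) and has no parameter for user-defined tokens such as [MASK]. Below is a minimal sketch of a workaround that reserves an id outside the trained vocabulary and handles it in user code; the `MASK_ID` convention and file names are assumptions, not part of the library.

```python
import youtokentome as yttm

# Training only exposes the four built-in special tokens (pad/unk/bos/eos);
# there is no parameter for user-defined tokens such as [MASK].
yttm.BPE.train(
    data="train.txt", model="model.bpe", vocab_size=24000,
    pad_id=0, unk_id=1, bos_id=2, eos_id=3,
)

bpe = yttm.BPE("model.bpe")

# Hypothetical workaround: reserve an id just past the trained vocabulary
# for [MASK] and handle it in your own pre/post-processing.
MASK_ID = bpe.vocab_size()

ids = bpe.encode(["hello world"], output_type=yttm.OutputType.ID)[0]
ids[0] = MASK_ID  # e.g. mask the first position
```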
I want to use YouTokenToMe for fast id encoding, but I need to do it with embeddings taken from here: https://nlp.h-its.org/bpemb/ Obviously, there is a pre-defined vocab there. Right...
In `YouTokenToMe`, BPE-dropout always produces the same segmentation for the same input. That contradicts the idea described in the paper: ``` During segmentation, at each merge step some merges are randomly...
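A minimal way to check the reported behavior, assuming an already-trained model file named `model.bpe`: encode the same sentence repeatedly with a non-zero `dropout_prob` and count the distinct segmentations.

```python
import youtokentome as yttm

bpe = yttm.BPE("model.bpe")  # assumes an already-trained model file

# If dropout were sampled independently on every call, repeated encodings
# of the same input should differ from call to call.
variants = set()
for _ in range(10):
    subwords = bpe.encode(["unbelievable tokenization"],
                          output_type=yttm.OutputType.SUBWORD,
                          dropout_prob=0.1)[0]
    variants.add(tuple(subwords))

print(len(variants))  # 1 reproduces the issue; > 1 means dropout is stochastic
```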
I would like to know if there is any way to control the splitting of a word into tokens, besides setting BPE dropout? For example: "Best" can be tokenized...
How can I train with multiple corpus files? Is it possible without merging the files together?
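The training call appears to accept only a single `data` path, so one workaround, sketched below under the assumption that plain concatenation is acceptable for your corpora, is to stream all files into one temporary file first. The file names here are placeholders.

```python
import tempfile
import youtokentome as yttm

corpus_files = ["corpus_a.txt", "corpus_b.txt", "corpus_c.txt"]  # hypothetical paths

# BPE.train takes one input path, so concatenate the corpora into a
# temporary file and point training at that file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as merged:
    for path in corpus_files:
        with open(path) as f:
            for line in f:
                merged.write(line if line.endswith("\n") else line + "\n")
    merged_path = merged.name

yttm.BPE.train(data=merged_path, model="model.bpe", vocab_size=24000)
```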
Hi, I am very confused about how the vocab function works. It seems vocab only reads a model file, which only contains the token ids without the mapping (the id...
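If it helps, my understanding is that the mapping is implicit: `vocab()` returns the subwords in id order, so the list index is the token id, and `id_to_subword` / `subword_to_id` expose the same mapping explicitly. A small sketch, assuming a trained `model.bpe`:

```python
import youtokentome as yttm

bpe = yttm.BPE("model.bpe")  # assumes a trained model

# vocab() returns the subwords in id order, so the list index is the token id.
subwords = bpe.vocab()
id_to_token = dict(enumerate(subwords))
token_to_id = {tok: i for i, tok in enumerate(subwords)}

# The same mapping via explicit lookups:
print(bpe.id_to_subword(5), bpe.subword_to_id(bpe.id_to_subword(5)))
```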
Any idea why this would happen?
```bash
Training parameters
input: _training_aux
model: vocab.bpe
vocab_size: 24000
n_threads: 8
character_coverage: 1
pad: 0
unk: 1
bos: 2
eos: 3

reading file...
learning...
```
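For reference, a sketch of the equivalent Python training call for the parameters shown in the log, assuming the log's `character_coverage` corresponds to the `coverage` argument; the input and model names are taken from the log itself.

```python
import youtokentome as yttm

# Roughly the Python equivalent of the logged training parameters.
yttm.BPE.train(
    data="_training_aux",
    model="vocab.bpe",
    vocab_size=24000,
    n_threads=8,
    coverage=1.0,   # logged as character_coverage: 1
    pad_id=0,
    unk_id=1,
    bos_id=2,
    eos_id=3,
)
```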