
Guidance on how to implement subword sampling at train time

Open sooheon opened this issue 7 years ago • 2 comments

I guess I should be re-sampling tokenizations on the train data with SP before each epoch, but it would be nice to see a canonical implementation of this in $FRAMEWORK.

sooheon avatar Jun 14 '18 09:06 sooheon

Will do.

taku910 avatar Jun 16 '18 08:06 taku910

Any update on this?

diegoantognini avatar Sep 28 '18 08:09 diegoantognini
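
For readers landing on this thread, the per-epoch re-sampling described in the original question can be sketched roughly as follows. This is a framework-agnostic illustration under stated assumptions, not a canonical implementation from the SentencePiece authors. The helper names `toy_sampling_encoder` and `resampled_epochs` are hypothetical, and the toy encoder is only a stand-in so the sketch runs without a trained model. With a real model, the encoder would presumably wrap the Python wrapper's sampling call, as shown in the comment (the model path there is a placeholder).

```python
import random
from typing import Callable, Iterator, List, Sequence

Encoder = Callable[[str], List[str]]

# With a trained SentencePiece model, the encoder could look like this
# (assumption: recent Python wrapper; "m.model" is a placeholder path):
#
#   import sentencepiece as spm
#   sp = spm.SentencePieceProcessor(model_file="m.model")
#   encode = lambda s: sp.encode(s, out_type=str, enable_sampling=True,
#                                alpha=0.1, nbest_size=-1)


def toy_sampling_encoder(rng: random.Random, split_prob: float = 0.3) -> Encoder:
    """Hypothetical stand-in for a sampling tokenizer.

    Splits each whitespace-separated word at random character boundaries,
    so repeated calls on the same text can return different segmentations.
    """
    def encode(text: str) -> List[str]:
        pieces: List[str] = []
        for word in text.split():
            current = "\u2581" + word[0]  # mark word starts, SentencePiece style
            for ch in word[1:]:
                if rng.random() < split_prob:
                    pieces.append(current)
                    current = ch
                else:
                    current += ch
            pieces.append(current)
        return pieces
    return encode


def resampled_epochs(
    corpus: Sequence[str], encode: Encoder, num_epochs: int
) -> Iterator[List[List[str]]]:
    """Yield one freshly tokenized copy of the corpus per epoch.

    The raw text is kept and the encoder is called again every epoch,
    so the model sees a new sampled segmentation of each sentence.
    """
    for _ in range(num_epochs):
        yield [encode(line) for line in corpus]


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = ["hello world", "subword regularization"]
    for epoch, batch in enumerate(resampled_epochs(corpus, toy_sampling_encoder(rng), 3)):
        print(epoch, batch)
```

The key design choice is to store raw text rather than pre-tokenized IDs, and to tokenize lazily inside the epoch loop (or inside a dataset's item getter); shuffling and batching are left to whatever data loader the training framework provides.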