Guidance on how to implement subword sampling at train time

Open sooheon opened this issue 7 years ago • 2 comments

I assume I should re-sample the tokenization of the training data with SentencePiece (SP) before each epoch, but it would be nice to see a canonical implementation of this in $FRAMEWORK.
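For illustration only, here is a minimal framework-agnostic sketch of the idea, not a canonical or official implementation. Instead of re-tokenizing the whole corpus before each epoch, it draws a fresh segmentation each time a sentence is placed in a batch, which has the same effect. The helper names `make_sp_sampler` and `epoch_batches`, the model path, and the `alpha` value are assumptions made for this example. The sampler relies on the keyword form of `SentencePieceProcessor.encode` from recent releases of the Python package; older releases expose `SampleEncodeAsIds(text, nbest_size, alpha)` instead.

```python
import random
from typing import Callable, Iterator, List, Sequence


def make_sp_sampler(model_path: str, alpha: float = 0.1,
                    nbest_size: int = -1) -> Callable[[str], List[int]]:
    """Hypothetical helper: returns a function that samples one segmentation per call."""
    # Imported lazily so the batching code below works without the package installed.
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=model_path)

    def sample(text: str) -> List[int]:
        # enable_sampling=True draws a segmentation at random instead of the best one;
        # nbest_size=-1 samples from all hypotheses (unigram models).
        return sp.encode(text, out_type=int, enable_sampling=True,
                         alpha=alpha, nbest_size=nbest_size)

    return sample


def epoch_batches(sentences: Sequence[str],
                  sample_fn: Callable[[str], List[int]],
                  batch_size: int,
                  rng: random.Random) -> Iterator[List[List[int]]]:
    """Yield batches of token-id lists for one epoch.

    sample_fn is called again for every sentence on every call to this
    function, so each epoch sees newly sampled tokenizations.
    """
    order = list(range(len(sentences)))
    rng.shuffle(order)  # new sentence order each epoch
    for start in range(0, len(order), batch_size):
        yield [sample_fn(sentences[i]) for i in order[start:start + batch_size]]
```

In a training loop this would be used roughly as `sample = make_sp_sampler("m.model")` followed by `for epoch in range(n): for batch in epoch_batches(train_sentences, sample, 32, rng): ...`, with padding and tensor conversion left to whatever framework is in use. Raw sentences are kept in memory and tokenized lazily, so nothing has to be cached between epochs.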

Jun 14 '18 09:06 sooheon

Will do.

Jun 16 '18 08:06 taku910

Any update on this?

Sep 28 '18 08:09 diegoantognini