Mark Franey

Results 3 issues of Mark Franey

It appears that the pipeline only supports training a joint BPE model, but it is sometimes better to have separate source and target BPE vocabularies.

enhancement
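
To make the joint versus separate distinction concrete, here is a minimal toy sketch of BPE merge learning in plain Python. It is not the pipeline's code. The `learn_bpe` function, the `</w>` end-of-word marker, and the tiny corpora are all hypothetical and used for illustration only.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn up to num_merges BPE merges from a list of sentences (toy version)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for line in corpus for word in line.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair merged into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

# Hypothetical toy data, not taken from any real training set.
source = ["the cat sat", "the hat"]
target = ["die katze sass", "der hut"]

joint_merges = learn_bpe(source + target, 5)   # one shared model
source_merges = learn_bpe(source, 5)           # separate source model
target_merges = learn_bpe(target, 5)           # separate target model
```

With separate models, each side spends its whole merge budget on its own language, whereas a joint model shares one budget across both.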

Epsilon sampling is a compelling alternative/complement to top_p and top_k sampling and would make a good addition to CTranslate2: https://arxiv.org/abs/2305.09860

enhancement
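
For readers unfamiliar with the idea, the following is a minimal sketch of epsilon sampling over a single next-token distribution: tokens whose probability falls below a threshold are dropped and the rest are sampled in proportion to their probability. This is not CTranslate2 code or its API. The function name `epsilon_sample`, the default threshold, and the argmax fallback when no token passes the threshold are assumptions made for illustration.

```python
import random

def epsilon_sample(probs, epsilon=0.02, rng=None):
    """Sample a token index after dropping tokens with probability below epsilon."""
    rng = rng or random.Random()
    # Keep only tokens whose probability reaches the threshold.
    kept = [(i, p) for i, p in enumerate(probs) if p >= epsilon]
    if not kept:
        # Assumption: if nothing survives, fall back to the most likely token.
        return max(range(len(probs)), key=lambda i: probs[i])
    indices, weights = zip(*kept)
    # random.choices renormalizes the weights internally.
    return rng.choices(indices, weights=weights, k=1)[0]

# Hypothetical distribution: two plausible tokens and a low-probability tail.
probs = [0.5, 0.485, 0.01, 0.005]
samples = [epsilon_sample(probs, epsilon=0.02, rng=random.Random(seed)) for seed in range(200)]
```

Unlike top_k, the number of surviving tokens varies with the shape of the distribution, and unlike top_p, the cut is made on each token's own probability rather than on cumulative mass.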

The wiki recommends a batch size of 128 for 'stable training'. It would be helpful to have the option to accumulate gradients so that bicleaner-ai training with larger...

enhancement
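
The gradient accumulation request rests on a simple identity: averaging the gradients of equally sized micro-batches gives the same gradient as one large batch. The sketch below checks this on a toy one-parameter regression in plain Python. It is not bicleaner-ai code. The function names, the model, and the data are hypothetical, and the identity only holds exactly when all micro-batches have the same size.

```python
def mse_gradient(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Average per-micro-batch gradients instead of processing one large batch."""
    total, steps = 0.0, 0
    for start in range(0, len(xs), micro_batch_size):
        end = start + micro_batch_size
        total += mse_gradient(w, xs[start:end], ys[start:end])
        steps += 1
    # Dividing by the number of micro-batches matches the large-batch mean
    # when every micro-batch has the same number of examples.
    return total / steps

# Hypothetical data: 128 examples, processed as 8 micro-batches of 16.
xs = [float(i) for i in range(1, 129)]
ys = [3.0 * x + 1.0 for x in xs]
full = mse_gradient(0.5, xs, ys)
accumulated = accumulated_gradient(0.5, xs, ys, micro_batch_size=16)
```

In a real training loop the weights would be updated once per accumulated gradient, so memory use follows the micro-batch size while the update corresponds to the effective batch of 128.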