open_lm
Support user-specified token pre-processing functions
Often we have special control tokens that need to be handled when creating the inputs and targets. To allow maximum flexibility, users should be able to provide their own sample_chunk functions or similar.
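To make the request concrete, here is a rough sketch of what a pluggable hook could look like. It is not open_lm's current API: the signature of `default_sample_chunk`, the `chunk_fn` parameter, `build_batch`, `make_masking_chunk_fn`, and the `IGNORE_INDEX = -100` convention are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical hook type: maps one chunk of token ids to (inputs, targets).
ChunkFn = Callable[[Sequence[int]], Tuple[List[int], List[int]]]

# Assumed convention: target positions with this value are skipped by the loss.
IGNORE_INDEX = -100


def default_sample_chunk(chunk: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Plain next-token setup: targets are the inputs shifted by one."""
    tokens = list(chunk)
    return tokens[:-1], tokens[1:]


def make_masking_chunk_fn(control_tokens: Iterable[int]) -> ChunkFn:
    """Build a user-defined hook that hides control tokens from the loss."""
    special = set(control_tokens)

    def chunk_fn(chunk: Sequence[int]) -> Tuple[List[int], List[int]]:
        inputs, targets = default_sample_chunk(chunk)
        # Replace control tokens in the targets so they are never predicted.
        targets = [IGNORE_INDEX if t in special else t for t in targets]
        return inputs, targets

    return chunk_fn


def build_batch(
    chunks: Iterable[Sequence[int]],
    chunk_fn: ChunkFn = default_sample_chunk,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Apply the (possibly user-supplied) hook to every chunk in a batch."""
    inputs, targets = [], []
    for chunk in chunks:
        x, y = chunk_fn(chunk)
        inputs.append(x)
        targets.append(y)
    return inputs, targets
```

With something like this, the default behavior stays unchanged when no function is passed, while a call such as `build_batch(chunks, make_masking_chunk_fn([3]))` would mask token id 3 in the targets. How the user-supplied function gets selected (config flag, import path, etc.) is left open.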