chinese-char-lm
training restore
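as currently implemented, interrupting training mid-epoch and then restoring it effectively over-weights the earlier samples in the training set, presumably because the restored run reads the training set from the beginning again.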
solution 1:
- shard the training set into smaller files
- shuffle the filename queue
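a minimal sketch of solution 1 in plain Python, not the project's actual input pipeline. The helper names `shard_file` and `shuffled_epoch`, the shard naming scheme, and the per-epoch seed are all assumptions made for illustration. With small shards in a shuffled order, a restart re-reads at most a partial shard rather than the head of one large file.

```python
import random
from pathlib import Path


def shard_file(src, out_dir, lines_per_shard):
    """Split src into files of at most lines_per_shard lines; return the shard paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shards = []
    out = None
    with open(src, encoding="utf-8") as f:
        for i, line in enumerate(f):
            # Start a new shard every lines_per_shard lines.
            if i % lines_per_shard == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"shard-{len(shards):05d}.txt"  # hypothetical naming
                shards.append(path)
                out = open(path, "w", encoding="utf-8")
            out.write(line)
    if out is not None:
        out.close()
    return shards


def shuffled_epoch(shards, seed):
    """Return the shard paths in a shuffled order; the input list is left unchanged."""
    order = list(shards)
    random.Random(seed).shuffle(order)
    return order
```

one possible usage is to call `shuffled_epoch(shards, seed=epoch)` at the start of every epoch, so each epoch (and each restart) visits the shards in a different but reproducible order.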
solution 2:
- record the current line number in the monolithic training set file and save it with each checkpoint
- skip to that line when restoring the training
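a minimal sketch of solution 2, again in plain Python and not the project's actual checkpointing code. The function names, the JSON position file, and its `line_no` key are hypothetical. In practice the position would need to be saved at the same moment as the model checkpoint so the two stay consistent.

```python
import itertools
import json
import os


def save_position(ckpt_path, line_no):
    """Record how many lines have been consumed, replacing the file atomically."""
    tmp = ckpt_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"line_no": line_no}, f)
    os.replace(tmp, ckpt_path)


def load_position(ckpt_path):
    """Return the saved line count, or 0 when no position file exists yet."""
    if not os.path.exists(ckpt_path):
        return 0
    with open(ckpt_path, encoding="utf-8") as f:
        return json.load(f)["line_no"]


def read_from(train_path, start_line):
    """Yield (lines_consumed, line) pairs, skipping the first start_line lines."""
    with open(train_path, encoding="utf-8") as f:
        remaining = itertools.islice(f, start_line, None)
        for i, line in enumerate(remaining, start=start_line):
            yield i + 1, line.rstrip("\n")
```

on restore, `read_from(train_path, load_position(ckpt_path))` resumes at the first unread line. Skipping is still a linear scan over the skipped lines; storing a byte offset and seeking to it would avoid that, at the cost of a slightly more fragile checkpoint.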