[Enhancement] New features about text augmentation

Open • Nisekoixmy opened this issue 1 year ago • 2 comments

Hello,

I think it would be good to have a preview for text augmentation, just like the one for image augmentation. There are also some potential augmentations for captions:

  • Randomly drop each caption chunk with a given probability. This needs a list of strings so that chunks the user does not want dropped can be excluded.
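
To make the idea concrete, here is a rough sketch in plain Python of what I have in mind. The function name `drop_caption_chunks` and its parameters (`drop_probability`, `keep_chunks`, `delimiter`) are made up for illustration and are not part of OneTrainer. It also assumes caption chunks are separated by commas.

```python
import random


def drop_caption_chunks(caption, drop_probability, keep_chunks=(), delimiter=",", rng=None):
    """Hypothetical helper: drop each chunk with the given probability.

    Chunks listed in keep_chunks (compared case-insensitively) are never dropped.
    """
    rng = rng or random
    keep = {chunk.strip().lower() for chunk in keep_chunks}
    # Split the caption into chunks and ignore empty pieces.
    chunks = [chunk.strip() for chunk in caption.split(delimiter) if chunk.strip()]
    kept = [
        chunk
        for chunk in chunks
        # A chunk survives if it is protected or if the random draw spares it.
        if chunk.lower() in keep or rng.random() >= drop_probability
    ]
    return (delimiter + " ").join(kept)


if __name__ == "__main__":
    # "1girl" is protected; every other chunk is dropped about 30% of the time.
    print(drop_caption_chunks("1girl, solo, smile, outdoors", 0.3, keep_chunks=["1girl"]))
```

A preview could then call something like this a few times on a sample caption and show the results next to the original.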

I also have a question about dataset shuffling. Does the msg dataloader shuffle the order of the training data in each epoch? I think shuffling the dataset every epoch should be the normal behavior.
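
For clarity, this is the behavior I would expect, written as a small standalone sketch. It is not how the dataloader is actually implemented. `epoch_orders` and its `seed` parameter are hypothetical names used only for this example.

```python
import random


def epoch_orders(num_samples, num_epochs, seed=0):
    """Hypothetical sketch: yield a freshly shuffled index order for each epoch."""
    for epoch in range(num_epochs):
        order = list(range(num_samples))
        # Seed per epoch so that runs are reproducible while epochs get their own permutation.
        random.Random(seed + epoch).shuffle(order)
        yield order


if __name__ == "__main__":
    for epoch, order in enumerate(epoch_orders(num_samples=8, num_epochs=3)):
        print(epoch, order)
```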

Many thanks!

Nisekoixmy, Dec 01 '23 11:12