Provide command line parameters for sample skipping and ordering

Open • tilmankamp opened this issue 7 years ago • 1 comment

Command line parameters for sample skipping allow for better bisecting of faulty samples in new corpora. Changing the ordering helps in determining the maximum batch size.
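As a rough illustration of the idea (not DeepSpeech code), skipping and reordering could be sketched as below. The function name `sample_window` and its parameters `skip`, `limit` and `reverse` are hypothetical names made up for this sketch. The assumption behind `reverse` is that putting the largest samples first would make an oversized batch fail early.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_window(
    samples: Sequence[T],
    skip: int = 0,
    limit: Optional[int] = None,
    reverse: bool = False,
) -> List[T]:
    """Return a slice of the samples, optionally in reversed order.

    All parameter names are hypothetical; they only illustrate the
    kind of command line options requested in this issue.
    """
    # Reversing is applied first, so skip/limit count from the new start.
    ordered = list(reversed(samples)) if reverse else list(samples)
    end = None if limit is None else skip + limit
    return ordered[skip:end]


# Bisecting: run once on the first half, once on the second half,
# and keep narrowing down whichever half still triggers the fault.
samples = list(range(8))
first_half = sample_window(samples, limit=4)
second_half = sample_window(samples, skip=4)
```

Repeating the halving step on the failing half narrows a faulty sample down in a logarithmic number of runs.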

tilmankamp, Jul 12 '17 16:07

Would it be an idea to add a more general option that leaves the CSVs unsorted and uses them in the order given? Then you could manually set up any order you like with just one extra option. I needed exactly this to bisect an issue where (probably) certain cuDNN versions blow up on certain samples.

I can make up a patch for this and send a pull request, but the naming for such an option doesn't seem very straightforward. Something like: --[no]sample_sort, reorder train, dev and test samples by wav_filesize (default: 'true')?
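To make the proposal concrete, here is a minimal sketch of what such a switch might do. It is not the actual DeepSpeech loader: `load_samples` is a hypothetical helper, and the `sample_sort` argument simply mirrors the flag name suggested above. It assumes a CSV with a `wav_filesize` column.

```python
import csv
import io


def load_samples(csv_text, sample_sort=True):
    """Parse sample rows from CSV text (hypothetical helper).

    With sample_sort=True the rows are ordered by wav_filesize;
    with sample_sort=False they keep the order found in the CSV.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if sample_sort:
        # Sort numerically so that "1000" does not sort before "300".
        rows.sort(key=lambda row: int(row["wav_filesize"]))
    return rows


csv_text = (
    "wav_filename,wav_filesize,transcript\n"
    "b.wav,1000,hello world\n"
    "a.wav,300,hi\n"
)
sorted_names = [r["wav_filename"] for r in load_samples(csv_text)]
as_given = [r["wav_filename"] for r in load_samples(csv_text, sample_sort=False)]
```

With sorting disabled, whatever order the CSV was prepared in is the order the samples are used in, which is what makes manual bisecting possible.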

applied-machinelearning, Jul 10 '20 15:07