kiui
No... it should take less than 10 hours to finish all epochs on V100. Are you facing this slow speed problem when training other DL models (e.g., resnet)?
I don't think the current training speed has much room for improvement. Maybe you could increase `num_rays` and train for fewer steps, but this may sacrifice performance. Also, you may...
@yediny Could you provide the command you use? If you have enough GPU memory, you could try to use `--preload 2` to see if the speed bottleneck is image loading.
This is reasonable, since there are works that use the spectrum as input, but it may need some experiments to verify.
@tylersky1993 Hi, unfortunately the current streaming mode is not performing well, since the ASR model we use is not specifically designed for real-time ASR (it requires at least 1 second...
@tylersky1993 You could just fix `audio_in_dim` to 80 and remove the `if` condition? (assuming your asr_model's name doesn't contain 'esperanto').
You'll have to train from scratch, instead of loading a pretrained model. You could delete the workspace and try again.
You can also use `deepspeech` in testing? Just specify `--asr_model deepspeech` and use the corresponding audio features.
Could you provide the full error log?
It says `argument --data_range: invalid int value: '{Pose_start}'`. What's the command line you are running?
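For context, this kind of message usually comes from `argparse` when an integer option receives a string it cannot convert, for example a template placeholder such as `{Pose_start}` that was never substituted before the command ran. Below is a minimal sketch that reproduces the behavior. The `nargs=2` and the default value are assumptions for illustration and may not match how the project actually defines `--data_range`; `build_parser` and `try_parse` are hypothetical helper names.

```python
import argparse


def build_parser():
    # Hypothetical definition of the option; the real script may differ.
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_range", type=int, nargs=2, default=[0, -1])
    return parser


def try_parse(argv):
    """Return the parsed data_range, or None if argparse rejects the input."""
    parser = build_parser()
    try:
        return parser.parse_args(argv).data_range
    except SystemExit:
        # argparse prints the usage and error message to stderr, then exits.
        return None


# A literal placeholder cannot be converted to int, so parsing fails.
bad = try_parse(["--data_range", "{Pose_start}", "100"])

# Once the placeholder is replaced with real integers, parsing succeeds.
good = try_parse(["--data_range", "0", "100"])
```

If the command is generated from a script or template, it is worth checking that every `{...}` placeholder is filled in before the command is executed.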