Automatic_Speech_Recognition
deepSpeech2 model ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D') with input shapes:
Traceback (most recent call last):
File "main/timit_train.py", line 255, in
Is data preprocessing needed?
@witkeyshare Hi, thank you. I will fix it.
I got the same error. ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D') with input shapes: [757,100,39], [41,11,1,32].
The problem is that a reshape should be applied before calling build_deepSpeech2(). The author did write the reshape line, at line 132 of deepspeech2.py:
- self.inputXrs = tf.reshape(self.inputX, [args.batch_size, args.num_feature, maxTimeSteps, 1])
So the input to build_deepSpeech2() should be self.inputXrs instead of self.Input, at line 151 of deepspeech2.py:
- output_fc = build_deepSpeech2(self.args, maxTimeSteps, self.inputXrs, self.cell_fn, self.seqLengths)
Change these two lines as shown above and the problem should be solved. @Entonytang
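To make the rank reasoning concrete, here is a minimal sketch in plain Python that mimics the shape check behind the error. `conv2d_rank_error` is a hypothetical helper written for illustration, not a TensorFlow function. It also assumes that [757,100,39] reads as [maxTimeSteps, batch_size, num_feature] and that the convolution expects an NHWC input with an HWIO filter.

```python
from math import prod


def conv2d_rank_error(input_shape, filter_shape):
    """Return an error message if the shapes cannot feed a 2-D convolution, else None.

    Hypothetical helper: it only mimics the rank and channel checks.
    Assumes input [batch, height, width, in_channels] and
    filter [filter_height, filter_width, in_channels, out_channels].
    """
    if len(input_shape) != 4:
        return "input must be rank 4 but is rank %d" % len(input_shape)
    if len(filter_shape) != 4:
        return "filter must be rank 4 but is rank %d" % len(filter_shape)
    if input_shape[3] != filter_shape[2]:
        return "input has %d channels but filter expects %d" % (
            input_shape[3], filter_shape[2])
    return None


# Shapes taken from the error message above.
input_x_shape = [757, 100, 39]
filter_shape = [41, 11, 1, 32]

# Assumed reading of the axes: 757 time steps, batch of 100, 39 features.
max_time_steps, batch_size, num_feature = input_x_shape

# Target shape used by the reshape line quoted above (adds a channel axis of 1).
reshaped = [batch_size, num_feature, max_time_steps, 1]

# A reshape is only legal when the total element count is unchanged.
assert prod(input_x_shape) == prod(reshaped)

print(conv2d_rank_error(input_x_shape, filter_shape))  # input must be rank 4 but is rank 3
print(conv2d_rank_error(reshaped, filter_shape))       # None
```

The rank-3 shape fails the check in the same way as the traceback, while the reshaped rank-4 shape passes because its trailing channel axis of 1 matches the filter's third dimension.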
However, in the deepspeech2 model the transition from the conv layer to the RNN layer still does not work; see this issue: https://github.com/zzw922cn/Automatic_Speech_Recognition/issues/49. Can anyone help with this?