rnnt-speech-recognition
Inference is giving ValueError: When input_signature is provided, all inputs to the Python function must be convertible to tensors:
Hi,
I started training the model on the entire Common Voice dataset linked on the GitHub page.
I'm using TensorFlow 2.2.0 with Python 3.6. The training command I used:
python run_rnnt.py --mode train --data_dir data_trail/preprocessed --batch_size 8 --eval_size 100
on a single 1080 Ti GPU. I got an OOM error after about 18k steps (still in epoch 0); the loss was about 116.7 and the accuracy graph in TensorBoard shows about 0.42.
Since a checkpoint is saved every 1000 steps, I tried running inference from one of them:
python transcribe_file.py --checkpoint model/checkpoint_15000_109.9516.hdf5 --i data_trail/clips/common_voice_en_19945797.wav
But I'm getting the following error:
2020-06-12 11:19:38.910255: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-12 11:19:38.929092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-06-12 11:19:38.929267: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2020-06-12 11:19:38.930665: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-12 11:19:38.931896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-12 11:19:38.932090: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-12 11:19:38.933532: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-12 11:19:38.934265: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-12 11:19:38.937197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-12 11:19:38.938298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-06-12 11:19:38.938559: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
2020-06-12 11:19:38.943923: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3398040000 Hz
2020-06-12 11:19:38.944538: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4f70350 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-12 11:19:38.944555: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-06-12 11:19:39.013200: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2bfda90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-06-12 11:19:39.013246: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-06-12 11:19:39.014703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-06-12 11:19:39.014784: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2020-06-12 11:19:39.014824: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-12 11:19:39.014860: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-12 11:19:39.014896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-12 11:19:39.014931: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-12 11:19:39.014962: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-12 11:19:39.014991: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-12 11:19:39.017390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-06-12 11:19:39.017452: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
2020-06-12 11:19:39.020326: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-12 11:19:39.020350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-06-12 11:19:39.020361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-06-12 11:19:39.022881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9907 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-06-12 11:19:41.880417: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-12 11:19:41.984154: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Traceback (most recent call last):
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2293, in _convert_inputs_to_signature
value, dtype_hint=spec.dtype)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1341, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 321, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 262, in constant
allow_broadcast=True)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 270, in _constant_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 96, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "transcribe_file.py", line 59, in <module>
main(args)
File "transcribe_file.py", line 38, in main
decoded = decoder_fn(log_melspec)[0]
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
result = self._call(*args, **kwds)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 648, in _call
*args, **kwds)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2238, in canonicalize_function_inputs
self._flat_input_signature)
File "/home/tumu/Self/Research/Work/tensorflow_work/tensorflow_2.2_env/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2299, in _convert_inputs_to_signature
format_error_message(inputs, input_signature))
ValueError: When input_signature is provided, all inputs to the Python function must be convertible to tensors:
inputs: (
tf.Tensor(
[[[ -9.891962 -10.041118 -10.170887 ... -2.5574753 -3.1098373
-2.8594036 ]
[ -4.2638397 -3.8721824 -3.818324 ... -1.2381899 -1.4718239
-1.2757974 ]
[ -3.8065548 -3.9217172 -3.9833403 ... -2.5127609 -2.4093955
-1.8164482 ]
...
[ 0.26996142 0.24929267 0.10105902 ... -1.764302 -1.2930858
-1.6539826 ]
[ -1.3995155 -1.8580544 -2.5036726 ... -1.9249303 -2.1395605
-1.7865329 ]
[ -2.521644 -2.1898646 -2.1456 ... -2.134868 -2.5040653
-2.1412349 ]]], shape=(1, 166, 240), dtype=float32),
None)
input_signature: (
TensorSpec(shape=(None, None, 240), dtype=tf.float32, name=None),
TensorSpec(shape=(), dtype=tf.int32, name=None))
Is this because hparams here is not a tensor but a JSON object?
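For reference, the traceback itself already shows the mismatch: the second argument handed to the decoder function is None, while the input_signature expects a scalar tf.int32. A toy reproduction of the same error (the function and argument names below are made up, not taken from transcribe_file.py): with an input_signature, every positional argument must be convertible to a tensor, so passing None for the scalar int32 slot raises exactly this ValueError.

import tensorflow as tf

@tf.function(input_signature=[
    tf.TensorSpec(shape=(None, None, 240), dtype=tf.float32),
    tf.TensorSpec(shape=(), dtype=tf.int32)])
def decoder_fn(log_melspec, scalar_arg):        # hypothetical names
    return log_melspec

spec = tf.zeros([1, 166, 240])
decoder_fn(spec, tf.constant(0, tf.int32))      # OK: scalar int32 matches the spec
decoder_fn(spec, None)                          # ValueError, same as the traceback above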
Have you managed to fix the issue?
Hi,
No, I didn't fix this issue. Not sure how to fix it.
@tumusudheer What did the word-error-rate look like during your training?
Mine does not look very promising.
> Hi,
> No, I didn't fix this issue. Not sure how to fix it.
Finally, the issue is fixed by following the 2nd solution in the TensorFlow repo issue:
iterator = iter(train_dataset)

# build the input_signature from the iterator's element_spec instead of
# hard-coded TensorSpecs
@tf.function(input_signature=[iterator.element_spec])
def train_step(dataset_inputs):
    def step_fn(inputs):
        ...  # forward pass, loss and metrics as before
    # ... run step_fn on dataset_inputs and return loss / metrics

for batch, inputs in enumerate(train_dataset):
    loss, metrics_results = train_step(next(iterator))
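For completeness, here is a self-contained toy version of the same pattern (the dataset and step body are placeholders, not the RNN-T code): the input_signature is taken from the iterator's element_spec, so whatever nested structure the pipeline yields is accepted without hand-writing each TensorSpec.

import tensorflow as tf

# stand-in for the preprocessed Common Voice pipeline
train_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([8, 166, 240]), tf.zeros([8, 10], tf.int32))).batch(2)
iterator = iter(train_dataset)

@tf.function(input_signature=[iterator.element_spec])
def train_step(inputs):
    features, labels = inputs
    return tf.reduce_mean(features)   # placeholder for the real loss

for _ in range(4):                     # 8 examples / batch size 2 = 4 steps
    loss = train_step(next(iterator))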
Hi, I ran into the same problem and tried to fix it according to the 2nd solution in https://github.com/tensorflow/tensorflow/issues/29911#issuecomment-505688141,
but it didn't work. The error is below:
Traceback (most recent call last):
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2293, in _convert_inputs_to_signature
    value, dtype_hint=spec.dtype)
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1341, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 321, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 262, in constant
    allow_broadcast=True)
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 270, in _constant_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "/acoustic_data1/renxiaoming/install_dir/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 96, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Attempt to convert a value (PerReplica:{ 0: <tf.Tensor: shape=(1, 310, 240), dtype=float32, numpy= array([[[-8.555949 , -8.693979 , -8.79496 , ..., -1.2911978,...
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run_rnnt.py", line 598, in