
Inference Error when using a stream_transducer model

Open · jun-danieloh opened this issue 4 years ago · 1 comment

Hi @usimarit

I noticed that there are several posts related to my issue, such as https://github.com/TensorSpeech/TensorFlowASR/issues/168, but I haven't found any clear solution yet. Do you have any workarounds for this issue?

My env:
Python: 3.6.9
TF version: 2.5.2

My command:
root@elf-ml-v100:/home/azureuser/cloudfiles/code/TensorFlowASR# python examples/demonstration/streaming_tflite_conformer.py /home/azureuser/cloudfiles/code/TensorFlowASR/examples/demonstration/wavs/1089-134691-0000.flac --tflite /home/azureuser/cloudfiles/code/TensorFlowASR/pretrained_conformer_librispeech/conformer_tflite/conformer-tflite/subword-conformer.latest.tflite

I used the pre-trained model (conformer-tflite.zip) from https://drive.google.com/drive/folders/14B89uLIPQjdHZ7JxxL3mfYgpPPsbY8Oj

Error logs:

2021-11-16 10:27:31.995245: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
PortAudioError: Error querying device -1
INFO: Created TensorFlow Lite delegate for select TF ops.
2021-11-16 10:27:34.213851: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-16 10:27:34.216043: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-11-16 10:27:34.328022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0001:00:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-11-16 10:27:34.328087: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-11-16 10:27:34.335059: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-11-16 10:27:34.335123: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-11-16 10:27:34.336396: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-11-16 10:27:34.337344: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-11-16 10:27:34.338589: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-11-16 10:27:34.339805: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-11-16 10:27:34.340009: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-11-16 10:27:34.341758: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-11-16 10:27:34.341799: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-11-16 10:27:35.251922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-16 10:27:35.251972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-11-16 10:27:35.251984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-11-16 10:27:35.254721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14645 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0001:00:00.0, compute capability: 7.0)
INFO: TfLiteFlexDelegate delegate: 212 nodes delegated out of 2679 nodes with 130 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 26 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 1 nodes delegated out of 40 nodes with 1 partitions.

2021-11-16 10:27:35.281768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0001:00:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-11-16 10:27:35.283422: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-11-16 10:27:35.284948: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0001:00:00.0 name: Tesla V100-PCIE-16GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-11-16 10:27:35.286471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-11-16 10:27:35.286502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-16 10:27:35.286516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0
2021-11-16 10:27:35.286523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N
2021-11-16 10:27:35.288079: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14645 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0001:00:00.0, compute capability: 7.0)
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/managers.py", line 749, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "examples/demonstration/streaming_tflite_conformer.py", line 113, in recognizer
    data = Q.get()
  File "<string>", line 2, in get
  File "/usr/lib/python3.6/multiprocessing/managers.py", line 753, in _callmethod
    self._connect()
  File "/usr/lib/python3.6/multiprocessing/managers.py", line 740, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "examples/demonstration/streaming_tflite_conformer.py", line 175, in <module>
    send_process.close()
AttributeError: 'Process' object has no attribute 'close'

jun-danieloh · Nov 16 '21

@jun-danieloh We are working on streaming ASR (Conformer and RNN) and will have demonstration code for it. Meanwhile, the Conformer is only used for non-streaming recognition (prediction runs after the speech has finished). So streaming_tflite_conformer.py has some issues right now, but if you want to test anyway, I think the solutions are one of the following:

  1. Load the tflite model into memory -> split the signal into chunks -> call the model on every chunk of the signal and update the model's states.
  2. Load the tflite model with tf-serving -> split the signal into chunks -> call tf-serving's RPC APIs on every chunk of the signal.

The second approach is for server-side recognition and requires you to know how to deploy a model with tf-serving. The demo python file is written using approach 1, but it uses multiprocessing for recognition: one process for the model and a second process for splitting the signal into chunks. So you can change the file to use a single process, with the model already loaded into memory, and call the model on each chunk of the signal, as in the sketch below.
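Here is a minimal single-process sketch of approach 1. It assumes the tflite model was exported with inputs (signal chunk, previous token, prediction-network states) and outputs (unicode code points of the transcript, last token, updated states), roughly matching how the conformer tflite in this repo is exported; the CHUNK_SIZE, BLANK_ID and STATE_SHAPE values are placeholders you should check against get_input_details()/get_output_details() for your own model.

```python
# Single-process chunked inference sketch, under the assumptions stated above.
import numpy as np
import soundfile as sf
import tensorflow as tf

TFLITE_PATH = "subword-conformer.latest.tflite"  # path to your tflite model
AUDIO_PATH = "1089-134691-0000.flac"
CHUNK_SIZE = 4000              # samples per chunk at 16 kHz (0.25 s); tune as needed
BLANK_ID = 0                   # assumed blank token id
STATE_SHAPE = [1, 2, 1, 320]   # assumed prediction-network state shape

interpreter = tf.lite.Interpreter(model_path=TFLITE_PATH)
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

signal, _ = sf.read(AUDIO_PATH, dtype="float32")

# Streaming state carried across chunks.
prev_token = np.array(BLANK_ID, dtype=np.int32)
states = np.zeros(STATE_SHAPE, dtype=np.float32)

transcript = ""
for start in range(0, len(signal), CHUNK_SIZE):
    chunk = signal[start:start + CHUNK_SIZE]
    # The signal input has a dynamic length, so resize before each call.
    interpreter.resize_tensor_input(input_details[0]["index"], chunk.shape)
    interpreter.allocate_tensors()
    interpreter.set_tensor(input_details[0]["index"], chunk)
    interpreter.set_tensor(input_details[1]["index"], prev_token)
    interpreter.set_tensor(input_details[2]["index"], states)
    interpreter.invoke()
    # Decode this chunk's unicode points and carry token/states forward.
    transcript += "".join(map(chr, interpreter.get_tensor(output_details[0]["index"])))
    prev_token = interpreter.get_tensor(output_details[1]["index"])
    states = interpreter.get_tensor(output_details[2]["index"])

print(transcript)
```

Keep in mind the caveat above: the conformer encoder here is non-streaming, so chunked inference like this is only for experimentation, not for production-quality streaming accuracy.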

nglehuy · Dec 05 '21

I’ll close the issue here due to inactivity. Feel free to reopen if you have further questions.

nglehuy · Sep 02 '22