tfio.audio.AudioIOTensor is not thread safe

Open provos opened this issue 2 years ago • 1 comments

tfio.audio.AudioIOTensor, or more likely the underlying audio libraries, is not thread safe.

Here is a simple example:

import tensorflow as tf
import tensorflow_io as tfio

def read_audio_file(filename: str, dtype: str):
    audio = tfio.audio.AudioIOTensor(filename, dtype=dtype)
    return audio._resource, audio.dtype, audio.rate

from pathlib import Path

filenames = Path('somedirectory').glob('*.wav')
filenames = [str(x) for x in filenames]
ds = tf.data.Dataset.from_tensor_slices(filenames)
ds = ds.map(lambda x: read_audio_file(x, dtype='int16'), num_parallel_calls=4)
# only include samples that match our desired sample rate
ds = ds.filter(lambda x,y,z: z==44100)

Assuming the directory has files with different sample rates, this is the quickest way to show the problem:


assert len(list(ds)) == num_44k_files  # placeholder: the number of 44.1 kHz files in that directory
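The expected count on the right-hand side of that assertion can be obtained without going through tfio at all. The following is a minimal sketch, assuming the files are plain PCM WAV files that Python's standard `wave` module can open; the function name `count_wav_files_with_rate` is made up for illustration. It reads each file sequentially in a single thread, so its result should be stable across runs.

```python
import wave
from pathlib import Path


def count_wav_files_with_rate(directory: str, rate: int = 44100) -> int:
    """Count *.wav files in `directory` whose header reports `rate` Hz."""
    count = 0
    for path in sorted(Path(directory).glob('*.wav')):
        # Assumption: every file is PCM WAV; wave.open raises wave.Error otherwise.
        with wave.open(str(path), 'rb') as wav:
            # getframerate() returns the sampling frequency stored in the header.
            if wav.getframerate() == rate:
                count += 1
    return count
```

Calling `count_wav_files_with_rate('somedirectory')` would then give a single-threaded reference value to compare `len(list(ds))` against.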

Each run of this code is likely to yield a different number of files. I don't understand the underlying code well enough to be sure, but I suspect that the external libraries used for reading WAV files have thread-safety issues.
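To make the run-to-run variation visible, the pipeline can be rebuilt several times and the resulting counts compared. This is only a sketch of such a check: `collect_counts` and `is_deterministic` are hypothetical helper names, and `count_once` stands for any zero-argument callable that rebuilds the dataset and returns `len(list(ds))`.

```python
from typing import Callable, List


def collect_counts(count_once: Callable[[], int], trials: int = 5) -> List[int]:
    """Call `count_once` `trials` times and return every observed count."""
    return [count_once() for _ in range(trials)]


def is_deterministic(counts: List[int]) -> bool:
    """Return True when all trials produced the same count."""
    return len(set(counts)) <= 1
```

If the suspicion about thread safety is right, `is_deterministic` would be expected to return False with `num_parallel_calls=4`. Repeating the check with `num_parallel_calls=1` would show whether the variation goes away without parallelism, though that outcome is not verified here.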

provos, Jan 29 '23 21:01

Any update on this?

Mddct, Sep 12 '23 02:09