Custom Audio Source

Open parkerbjur opened this issue 1 year ago • 3 comments

I am trying to write a custom audio source class that accepts any raw audio. My source class holds a stream object that code outside the class can write to, plus a read function for the recognizer to use. The problem is that my buffer keeps running out. How do the other audio sources avoid this? The Microphone source seems to have an unlimited amount of data available, or at least something is returned whenever read is called on it. Here is my RawStream implementation. Any advice would be appreciated.

import numpy as np

# sampleDataTypes is defined elsewhere (not shown); it is used here as a
# lookup from sample width in bytes to a NumPy dtype.

class RawStream(object):
    def __init__(self, sample_width=2):
        self.sample_width = sample_width
        self.sampleDataType = sampleDataTypes[sample_width]
        self.stream = np.array([], dtype=self.sampleDataType)

    # return up to `size` samples from the front of the stream
    # (fewer, or an empty array, if the buffer is short; never blocks)
    def read(self, size):
        data = self.stream[:size]
        self.stream = self.stream[size:]
        return data

    # append raw bytes to the end of the stream
    def write(self, data):
        data = np.frombuffer(data, dtype=self.sampleDataType)
        self.stream = np.concatenate((self.stream, data))

parkerbjur avatar Jun 29 '23 01:06 parkerbjur

Hey, I was wondering if you have found a solution to this. I'm currently looking for a way to use the speech_recognition package with a ROS2 /audio topic as the source, which I would like to convert to text.

If you have any pointers for me, it would also be greatly appreciated.

Spricmic avatar Mar 18 '24 15:03 Spricmic

The RawStream implementation is not thread-safe: if a write happens during a read (or vice versa), the buffer gets corrupted. I'd suggest just using io.BytesIO as the stream rather than handling all of the I/O yourself.
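
For illustration, here is a minimal sketch of a thread-safe stream whose read blocks until enough data has been written, which would also explain why a microphone-style source never appears to "run out". It uses a bytearray guarded by threading.Condition rather than io.BytesIO. The class name BlockingPCMStream and its close method are hypothetical, and it assumes the recognizer calls read with a sample count and expects raw bytes back, so check that against the version of speech_recognition you have installed.

```python
import threading


class BlockingPCMStream:
    """Thread-safe FIFO of raw PCM bytes (hypothetical helper, not part of speech_recognition)."""

    def __init__(self, sample_width=2):
        self.sample_width = sample_width       # bytes per sample
        self._buffer = bytearray()             # pending audio bytes
        self._cond = threading.Condition()     # guards _buffer and _closed
        self._closed = False

    def write(self, data):
        """Append raw bytes and wake up any reader waiting for data."""
        with self._cond:
            if self._closed:
                raise ValueError("write to closed stream")
            self._buffer.extend(data)
            self._cond.notify_all()

    def read(self, num_samples):
        """Block until num_samples samples are buffered, then return them as bytes.

        After close(), return whatever is left (possibly b"") instead of blocking.
        Assumption: the caller passes a sample count, not a byte count.
        """
        wanted = num_samples * self.sample_width
        with self._cond:
            self._cond.wait_for(lambda: len(self._buffer) >= wanted or self._closed)
            data = bytes(self._buffer[:wanted])
            del self._buffer[:wanted]
            return data

    def close(self):
        """Mark the end of the audio so blocked readers return."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
```

The producer thread (for example a network or ROS2 callback) calls write, the recognizer thread calls read, and the producer calls close when the audio ends so that the final read returns an empty bytes object.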

tychuang1211 avatar Jun 04 '24 12:06 tychuang1211

Did this ever end up working?

sswetonicdino avatar Jul 16 '24 21:07 sswetonicdino