
How to implement python-samplerate for a live microphone

Open saibharani opened this issue 7 years ago • 8 comments

I want to convert the sample rate of live microphone audio from pyaudio from 44100 Hz to 16000 Hz and use it with pocketsphinx. Can anyone please help me convert the sample rate of a live recording chunk by chunk with python-samplerate? Thank you.

saibharani avatar Apr 19 '17 17:04 saibharani

There is an example with live resampling of synthetic data using the callback API here. The code would be quite similar using audio input and sounddevice.InputStream. If you already get the live audio with pyaudio, you could use the full API instead and call the process method repeatedly.

tuxu avatar Apr 20 '17 07:04 tuxu

When I try to do it as you said using the full API, it says it cannot convert from string to float. It expects the input to be NumPy data, but my audio comes from pyaudio in pyaudio.paInt16 format. Can you please help? I want to resample audio for pocketsphinx, and I am new to Python, so please explain in a little more detail. Thank you.

saibharani avatar Apr 20 '17 09:04 saibharani

You need to convert the input data to a NumPy array first. This is easily done with np.fromstring; see here for an example.
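As a minimal sketch of that conversion (not the example linked above): `np.frombuffer` is the current replacement for the binary mode of `np.fromstring`, which later NumPy releases deprecated. The bytes below are a synthetic stand-in for what `stream.read()` would return from a `paInt16` stream, and `bytes_to_int16` is a hypothetical helper name.

```python
import numpy as np


def bytes_to_int16(raw_data):
    """Interpret raw 16-bit PCM bytes as a 1-D array of int16 samples."""
    # frombuffer creates a read-only view on the bytes without copying them.
    return np.frombuffer(raw_data, dtype=np.int16)


# Synthetic stand-in for one chunk of microphone data: four int16 samples.
fake_chunk = np.array([0, 1000, -1000, 32767], dtype=np.int16).tobytes()
samples = bytes_to_int16(fake_chunk)
print(samples.dtype, samples)
```

Each 16-bit sample takes two bytes, so a chunk of N frames of mono audio should yield an array of length N.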

tuxu avatar Apr 20 '17 10:04 tuxu

Sorry for troubling you again. I tried using np.fromstring, but it is not working: it doesn't detect anything. Can you please write a program that records audio using pyaudio in paInt16 format and converts it from 44100 Hz to 16000 Hz using python-samplerate, so that I can use it in my college project with pocketsphinx? Please help me solve this issue. Thank you.

saibharani avatar Apr 20 '17 16:04 saibharani

The code below should work as expected.

from __future__ import print_function, division
import numpy as np
import pyaudio
import samplerate as sr

input_rate = 44100
target_rate = 16000
chunk = 1024

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1,
                    rate=input_rate, input=True,
                    frames_per_buffer=chunk)

resampler = sr.Resampler()
ratio = target_rate / input_rate

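for i in range(5):
    # Read one chunk of raw 16-bit PCM bytes from the microphone
    raw_data = stream.read(chunk)
    # np.frombuffer replaces the deprecated binary mode of np.fromstring
    data = np.frombuffer(raw_data, dtype=np.int16)
    # The same Resampler object is reused so chunks are processed in order
    resampled_data = resampler.process(data, ratio)
    print('{} -> {}'.format(len(data), len(resampled_data)))
    # Do something with resampled_data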

stream.stop_stream()
stream.close()
audio.terminate()

tuxu avatar Apr 20 '17 22:04 tuxu

Thank you very much for the code. I will try it and tell you if it works for me. Thanks again.

saibharani avatar Apr 21 '17 01:04 saibharani

If resampled_data is all zeros, then data is likely all zeros as well. This would point to a problem with the input recorded by pyaudio; please check it. The above code works as expected in Python 2.7 and 3.5 on macOS 10.12.
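One way to run that check is to look at the peak and RMS level of each input chunk before resampling. This is only a sketch: `describe_chunk` is a hypothetical helper, and the silent and tone arrays are synthetic stand-ins for recorded chunks.

```python
import numpy as np


def describe_chunk(samples):
    """Return (peak, rms) of a chunk, computed in float64 to avoid int16 overflow."""
    as_float = np.asarray(samples, dtype=np.float64)
    if as_float.size == 0:
        return 0.0, 0.0
    peak = float(np.max(np.abs(as_float)))
    rms = float(np.sqrt(np.mean(as_float ** 2)))
    return peak, rms


# Synthetic chunks: pure silence and a 440 Hz tone sampled at 44100 Hz.
silent = np.zeros(1024, dtype=np.int16)
t = np.arange(1024) / 44100
tone = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

print('silent:', describe_chunk(silent))
print('tone:  ', describe_chunk(tone))
```

If every chunk read from the microphone reports a peak of 0, the problem is probably upstream of the resampler, for example a muted or wrongly selected input device.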

Edit: Seems like the original comment is gone.

On 21.04.2017 at 18:22, Y V Bharani sai wrote:

Hey, I tried your code, but when I call print(resampled_data) it prints an array of 0. values. Even if I am speaking continuously, the values don't change from 0., and I don't get any text output from pocketsphinx either. Please solve this for me. Thank you.

tuxu avatar Apr 22 '17 07:04 tuxu

The above script seems to return a NumPy array of floats even though the input array contained int16 values.

It is not a bug, but something to keep in mind.
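If the consumer needs 16-bit PCM again (as speech recognizers such as pocketsphinx commonly do), the float output can be converted back. The sketch below assumes the resampled floats stay on the int16 value scale rather than being normalized to [-1, 1]; that is worth verifying for your version of the library. `float_to_int16_bytes` is a hypothetical helper and the input array is synthetic.

```python
import numpy as np


def float_to_int16_bytes(resampled):
    """Round float samples on the int16 scale, clip overshoot, return raw PCM bytes."""
    info = np.iinfo(np.int16)
    # Interpolation can overshoot the int16 range slightly, so clip before casting.
    clipped = np.clip(np.rint(resampled), info.min, info.max)
    return clipped.astype(np.int16).tobytes()


# Synthetic stand-in for resampler output, including out-of-range values.
fake_resampled = np.array([0.4, -1000.6, 40000.0, -40000.0], dtype=np.float32)
pcm = float_to_int16_bytes(fake_resampled)
print(np.frombuffer(pcm, dtype=np.int16))
```

Clipping before the cast matters because casting an out-of-range float straight to int16 would wrap around or give undefined values instead of saturating.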

besimali avatar Aug 12 '20 19:08 besimali