python-sounddevice
Controlling a sound delivery onset through the `sd.OutputStream` object for human research
Hello @mgeier! First of all, thank you for the continuous development and maintenance of `sounddevice`, it's amazing :smiley:
Short version
How could I further improve the `callback`, the argument settings of `OutputStream`, ... to precisely start delivering an auditory stimulus at a given time, for a `sounddevice` backend used to deliver auditory stimuli in human neuroscience research? Code at the end.
Long version
You might know of Psychtoolbox and PsychoPy, tools to design psychological, behavioral, and neuroscience experiments. I work primarily with the latter, where delivering an auditory stimulus is very often used to elicit a neurological or behavioral response in a participant. Historically, Psychtoolbox offers an excellent interface to portaudio with which I can deliver an audio stimulus simultaneously with a trigger (an event marking the audio onset, usually delivered via parallel port) with excellent precision (delay between the event and the audio onset measured at < 1 ms).
In Python, this interface is accessible through PsychoPy and its `SoundPTB` object. PsychoPy also offers other backends, including `SoundDeviceSound`. Sadly, the other backends do not match the performance of `SoundPTB` (the delay between the event marking the sound onset and the actual sound onset is non-zero and variable). On top of that, PsychoPy development tends to regularly break the most basic features, including sound delivery in their latest 2024.2.0 and 2024.2.1 versions.
Which brings me to today's affairs: I gave a shot at implementing a Python audio delivery backend matching `SoundPTB` performance using `sounddevice`. The objective is to package this backend independently from PsychoPy to avoid breakage, and to replace the existing PsychoPy `SoundDeviceSound` object with a wrapper around this new backend.
The key element which makes the delivery precise in `SoundPTB` is a scheduling mechanism. Typically, this is how a sound and a trigger (event) would be delivered:
```python
import psychtoolbox as ptb
from byte_triggers import ParallelPortTrigger
from psychopy.sound.backend_ptb import SoundPTB

from time import sleep  # placeholder; usually not the default 'time.sleep', which is too imprecise

# create a 440 Hz sinusoid lasting 200 ms
sound = SoundPTB(value=440.0, secs=0.2)
# create the parallel port trigger object
trigger = ParallelPortTrigger("arduino")

# schedule the sound in 200 ms, wait, deliver the trigger
now = ptb.GetSecs()  # psychtoolbox uses a separate clock
sound.play(when=now + 0.2)
sleep(0.2)
trigger.signal(1)  # mark the onset
```
I tried replicating this scheduling idea with `sounddevice`, compensating for delays with the `time.outputBufferDacTime` and `time.currentTime` values obtained in the callback function, and I got very close. This callback approach, compared to a blocking `stream.write`, brings down the delay between the event and the audio onset to ~3 ms ± 1 ms. It is visibly slightly worse than the `SoundPTB` object, especially with a higher variability.
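For clarity, here is a minimal sketch (my own illustration, separate from the MWE below) of the compensation term: `outputBufferDacTime - currentTime` estimates how far in the future the first sample of the current buffer will reach the DAC.

```python
import sounddevice as sd


def callback(outdata, frames, time_info, status):
    # time until the first frame of this buffer reaches the DAC, in seconds
    dac_delta = time_info.outputBufferDacTime - time_info.currentTime
    # printing inside a real-time callback is only acceptable for a quick demo
    print(f"buffer of {frames} frames reaches the DAC in ~{dac_delta * 1e3:.2f} ms")
    outdata.fill(0)  # output silence


with sd.OutputStream(channels=1, callback=callback, latency="low"):
    sd.sleep(200)  # let a few callbacks run
```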
I'm a bit out of ideas on how I could continue to improve this attempt and would love to hear your input on how I could improve the MWE below and how I could set the different `OutputStream` arguments. Without further ado, here is the current working state yielding ~3 ms ± 1 ms.
```python
import time

import numpy as np
import sounddevice as sd
from byte_triggers import ParallelPortTrigger


class Clock:
    """Monotonic clock with nanosecond resolution."""

    def __init__(self) -> None:
        self._t0 = time.monotonic_ns()

    def get_time_ns(self) -> int:
        return time.monotonic_ns() - self._t0


class SoundSD:
    def __init__(self, data: np.ndarray, device: int, sample_rate: int) -> None:
        # store data, device and callback variables
        self._data = data if data.ndim == 2 else data[:, np.newaxis]
        self._clock = Clock()
        self._current_frame = 0
        self._target_time = None
        # create and open the output stream
        self._stream = sd.OutputStream(
            blocksize=0,  # 0, 4, 64, 128 -> no visible differences
            callback=self._callback,
            channels=data.shape[1] if data.ndim == 2 else 1,
            device=device,
            dtype=data.dtype,
            latency="low",
            samplerate=sample_rate,
        )
        self._stream.start()

    def _callback(self, outdata, frames, time_info, status) -> None:
        """Callback audio function."""  # noqa: D401
        if self._target_time is None:
            outdata.fill(0)
            return
        # estimated time (ns) until the first frame of this buffer hits the DAC
        delta_ns = int((time_info.outputBufferDacTime - time_info.currentTime) * 1e9)
        if self._clock.get_time_ns() + delta_ns < self._target_time:
            outdata.fill(0)
            return
        end = self._current_frame + frames
        if end <= self._data.shape[0]:
            outdata[:frames, :] = self._data[self._current_frame : end, :]
            self._current_frame += frames
        else:
            # last buffer: pad the remaining samples with zeros
            data = self._data[self._current_frame :, :]
            data = np.vstack(
                (
                    data,
                    np.zeros((frames - data.shape[0], data.shape[1]), dtype=data.dtype),
                )
            )
            outdata[:frames, :] = data
            # reset
            self._current_frame = 0
            self._target_time = None

    def play(self, when: float | None = None) -> None:
        """Play the audio data.

        Parameters
        ----------
        when : float | None
            The relative time in seconds when to start playing the audio data. For
            instance, ``0.2`` will start playing in 200 ms. If ``None``, the audio
            data is played as soon as possible.
        """
        self._target_time = (
            self._clock.get_time_ns()
            if when is None
            else self._clock.get_time_ns() + int(when * 1e9)
        )


if __name__ == "__main__":
    trigger = ParallelPortTrigger("arduino")
    device = sd.query_devices(sd.default.device["output"])
    sample_rate = int(device["default_samplerate"])
    duration = 0.2
    frequency = 440
    times = np.linspace(0, duration, int(duration * sample_rate), endpoint=False)
    data = np.sin(2 * np.pi * frequency * times)
    data /= np.max(np.abs(data))  # normalize
    data = data.astype(np.float32) * 0.1
    sound = SoundSD(data, device["index"], sample_rate)
    sound.play(when=0.5)
    time.sleep(0.5)  # to be replaced with a higher-precision sleep function
    trigger.signal(1)
    time.sleep(0.3)
```
Note: I measure the delay between the sound onset and the trigger with an EEG amplifier sampling at 1 kHz. I can increase the sampling rate to 16 kHz if needed.
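For illustration, here is a minimal sketch of how such a delay could be estimated from two recorded channels; the threshold-based onset detection and channel layout are simplifying assumptions, not necessarily how the amplifier marks events.

```python
import numpy as np


def estimate_delay_ms(
    trigger_ch: np.ndarray,
    audio_ch: np.ndarray,
    sfreq: float = 1000.0,
    threshold: float = 0.1,
) -> float:
    """Estimate the trigger-to-audio-onset delay in milliseconds.

    Illustrative only: onsets are taken as the first sample exceeding an
    arbitrary threshold on each channel.
    """
    trig_onset = int(np.argmax(np.abs(trigger_ch) > threshold))
    audio_onset = int(np.argmax(np.abs(audio_ch) > threshold))
    return (audio_onset - trig_onset) / sfreq * 1e3
```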