Controlling a sound delivery onset through the sd.OutputStream object for human research
Hello @mgeier! First of all, thank you for the continuous development and maintenance of sounddevice, it's amazing :smiley:
Short version
How could I further improve the callback, the `OutputStream` argument settings, ... to precisely start delivering an auditory stimulus at a given time, for a sounddevice backend used to deliver auditory stimuli in human neuroscience research? Code at the end.
Long version
You might know of Psychtoolbox and PsychoPy, tools to design psychological, behavioral, and neuroscience experiments. I work primarily with the latter, where delivering an auditory stimulus is very often used to elicit a neurological or behavioral response in a participant. Historically, Psychtoolbox offers an excellent interface to PortAudio with which I can deliver an audio stimulus simultaneously with a trigger (an event to mark the audio onset, usually delivered via parallel port) with excellent precision (delay between the event and the audio onset measured at < 1 ms).
In Python, this interface is accessible through PsychoPy and its `SoundPTB` object. PsychoPy also offers other backends, including `SoundDeviceSound`. Sadly, the other backends do not match the performance of `SoundPTB` (the delay between the event marking the sound onset and the actual sound onset is non-zero and variable). On top of that, PsychoPy development tends to regularly break the most basic features, including sound delivery in their latest 2024.2.0 and 2024.2.1 versions.
Bringing me to today's affairs: I gave a shot at implementing a Python audio delivery backend matching `SoundPTB` performance using sounddevice. The objective is to package this backend independently from PsychoPy to avoid breakage, and to replace the existing PsychoPy `SoundDeviceSound` object with a wrapper around this new backend.
The key element which makes the delivery precise in `SoundPTB` is a scheduling mechanism. Typically, this is how a sound and a trigger (event) would be delivered:

```python
from time import sleep  # placeholder: usually a higher-precision sleep is used

import psychtoolbox as ptb
from byte_triggers import ParallelPortTrigger
from psychopy.sound.backend_ptb import SoundPTB

# create a 440 Hz sinusoid lasting 200 ms
sound = SoundPTB(value=440.0, secs=0.2)
# create the parallel port trigger object
trigger = ParallelPortTrigger("arduino")
# schedule the sound in 200 ms, wait, deliver the trigger
now = ptb.GetSecs()  # psychtoolbox uses a separate clock
sound.play(when=now + 0.2)
sleep(0.2)  # usually not the default 'time.sleep', which is too imprecise
trigger.signal(1)  # mark the onset
```
I tried replicating this scheduling idea with sounddevice, compensating for delays with the `time.outputBufferDacTime` and `time.currentTime` values obtained in the callback function, and I got very close. This callback approach, compared to a blocking `stream.write`, brings down the delay between the event and the audio onset to ~3 ms ± 1 ms. It is visibly slightly worse than the `SoundPTB` object, especially with a higher variability.
I'm a bit out of ideas on how I could continue to improve this attempt and would love to hear your input on how I could improve the MWE below and how I could set the different `OutputStream` arguments. Without further ado, here is the current working state yielding ~3 ms ± 1 ms.
```python
import time

import numpy as np
import sounddevice as sd
from byte_triggers import ParallelPortTrigger


class Clock:
    """Monotonic clock with nanosecond resolution."""

    def __init__(self) -> None:
        self._t0 = time.monotonic_ns()

    def get_time_ns(self) -> int:
        return time.monotonic_ns() - self._t0


class SoundSD:
    def __init__(self, data: np.ndarray, device: int, sample_rate: int) -> None:
        # store data, device and callback variables
        self._data = data if data.ndim == 2 else data[:, np.newaxis]
        self._clock = Clock()
        self._current_frame = 0
        self._target_time = None
        # create and open the output stream
        self._stream = sd.OutputStream(
            blocksize=0,  # 0, 4, 64, 128 -> no visible differences
            callback=self._callback,
            channels=data.shape[1] if data.ndim == 2 else 1,
            device=device,
            dtype=data.dtype,
            latency="low",
            samplerate=sample_rate,
        )
        self._stream.start()

    def _callback(self, outdata, frames, time_info, status) -> None:
        """Callback audio function."""  # noqa: D401
        if self._target_time is None:  # nothing scheduled, output silence
            outdata.fill(0)
            return
        # estimate when this buffer will hit the DAC, relative to "now"
        delta_ns = int((time_info.outputBufferDacTime - time_info.currentTime) * 1e9)
        if self._clock.get_time_ns() + delta_ns < self._target_time:
            outdata.fill(0)  # too early, keep outputting silence
            return
        end = self._current_frame + frames
        if end <= self._data.shape[0]:
            outdata[:frames, :] = self._data[self._current_frame : end, :]
            self._current_frame += frames
        else:
            # last buffer: zero-pad the remaining frames
            data = self._data[self._current_frame :, :]
            data = np.vstack(
                (
                    data,
                    np.zeros((frames - data.shape[0], data.shape[1]), dtype=data.dtype),
                )
            )
            outdata[:frames, :] = data
            # reset
            self._current_frame = 0
            self._target_time = None

    def play(self, when: float | None = None) -> None:
        """Play the audio data.

        Parameters
        ----------
        when : float | None
            The relative time in seconds at which to start playing the audio data.
            For instance, ``0.2`` will start playing in 200 ms. If ``None``, the
            audio data is played as soon as possible.
        """
        self._target_time = (
            self._clock.get_time_ns()
            if when is None
            else self._clock.get_time_ns() + int(when * 1e9)
        )


if __name__ == "__main__":
    trigger = ParallelPortTrigger("arduino")
    device = sd.query_devices(sd.default.device["output"])
    sample_rate = int(device["default_samplerate"])
    duration = 0.2
    frequency = 440
    times = np.linspace(0, duration, int(duration * sample_rate), endpoint=False)
    data = np.sin(2 * np.pi * frequency * times)
    data /= np.max(np.abs(data))  # normalize
    data = data.astype(np.float32) * 0.1
    sound = SoundSD(data, device["index"], sample_rate)
    sound.play(when=0.5)
    time.sleep(0.5)  # to be replaced with a higher-precision sleep function
    trigger.signal(1)
    time.sleep(0.3)
```
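For reference, this is the kind of higher-precision sleep I have in mind for the comment above: a sketch only, where `precise_sleep` is a hypothetical helper that coarse-sleeps most of the interval and busy-waits the remainder.

```python
import time


def precise_sleep(duration: float) -> None:
    """Sleep for ``duration`` seconds with sub-millisecond precision."""
    target = time.perf_counter() + duration
    # coarse sleep, leaving ~1 ms of margin for OS scheduler inaccuracy
    remaining = target - time.perf_counter()
    if remaining > 0.001:
        time.sleep(remaining - 0.001)
    # busy-wait the final stretch (precise, but burns CPU)
    while time.perf_counter() < target:
        pass
```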
Note: I measure the delay between the sound onset and the trigger through an EEG amplifier sampling at 1 kHz (i.e., 1 ms resolution). I can increase the sampling rate to 16 kHz if needed.
And for completeness, a version with `RawOutputStream`: https://github.com/mscheltienne/stimuli/blob/main/stimuli/audio/_backend/sounddevice.py
It's the same logic, and it is still visibly slightly worse than `SoundPTB`.
Looks like I should look into https://github.com/spatialaudio/python-rtmixer
> You might know of Psychtoolbox and PsychoPy, tools to design psychological, behavioral, and neuroscience experiments.
Yes, I know about them but I'm not an active user.
https://github.com/spatialaudio/python-sounddevice/issues/37
https://github.com/spatialaudio/python-sounddevice/issues/35#issuecomment-263265640
https://github.com/spatialaudio/python-sounddevice/issues/100#issuecomment-324363626
https://github.com/psychopy/psychopy/pull/1312
https://github.com/psychopy/psychopy/pull/1327
https://github.com/psychopy/psychopy/issues/2065
https://github.com/psychopy/psychopy/pull/2529
https://github.com/psychopy/psychopy/issues/4673
The sounddevice backend was added to PsychoPy around 2016, but apparently the latency wasn't good enough ("massive sound delay and jitter") and they added the PTB backend in 2019 which was much better.
I just saw that there was even a "mega-study": https://peerj.com/articles/9414/
> compensating for delays with the `time.outputBufferDacTime` and `time.currentTime` values obtained in the callback function
I have the feeling that those values can be quite inaccurate, maybe depending on host API, drivers and physical hardware.
See also https://stackoverflow.com/questions/78869667/sounddevice-in-python-time-vector-of-input-and-output-signal-synchronized and https://github.com/spatialaudio/python-sounddevice/issues/148#issuecomment-699518237.
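A quick way to check how stable those values are on a given host API / driver / hardware combination might be to log the reported DAC delta across callbacks and look at its spread. A minimal sketch (just an illustration, not a rigorous benchmark):

```python
import numpy as np
import sounddevice as sd

deltas = []


def callback(outdata, frames, time_info, status):
    outdata.fill(0)  # output silence; we only care about the timestamps
    deltas.append(time_info.outputBufferDacTime - time_info.currentTime)


with sd.OutputStream(channels=1, latency="low", callback=callback):
    sd.sleep(1000)  # collect timestamps for one second

deltas = np.asarray(deltas) * 1e3  # seconds -> milliseconds
print(f"DAC delta: mean {deltas.mean():.3f} ms, std {deltas.std():.3f} ms")
```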
> This callback approach compared to a blocking `stream.write` brings down the delay
Yes, the blocking API is definitely bad for jitter, since it quantizes the start times to block boundaries (AFAIK). For example, with a 256-frame blocksize at 48 kHz, start times would only fall on a ~5.3 ms grid.
> brings down the delay between the event and the audio onset to ~3 ms ± 1 ms. It is visibly slightly worse than the `SoundPTB` object, especially with a higher variability.
The variability might be due to the inaccurate `time.outputBufferDacTime`, but also due to the Python interpreter itself.
> And for completeness, a version with `RawOutputStream`
I'm not sure how much overhead is caused by the NumPy conversions, but if you don't need NumPy arrays, it's probably better to use the "raw" streams.
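For illustration, the streaming part of the callback could operate on plain bytes along these lines; this is a rough sketch for mono float32, modeled on the play_long_file.py example from the sounddevice repository (the names and the silent placeholder payload are made up):

```python
import sounddevice as sd

BYTES_PER_FRAME = 4  # mono float32
payload = b"\x00" * 48000 * BYTES_PER_FRAME  # pre-rendered audio as raw bytes
position = 0


def callback(outdata, frames, time_info, status):
    global position
    nbytes = frames * BYTES_PER_FRAME
    chunk = payload[position : position + nbytes]
    outdata[: len(chunk)] = chunk
    # zero-pad once the payload is exhausted
    outdata[len(chunk) :] = b"\x00" * (nbytes - len(chunk))
    position += nbytes


stream = sd.RawOutputStream(
    samplerate=48000, channels=1, dtype="float32", callback=callback
)
stream.start()
```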
> Looks like I should look into https://github.com/spatialaudio/python-rtmixer
Yeah, that's what I was suggesting in my conversation with the PsychoPy devs back in the day, e.g. https://github.com/psychopy/psychopy/issues/2065#issuecomment-433345942.
Note that it is still quite experimental (many TODOs: https://github.com/search?q=repo%3Aspatialaudio%2Fpython-rtmixer+TODO%3A&type=code) and there wasn't much development happening in recent times, but if it seems promising to you, I'm willing to put some more work into it.
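For reference, scheduling a sound with rtmixer would look roughly like this; an untested sketch, where `play_buffer` takes a `start` time on the stream clock and the buffer has to be float32 (see the rtmixer docs for the exact semantics, including `allow_belated`):

```python
import numpy as np
import rtmixer

samplerate = 48000
mixer = rtmixer.Mixer(channels=1, samplerate=samplerate, latency="low")
mixer.start()

# 200 ms of a 440 Hz sinusoid as a float32 buffer
t = np.arange(int(0.2 * samplerate)) / samplerate
buffer = (0.1 * np.sin(2 * np.pi * 440 * t)).astype("float32")

# schedule playback 200 ms into the future, relative to the stream clock
action = mixer.play_buffer(buffer, channels=1, start=mixer.time + 0.2)
```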
BTW, the original idea of the rtmixer module was to provide an example of how to use a callback function implemented in C which can communicate with Python code. The intention was not to provide a fully-featured library itself (but it might still evolve into something like that).
What I want to say with this is that maybe you should consider writing your own callback function in C (perhaps using the rtmixer code as inspiration), plus further C code that can interface with your parallel port trigger device. The goal would be to have no Python code between the trigger and whatever happens in the audio callback.