sysaudio: read/write callback design goal

Open emidoots opened this issue 1 year ago • 2 comments

@alichraghi I think we should work towards this API design:

// Recorder and Player would both carry these fields.
const Recorder = struct {
    /// The number of channels
    ///
    /// This field is initialized after a call to (TODO: device create function) and matches the
    /// number of audio channels reported by the underlying device, but it may not match the number
    /// of channels you requested at creation time if the device did not support that number of
    /// channels.
    channels: u8,

    /// The format of each audio sample
    ///
    /// This field is initialized after a call to (TODO: device create function) and matches the
    /// format reported by the underlying device, but it may not match the format you requested at
    /// creation time if the device did not support that format. 
    format: Format,

    /// Whether the channels' samples are interleaved (`ABABAB`) or planar (`AAABBB`) in memory.
    ///
    /// This field is initialized after a call to (TODO: device create function) and always matches
    /// your requested preference.
    ///
    /// Most native platforms support interleaved audio, but browsers/WebAudio only support planar
    /// audio. If the platform API does not support your preference, sysaudio will automatically
    /// perform the conversion for you. This saves you from doing any conversion yourself, and
    /// lets sysaudio handle it per-platform to avoid unnecessary conversions.
    interleaved: bool,
};
fn readCallback(ctx: Context, raw_audio: []const u8, recorder: sysaudio.Recorder) void {
    _ = ctx;
    const num_samples = raw_audio.len / recorder.format.size();
    const num_samples_per_channel = num_samples / recorder.channels;
    _ = num_samples_per_channel;

    // NOTE: sysaudio should expose a clear buffer size that can be used here; 16 * 1024 should not
    // be hard-coded like this.
    //
    // Also, what guarantees can we make about `raw_audio`? E.g. can we say it has a static length
    // per-platform, or a static length for the lifetime of a device? Whatever guarantee we can
    // make along those lines would be ideal.
    var samples: [16 * 1024]f32 = undefined;

    // Convert raw_audio from the device's format to f32 samples:
    sysaudio.convert(f32, samples[0..num_samples]);

    // Write f32 samples to disk
    //
    // Note: this is just an example. Things like file I/O should not be performed in a callback,
    // as any stall here can result in losing samples from a recorder or failing to write enough
    // samples to a player. In a real application you should do this work in a separate thread,
    // e.g. handing samples over through a ring buffer.
    file.writeAll(std.mem.sliceAsBytes(samples[0..num_samples])) catch {};
}
-fn writeCallback(_: ?*anyopaque, output: []u8) void {
+fn writeCallback(ctx: Context, raw_audio_out: []u8, player: sysaudio.Player) void {
    // replace player.write() with sysaudio.convert()

Notes:

  • The _: ?*anyopaque parameter is replaced by a typed, generic context parameter. The user decides this type, and ctx: void would be a valid choice. They would need to pass this type into the player create API or similar.
  • input: []const u8 is replaced by raw_audio: []const u8 to hint that it is raw audio in the device's native format, whatever that may be.
  • recorder.read is replaced by sysaudio.convert to make it clear that the function is converting samples for you.
  • recorder: sysaudio.Recorder is now a parameter to readCallback, and player: sysaudio.Player to writeCallback.
    • This gives the callback access to recorder.channels, recorder.format.size(), etc.
  • Use num_samples instead of frames, because "frames" has a specific meaning in audio processing: 1 sample == 1 sample, but 1 frame == multiple samples (one for each channel). Don't confuse the two.
  • The user should be able to request interleaved or planar format when creating a device, and sysaudio should do that conversion internally per-backend as needed.
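
To make the sample/frame distinction and the two memory layouts concrete, here is a minimal sketch. sampleCount, frameCount and interleavedToPlanar are hypothetical helpers written for illustration and are not sysaudio APIs.

```zig
const std = @import("std");

/// Number of individual samples in a raw buffer.
fn sampleCount(raw_len: usize, bytes_per_sample: usize) usize {
    return raw_len / bytes_per_sample;
}

/// Number of frames: one frame holds one sample for every channel.
fn frameCount(num_samples: usize, channels: usize) usize {
    return num_samples / channels;
}

/// Reorder interleaved samples (ABABAB) into planar layout (AAABBB).
/// `dst` and `src` must have the same length, a multiple of `channels`.
fn interleavedToPlanar(dst: []f32, src: []const f32, channels: usize) void {
    std.debug.assert(dst.len == src.len);
    std.debug.assert(src.len % channels == 0);
    const frames = src.len / channels;
    var frame: usize = 0;
    while (frame < frames) : (frame += 1) {
        var ch: usize = 0;
        while (ch < channels) : (ch += 1) {
            dst[ch * frames + frame] = src[frame * channels + ch];
        }
    }
}

test "samples vs frames, interleaved vs planar" {
    // 24 bytes of stereo f32 audio: 6 samples, but only 3 frames.
    const num_samples = sampleCount(24, @sizeOf(f32));
    try std.testing.expectEqual(@as(usize, 6), num_samples);
    try std.testing.expectEqual(@as(usize, 3), frameCount(num_samples, 2));

    const interleaved = [_]f32{ 1, 10, 2, 20, 3, 30 }; // A B A B A B
    var planar: [6]f32 = undefined;
    interleavedToPlanar(&planar, &interleaved, 2);
    try std.testing.expectEqualSlices(f32, &[_]f32{ 1, 2, 3, 10, 20, 30 }, &planar);
}
```

Doing this reordering inside each backend, as the last note suggests, would let a backend skip the copy entirely when the platform already delivers the requested layout.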

emidoots avatar Nov 04 '23 22:11 emidoots

Do you mean locking in a specific type to the callback function? Why not have a gen function so the user could choose whatever context type she wants, whether a recorder or something else? (Add a flag to generate a function signature with a Player and we have a choice between all variants)

Your proposal would lock out the library from use in audio dev.

plaukiu avatar Nov 05 '23 19:11 plaukiu

@plaukiu ctx: Context is a generic type specified at createPlayer/createRecorder. Here is an example:

const MyContext = struct {
    data: [4]u8 = undefined,
};

pub fn main() !void {
    var ctx: MyContext = .{};
    var player = try sysaudio.createPlayer(*MyContext, &ctx, .{ .writeFn = writeCallback });
}

fn writeCallback(ctx: *MyContext, raw_audio_out: []u8, player: sysaudio.Player) void {
   // do something with ctx.data
}

alichraghi avatar Nov 05 '23 21:11 alichraghi