add map option to io.StreamWriter audio track

Open AngeloGiacco opened this issue 7 months ago • 0 comments

🚀 The feature

FFmpeg makes it easy to write raw audio data without any headers. For example, for Opus encoding, it looks something like this:

input_stream = ffmpeg.input(
    "pipe:0",
    format="s16le",
    ar=self.original_sample_rate,
    ac=self.n_channels,
)
map = "0:a" if self.raw else ""
format = "data" if self.raw else "opus"
output_stream = ffmpeg.output(
    input_stream,
    "pipe:1",
    format=format,
    acodec="libopus",
    audio_bitrate=audio_bitrate_str,
    ar=self.out_sample_rate,
    application="audio",
    map=map,
)

torchaudio.io.StreamWriter makes it easy to supply the format dynamically, but it is not possible to supply the map parameter. Supporting it would be very useful, so that we can define our encoders in torchaudio and add options for raw or complete audio encoding.
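To make the request concrete, here is a minimal sketch of the plain ffmpeg command line that the snippet above appears to correspond to. The helper name `build_opus_args` and its default values are hypothetical and are not part of ffmpeg-python or torchaudio; only the ffmpeg flags themselves (`-map`, `-f data`, `-f opus`, `-c:a libopus`, `-application`) are real. One assumption differs from the snippet: in the non-raw case the sketch leaves out `-map` entirely instead of passing an empty value.

```python
from typing import List


def build_opus_args(
    raw: bool,
    in_sample_rate: int,
    n_channels: int,
    out_sample_rate: int = 48000,  # assumed default, not from the issue
    bitrate: str = "64k",          # assumed default, not from the issue
) -> List[str]:
    """Build an ffmpeg argument list: s16le PCM on stdin, Opus on stdout."""
    # Input side: headerless 16-bit little-endian PCM read from stdin.
    args = [
        "ffmpeg",
        "-f", "s16le",
        "-ar", str(in_sample_rate),
        "-ac", str(n_channels),
        "-i", "pipe:0",
    ]
    if raw:
        # Explicitly select the audio stream of input 0 for the raw output.
        args += ["-map", "0:a"]
    # Output side: libopus encoder, written to stdout.
    args += [
        "-c:a", "libopus",
        "-b:a", bitrate,
        "-ar", str(out_sample_rate),
        "-application", "audio",
        # "data" writes bare packets; "opus" writes an Ogg Opus stream.
        "-f", "data" if raw else "opus",
        "pipe:1",
    ]
    return args
```

The resulting list can be passed to `subprocess.run(..., input=pcm_bytes, stdout=subprocess.PIPE)` when an ffmpeg binary built with libopus is available. The feature request amounts to exposing the equivalent of the `-map` step through StreamWriter, so that the raw and complete variants can be selected the same way the output format already is.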

Motivation, pitch

This is very useful for conversational AI and audio streaming.

Alternatives

No response

Additional context

No response

AngeloGiacco, Mar 25 '25 10:03