ffmpeg-python Combining an audio and video stream pipe:

I searched the documentation for an example of this but couldn't work out a solution. I have two streams.

# preamble
import ffmpeg

file_name = 'test.mkv'

probe = ffmpeg.probe(file_name)
video_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'video'), None)
width = int(video_stream['width'])
height = int(video_stream['height'])

Video:

out, error = (
    ffmpeg
        .input(file_name, threads=120)
        .output("pipe:", format='rawvideo')
        .run(capture_stdout=True)
)

Audio:

out_a, err = (
    ffmpeg
        .input(file_name)
        .output('-', format='f32le', acodec='pcm_f32le', ac=1, ar='48000')
        .run(capture_stdout=True, capture_stderr=True)
)

I have tried outputting them but I get an error saying TypeError: cannot unpack non-iterable Popen object.

video_pipe = ffmpeg.input('pipe:', format='rawvideo', s='{}x{}'.format(width, height))
audio_pipe = ffmpeg.input('pipe:', format='f32le', acodec='pcm_f32le', ac=1, ar='48000')

combined_out, err = (
    ffmpeg
        .output(video_pipe, audio_pipe, "test_3.mkv", r='30/1')
        .overwrite_output()
        .run_async(pipe_stdin=True)
)

Terminal Output:

Traceback (most recent call last):
  File "...\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-23-ad1bd734b250>", line 8, in <module>
    .run_async(pipe_stdin=True)
TypeError: cannot unpack non-iterable Popen object
ffmpeg version 4.3.2-2021-02-20-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, rawvideo, from 'pipe:':
  Duration: N/A, bitrate: 196992 kb/s
    Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 608x1080, 196992 kb/s, 25 tbr, 25 tbn, 25 tbc
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, f32le, from 'pipe:':
  Duration: N/A, bitrate: 1536 kb/s
    Stream #1:0: Audio: pcm_f32le, 48000 Hz, mono, flt, 1536 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (pcm_f32le (native) -> vorbis (libvorbis))
[libx264 @ 0000029f0b4962c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0000029f0b4962c0] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 0000029f0b4962c0] 264 - core 161 r3048 b86ae3c - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'test_out.mkv':
  Metadata:
    encoder         : Lavf58.45.100
    Stream #0:0: Video: h264 (libx264) (H264 / 0x34363248), yuv420p, 608x1080, q=-1--1, 30 fps, 1k tbn, 30 tbc
    Metadata:
      encoder         : Lavc58.91.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
    Stream #0:1: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 48000 Hz, mono, fltp
    Metadata:
      encoder         : Lavc58.91.100 libvorbis
frame=    0 fps=0.0 q=0.0 Lsize=       4kB time=00:00:00.00 bitrate=N/A speed=   0x    
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:3kB muxing overhead: unknown
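
The TypeError in the traceback above comes from unpacking the return value of run_async(). In ffmpeg-python, run() returns an (out, err) tuple, whereas run_async() returns the subprocess.Popen object of the ffmpeg process, and the bytes still have to be written to its stdin. The frame= 0 line in the log is consistent with ffmpeg never receiving any input. A minimal sketch of the difference, using only the standard library: the child process here is a stand-in (an assumption, not ffmpeg) that copies stdin to stdout, so the example runs without ffmpeg installed.

```python
import subprocess
import sys

# Stand-in for the ffmpeg command (assumption): a child process that
# copies everything it reads on stdin to stdout.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]

# run_async() hands back a Popen object like this one.
process = subprocess.Popen(
    cmd,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Unpacking the Popen object itself raises the TypeError from the traceback.
unpack_error = None
try:
    out, err = process
except TypeError as exc:
    unpack_error = str(exc)

# communicate() writes the bytes to stdin, closes it, waits for the
# process to exit, and returns the (stdout, stderr) tuple.
out, err = process.communicate(input=b"raw frame bytes")
```

Note also that a process has a single stdin, so two inputs that both name 'pipe:' would read from the same stream. This is probably why the replies further down pipe only one of the two streams.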

I have also tried the concat method, but without success.

video_pipe = ffmpeg.input('pipe:', format='rawvideo', s='{}x{}'.format(width, height))
audio_pipe = ffmpeg.input('pipe:', format='f32le', acodec='pcm_f32le', ac=1, ar='48000')

combined_out, err = (
    ffmpeg
        .concat(video_pipe, audio_pipe, v=1, a=1)
        .output("test_out.mkv", r='30/1')
        .overwrite_output()
        .run_async(pipe_stdin=True)
)

Terminal Output:

Traceback (most recent call last):
  File "...\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 3437, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-22-d3e96f1793d4>", line 9, in <module>
    .run_async(pipe_stdin=True)
TypeError: cannot unpack non-iterable Popen object
ffmpeg version 4.3.2-2021-02-20-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Input #0, rawvideo, from 'pipe:':
  Duration: N/A, bitrate: 196992 kb/s
    Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 608x1080, 196992 kb/s, 25 tbr, 25 tbn, 25 tbc
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, f32le, from 'pipe:':
  Duration: N/A, bitrate: 1536 kb/s
    Stream #1:0: Audio: pcm_f32le, 48000 Hz, mono, flt, 1536 kb/s
Stream mapping:
  Stream #0:0 (rawvideo) -> concat:in0:v0
  Stream #1:0 (pcm_f32le) -> concat:in0:a0
  concat:out:a0 -> Stream #0:0 (libvorbis)
  concat:out:v0 -> Stream #0:1 (libx264)
[libx264 @ 000001ce4a2f9380] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 000001ce4a2f9380] profile High, level 3.1, 4:2:0, 8-bit
[libx264 @ 000001ce4a2f9380] 264 - core 161 r3048 b86ae3c - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, matroska, to 'test_out.mkv':
  Metadata:
    encoder         : Lavf58.45.100
    Stream #0:0: Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 48000 Hz, mono, fltp (default)
    Metadata:
      encoder         : Lavc58.91.100 libvorbis
    Stream #0:1: Video: h264 (libx264) (H264 / 0x34363248), yuv420p, 608x1080, q=-1--1, 30 fps, 1k tbn, 30 tbc (default)
    Metadata:
      encoder         : Lavc58.91.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=    0 fps=0.0 q=0.0 Lsize=       4kB time=00:00:00.00 bitrate=N/A speed=   0x    
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:3kB muxing overhead: unknown
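
The input bitrates that ffmpeg reports in the logs above can be reproduced by hand, which is a quick way to check that the raw byte layout matches what the pipe inputs declare. A small worked calculation, assuming the values printed in the log (608x1080, yuv420p, 25 tbr, 48000 Hz mono f32le):

```python
# Values read from the "Input #0" and "Input #1" lines of the ffmpeg log.
width, height = 608, 1080
fps = 25            # the log reports "25 tbr" for the piped rawvideo input
sample_rate = 48000
channels = 1

# yuv420p stores a full-resolution luma plane plus two quarter-resolution
# chroma planes, i.e. 12 bits (1.5 bytes) per pixel.
frame_bytes = width * height * 3 // 2
video_kbps = frame_bytes * 8 * fps // 1000

# f32le is 4 bytes per sample per channel.
audio_kbps = sample_rate * channels * 4 * 8 // 1000

print(frame_bytes, video_kbps, audio_kbps)  # 984960 196992 1536
```

Both results match the 196992 kb/s and 1536 kb/s in the log. The same frame_bytes value can be used to count the frames in a captured rawvideo buffer, for example len(out) // frame_bytes.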

Any help resolving this would be greatly appreciated.

Mar 19 '21 11:03 RashiqAzhan

Did you solve it, @RashiqAzhan?

May 16 '21 21:05 LentilStew

Any solution to this?

Mar 16 '22 08:03 tamararobin1982

These solutions worked for me. I was editing the audio, so I converted it from bytes to a FilterableStream using input. input_ and video were already FilterableStreams. input_ had other audio tracks and subtitles, which were preserved.

For some reason, MKV files are not written if you pass format='mkv' to the output (possibly because ffmpeg names this muxer matroska, as in the "Output #0, matroska" line of the log above). 'pipe:0' is used for piping inputs in ffmpeg.input(), and 'pipe:1' is used for piping outputs in ffmpeg.output(). Although output() can take FilterableStreams as inputs, for some reason you cannot pipe them in using 'pipe:0'.
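
The audio passed to 'pipe:0' in the examples below has to be raw bytes in the layout the input declares: format='f32le' means little-endian 32-bit floats, one after another, at sample_rate samples per second. A minimal sketch of building such a buffer with the standard library. The 440 Hz test tone, the one-second length, and mono audio are assumptions made for illustration, not something from the original post.

```python
import math
import struct

sample_rate = 48000   # must match the ar= value given to ffmpeg.input()
frequency = 440.0     # hypothetical test tone

# One second of a mono sine wave as Python floats in the range [-1.0, 1.0].
samples = [
    0.5 * math.sin(2 * math.pi * frequency * n / sample_rate)
    for n in range(sample_rate)
]

# '<' forces little-endian and 'f' is a 32-bit float, which is the f32le layout.
audio = struct.pack("<%df" % len(samples), *samples)
```

The resulting audio object is a bytes value that can be handed to communicate(input=audio). With NumPy, an equivalent buffer would usually come from a float32 array converted to bytes.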

Adds an audio track in mkv as well and has subs (Very very very slow)

process3 = (
    ffmpeg.input('pipe:0', format='f32le', ar=sample_rate)
    .output(input_, filename=output_path)  # 'pipe:0'
    .overwrite_output()
    .run_async(pipe_stdin=True, pipe_stderr=True, quiet=True)
)
# communicate() returns a (stdout, stderr) tuple
out, err = process3.communicate(input=audio)

Works on mkv, no subs (Very very very slow)

process5 = (
    ffmpeg.input('pipe:0', format='f32le', ar=sample_rate)
    .output(video, filename=output_path)
    .overwrite_output()
    .run_async(pipe_stdin=True, pipe_stderr=True, quiet=True)
)
out, err = process5.communicate(input=audio)

Works on mkv, no subs (fast)

process7 = (
    ffmpeg.input('pipe:0', format='f32le', ar=sample_rate)
    # vcodec='copy' skips re-encoding the video, hence the speed-up
    .output(video, filename=output_path, vcodec='copy', acodec='aac', strict='experimental')
    .overwrite_output()
    .run_async(pipe_stdin=True, pipe_stderr=True, quiet=True)
)
out, err = process7.communicate(input=audio)

If you know the stream indices for the file, you can also do the following:

Works on mkv, has subs and also keep old audio track (Normal speed)

process8 = (
    ffmpeg.input('pipe:0', format='f32le', ar=sample_rate)
    # input_['0'], input_['2'], input_['1'] select individual streams of input_ by index
    .output(input_['0'], input_['2'], input_['1'], filename=output_path, vcodec='copy', acodec='aac')
    .overwrite_output()
    .run_async(pipe_stdin=True, pipe_stderr=True, quiet=True)
)
out, err = process8.communicate(input=audio)

Sep 02 '23 06:09 mAb-Engineers

I also need a solution to this issue

Jun 01 '24 08:06 xiaofanku
