ffmpeg-python

Video file in FTP to audio file in memory

Open gvidaspr opened this issue 2 years ago • 1 comment

Hello, I have a problem using ffmpeg: I am trying to fetch a video file from FTP, process it with ffmpeg, and then use the result in further steps (mainly sending it in an HTTP request for text-to-speech).

I have been trying to set up a pipeline using ffmpeg, but with no success. This is the code I have come up with:

from ftplib import FTP
import requests
import io
import ffmpeg

# ftp_host, ftp_user and ftp_password are placeholders for the real connection details
ftp = FTP(ftp_host)
ftp.login(user=ftp_user, passwd=ftp_password)

mp4_file = "Raw/RTV/2023.12.04_13.00_4796_1101_INFOWILNO1300_TVPWILNO.mp4"

# Download the whole video into an in-memory buffer
video_buffer = io.BytesIO()
ftp.retrbinary('RETR ' + mp4_file, video_buffer.write)
video_buffer.seek(0)

# Feed the video to ffmpeg on stdin and read the MP3 back from stdout
process = (
    ffmpeg
    .input('pipe:0')
    .output('pipe:1', format='mp3')
    .run_async(pipe_stdin=True, pipe_stdout=True, pipe_stderr=True, quiet=True)
)
out, err = process.communicate(input=video_buffer.read())

print(err)

ftp.quit()

And this is the error I get:

ffmpeg version 6.1-essentials_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --pkg-config=pkgconf --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-dxva2 --enable-d3d11va --enable-libvpl --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000019aa6961ec0] stream 1, offset 0x30: partial file
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000019aa6961ec0] Could not find codec parameters for stream 0 (Video: h264 (avc1 / 0x31637661), none, 1280x720, 375 kb/s): unspecified pixel format
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'pipe:0':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf60.14.100
  Duration: 00:15:00.00, start: 0.000000, bitrate: N/A
  Stream #0:0[0x1](und): Video: h264 (avc1 / 0x31637661), none, 1280x720, 375 kb/s, SAR 1:1 DAR 16:9, 50 fps, 50 tbr, 10000k tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.28.100 libx264
  Stream #0:1[0x2](und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 133 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:1 -> #0:0 (aac (native) -> mp3 (libmp3lame))
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000019aa6961ec0] stream 1, offset 0x30: partial file
[in#0/mov,mp4,m4a,3gp,3g2,mj2 @ 0000019aa6961d00] Error during demuxing: Invalid data found when processing input
[in#0/mov,mp4,m4a,3gp,3g2,mj2 @ 0000019aa6961d00] Error retrieving a packet from demuxer: Invalid data found when processing input
[aost#0:0/libmp3lame @ 0000019aa6e91ac0] No filtered frames for output stream, trying to initialize anyway.
Output #0, mp3, to 'pipe:1':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    TSSE            : Lavf60.16.100
  Stream #0:0(und): Audio: mp3, 44100 Hz, stereo, fltp (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.31.102 libmp3lame
[out#0/mp3 @ 0000019aa6972d80] video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[out#0/mp3 @ 0000019aa6972d80] Output file is empty, nothing was encoded(check -ss / -t / -frames parameters if used)
size=       0kB time=N/A bitrate=N/A speed=N/A

As you can see, ffmpeg reports several errors while processing the file, and I have not been able to work out what causes them. What is the problem? Note that saving the file locally and then processing it works fine.
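One possible reading of the `partial file` message, offered as an assumption and not a confirmed diagnosis: in an MP4 file the `moov` box holds the index that the demuxer needs, and when it is stored after the `mdat` box ffmpeg has to seek back to the media data once it has found the index. That works on a local file but not on `pipe:0`, which would match the behaviour described here. The sketch below checks the order of the top-level boxes in an in-memory buffer; the function names `top_level_boxes` and `moov_before_mdat` are made up for illustration.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (box_type, offset, size) for each top-level MP4 box in data."""
    offset = 0
    while offset + 8 <= len(data):
        # Each box starts with a 32-bit big-endian size and a 4-byte type.
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop instead of looping forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def moov_before_mdat(data: bytes) -> bool:
    """Return True if the index (moov) precedes the media data (mdat)."""
    order = [box_type for box_type, _, _ in top_level_boxes(data)]
    return (
        "moov" in order
        and "mdat" in order
        and order.index("moov") < order.index("mdat")
    )
```

With the code above, `moov_before_mdat(video_buffer.getvalue())` would show which layout the downloaded file has. If it returns `False`, the file is probably not suitable for piping as-is; remuxing it at the source with ffmpeg's `-movflags +faststart` option moves `moov` to the front.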

I have also tried passing process.stdin.write as the callback to ftp.retrbinary, but the output is the same.

Has anyone encountered a similar problem? If not, how did you do a memory -> ffmpeg -> memory transformation?
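Since processing a locally saved copy is reported to work, one hedged way to keep the rest of the pipeline in memory is to give ffmpeg a short-lived temporary file as input and still read the MP3 from `pipe:1`. The sketch below is only an illustration under that assumption: it calls the `ffmpeg` executable through `subprocess` instead of going through ffmpeg-python, assumes `ffmpeg` is on `PATH`, and the helper names `ffmpeg_mp3_command` and `video_bytes_to_mp3` are hypothetical.

```python
import os
import subprocess
import tempfile


def ffmpeg_mp3_command(input_path: str) -> list:
    """Build an ffmpeg command that writes the input's audio as MP3 to stdout."""
    return [
        "ffmpeg",
        "-hide_banner",
        "-loglevel", "error",
        "-i", input_path,  # seekable file input, unlike pipe:0
        "-vn",             # drop the video stream
        "-f", "mp3",
        "pipe:1",
    ]


def video_bytes_to_mp3(video_bytes: bytes) -> bytes:
    """Write the video to a temporary file, then return the MP3 bytes."""
    # delete=False plus an explicit close lets ffmpeg reopen the file on Windows.
    with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
        tmp.write(video_bytes)
        tmp_path = tmp.name
    try:
        result = subprocess.run(
            ffmpeg_mp3_command(tmp_path),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,  # raise CalledProcessError if ffmpeg fails
        )
        return result.stdout
    finally:
        os.remove(tmp_path)
```

A call such as `mp3_bytes = video_bytes_to_mp3(video_buffer.getvalue())` would then leave the audio in memory, ready to be sent with `requests`. Only the input touches the disk, which is the part that appears to need seeking.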

gvidaspr, Dec 05 '23 11:12