
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Open · se8820726 opened this issue 1 year ago · 1 comment

Hi, I can't run this program. I have a GPU server with Ubuntu 20 and CUDA 11.6, and I don't have permission to upgrade or downgrade the CUDA version. This is my approach:

apt-get update && \
  apt-get install -y nano rsync htop git openssh-server python3-pip python3-venv ninja-build sox libsox-fmt-mp3 ffmpeg && \
  ln -s /usr/bin/python3 /usr/bin/python && \
  rm -rf /var/lib/apt/lists/*



git clone https://github.com/sibozhang/vid2vid.git
git clone https://github.com/sibozhang/Text2Video.git

python3 -m venv ./venv/vid2vid
source ./venv/vid2vid/bin/activate
pip3 install --upgrade pip
pip3 install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install numpy dominate requests  pillow scipy pytz dominate pydub
pip install opencv-python
pip install zhon moviepy ffmpeg 

Then I downloaded the fadg0 model files from Dropbox and put them in vid2vid/checkpoints. This is the list of files:

vid2vid/checkpoints/web
vid2vid/checkpoints/iter.txt
vid2vid/checkpoints/latest_net_D.pth
vid2vid/checkpoints/latest_net_D_f.pth
vid2vid/checkpoints/latest_net_D_T0.pth
vid2vid/checkpoints/latest_net_D_T1.pth
vid2vid/checkpoints/latest_net_D_T2.pth
vid2vid/checkpoints/latest_net_G0.pth
vid2vid/checkpoints/loss_log.txt
vid2vid/checkpoints/opt.txt

Then I changed cxx_args = ['-std=c++11'] to cxx_args = ['-std=c++14'] in these files:

vid2vid/models/flownet2_pytorch/networks/channelnorm_package/setup.py
vid2vid/models/flownet2_pytorch/networks/correlation_package/setup.py
vid2vid/models/flownet2_pytorch/networks/resample2d_package/setup.py
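
The same edit can be scripted so it is repeatable on a fresh clone. This is only a minimal sketch: `bump_cxx_std` is a hypothetical helper name, and it assumes the flag appears in each setup.py as the literal string `-std=c++11`.

```python
from pathlib import Path

# The three setup.py files listed above.
SETUP_FILES = [
    "vid2vid/models/flownet2_pytorch/networks/channelnorm_package/setup.py",
    "vid2vid/models/flownet2_pytorch/networks/correlation_package/setup.py",
    "vid2vid/models/flownet2_pytorch/networks/resample2d_package/setup.py",
]


def bump_cxx_std(setup_path, old="-std=c++11", new="-std=c++14"):
    """Replace the C++ standard flag in one file; return True if it changed."""
    path = Path(setup_path)
    text = path.read_text()
    if old not in text:
        return False  # nothing to replace (already edited, or flag spelled differently)
    path.write_text(text.replace(old, new))
    return True


if __name__ == "__main__":
    for name in SETUP_FILES:
        if Path(name).is_file():  # skip quietly when run outside the checkout
            print(name, "updated" if bump_cxx_std(name) else "unchanged")
```

Run it from the directory that contains the vid2vid checkout, before running install.sh.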

Then I ran this file: vid2vid/models/flownet2_pytorch/install.sh

Finally, I ran this command: sh Text2Video/text2video_tts.sh "hi how are you" fadg0 f

and got this error:

hi how are you
fadg0
f
input hi how are you
stripped_input hihowareyou
person fadg0
Traceback (most recent call last):
  File "tts_request.py", line 54, in <module>
    sound = AudioSegment.from_mp3('./input_audio/{person}/{file_name}.mp3'.format(person=person, file_name=file_name))
  File "/b/venv/vid2vid/lib/python3.8/site-packages/pydub/audio_segment.py", line 796, in from_mp3
    return cls.from_file(file, 'mp3', parameters=parameters)
  File "/b/venv/vid2vid/lib/python3.8/site-packages/pydub/audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
[mp3 @ 0x5572708cd700] Failed to read frame size: Could not seek to 1160.
./input_audio/fadg0/hihowareyo.mp3: Invalid argument
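
The ffmpeg message above ("Failed to read frame size") suggests the saved file may be truncated or may not contain MP3 data at all, for example if the TTS request returned an error body that was written out with an .mp3 extension. That is an assumption, not something the log confirms. A quick way to check is to look at the first bytes of the file before passing it to pydub. The sketch below uses a hypothetical helper, `looks_like_mp3`, which only tests for an ID3 tag or an MPEG frame sync at the start of the file; it does not validate the whole stream.

```python
import os


def looks_like_mp3(path):
    """Return True if the file starts with an ID3 tag or an MPEG frame sync."""
    if not os.path.isfile(path) or os.path.getsize(path) < 4:
        return False  # missing or too short to hold a frame header
    with open(path, "rb") as f:
        head = f.read(4)
    if head[:3] == b"ID3":
        return True  # ID3v2 tag precedes the audio frames
    # An MPEG audio frame header begins with 11 set bits (0xFF, then 0b111xxxxx).
    return head[0] == 0xFF and (head[1] & 0xE0) == 0xE0


def describe_file(path, n=32):
    """Return the size and first n bytes of a file, for manual inspection."""
    with open(path, "rb") as f:
        head = f.read(n)
    return "%d bytes, starts with %r" % (os.path.getsize(path), head)
```

If `looks_like_mp3('./input_audio/fadg0/hihowareyo.mp3')` returns False, `describe_file` on the same path shows what was actually written, which helps tell a failed download apart from a decoding problem.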

file_name hihowareyo
Traceback (most recent call last):
  File "align_english.py", line 212, in <module>
    tmpbase = '/tmp/' + os.environ['USER'] + '_' + str(os.getpid())
  File "/usr/lib/python3.8/os.py", line 675, in __getitem__
    raise KeyError(key) from None
KeyError: 'USER'
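
This second traceback is separate from the decoding failure: `os.environ['USER']` raises KeyError whenever the USER variable is not set, which is common in containers and non-login shells. A minimal sketch of a more tolerant way to build the same temp path follows. `tmp_base` is a hypothetical helper name, and the `uid<N>` fallback is an assumption for illustration; it is not what align_english.py currently does.

```python
import getpass
import os


def tmp_base():
    """Build '/tmp/<user>_<pid>' without requiring the USER variable."""
    try:
        # getpass.getuser() checks LOGNAME, USER, LNAME and USERNAME,
        # then falls back to the password database.
        user = getpass.getuser()
    except (KeyError, OSError):
        # No env var and no passwd entry for this uid (possible in containers).
        user = "uid%d" % os.getuid()
    return "/tmp/" + user + "_" + str(os.getpid())
```

Setting the variable in the shell before launching the script (for example `export USER=$(whoami)`) should also avoid the KeyError without touching the code.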

Could you please guide me on how to solve this problem?

se8820726, Mar 21 '23 08:03