Text2Video
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
Hi, I can't run this program. I have a GPU server with Ubuntu 20 and CUDA 11.6, and I don't have permission to upgrade or downgrade the CUDA version. This is my approach:
apt-get update && \
apt-get install -y nano rsync htop git openssh-server python3-pip python3-venv ninja-build sox libsox-fmt-mp3 ffmpeg && \
ln -s /usr/bin/python3 /usr/bin/python && \
rm -rf /var/lib/apt/lists/*
git clone https://github.com/sibozhang/vid2vid.git
git clone https://github.com/sibozhang/Text2Video.git
python3 -m venv ./venv/vid2vid
source ./venv/vid2vid/bin/activate
pip3 install --upgrade pip
pip3 install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install numpy dominate requests pillow scipy pytz dominate pydub
pip install opencv-python
pip install zhon moviepy ffmpeg
Then I downloaded the fadg0 model files from Dropbox and put them here:
vid2vid/checkpoints
This is the list of files:
vid2vid/checkpoints/web
vid2vid/checkpoints/iter.txt
vid2vid/checkpoints/latest_net_D.pth
vid2vid/checkpoints/latest_net_D_f.pth
vid2vid/checkpoints/latest_net_D_T0.pth
vid2vid/checkpoints/latest_net_D_T1.pth
vid2vid/checkpoints/latest_net_D_T2.pth
vid2vid/checkpoints/latest_net_G0.pth
vid2vid/checkpoints/loss_log.txt
vid2vid/checkpoints/opt.txt
Then I changed
cxx_args = ['-std=c++11']
to
cxx_args = ['-std=c++14']
in these files:
vid2vid/models/flownet2_pytorch/networks/channelnorm_package/setup.py
vid2vid/models/flownet2_pytorch/networks/correlation_package/setup.py
vid2vid/models/flownet2_pytorch/networks/resample2d_package/setup.py
Then I ran this file:
vid2vid/models/flownet2_pytorch/install.sh
Finally, I ran this command:
sh Text2Video/text2video_tts.sh "hi how are you" fadg0 f
and got this error:
hi how are you
fadg0
f
input hi how are you
stripped_input hihowareyou
person fadg0
Traceback (most recent call last):
File "tts_request.py", line 54, in <module>
sound = AudioSegment.from_mp3('./input_audio/{person}/{file_name}.mp3'.format(person=person, file_name=file_name))
File "/b/venv/vid2vid/lib/python3.8/site-packages/pydub/audio_segment.py", line 796, in from_mp3
return cls.from_file(file, 'mp3', parameters=parameters)
File "/b/venv/vid2vid/lib/python3.8/site-packages/pydub/audio_segment.py", line 773, in from_file
raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
Output from ffmpeg/avlib:
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --prefix=/usr --extra-version=0ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-nvenc --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 31.100 / 56. 31.100
libavcodec 58. 54.100 / 58. 54.100
libavformat 58. 29.100 / 58. 29.100
libavdevice 58. 8.100 / 58. 8.100
libavfilter 7. 57.100 / 7. 57.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 5.100 / 5. 5.100
libswresample 3. 5.100 / 3. 5.100
libpostproc 55. 5.100 / 55. 5.100
[mp3 @ 0x5572708cd700] Failed to read frame size: Could not seek to 1160.
./input_audio/fadg0/hihowareyo.mp3: Invalid argument
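The "Failed to read frame size: Could not seek to 1160" message suggests that ffmpeg reached the file but could not parse it as MP3, so the file may be truncated or may not contain MP3 data at all. That is only a guess from the log; what tts_request.py actually wrote to disk is not shown here. A minimal diagnostic sketch, with hypothetical helper names `looks_like_mp3` and `describe_audio_file`, that reports the file size and checks the first bytes for an ID3v2 tag or an MPEG frame sync:

```python
from pathlib import Path


def looks_like_mp3(path):
    """Heuristic only: True if the file starts with an ID3v2 tag or an MPEG frame sync."""
    head = Path(path).read_bytes()[:3]
    if head.startswith(b"ID3"):
        return True
    # An MPEG audio frame header begins with 11 set bits (0xFF, then the top 3 bits of the next byte).
    return len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0


def describe_audio_file(path):
    """Return a one-line summary: missing, or size in bytes plus the heuristic verdict."""
    p = Path(path)
    if not p.is_file():
        return f"{p}: missing"
    size = p.stat().st_size
    verdict = "looks like MP3" if looks_like_mp3(p) else "does NOT look like MP3"
    return f"{p}: {size} bytes, {verdict}"


if __name__ == "__main__":
    # Path taken from the ffmpeg error above; run from the same directory as tts_request.py.
    print(describe_audio_file("./input_audio/fadg0/hihowareyo.mp3"))
```

If the file is very small or does not look like MP3, opening it in a text editor may show what was saved instead of audio. A positive result from this check does not prove the file is decodable; it only rules out the most obvious cases.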
file_name hihowareyo
Traceback (most recent call last):
File "align_english.py", line 212, in <module>
tmpbase = '/tmp/' + os.environ['USER'] + '_' + str(os.getpid())
File "/usr/lib/python3.8/os.py", line 675, in __getitem__
raise KeyError(key) from None
KeyError: 'USER'
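The second traceback is separate from the MP3 problem: `os.environ['USER']` raises `KeyError` when the `USER` environment variable is not set, which is common in containers and some non-login shells. A minimal sketch of a fallback for line 212 of align_english.py, assuming only the line shown in the traceback; the helper `current_user` and its `default` value are hypothetical and not part of the repository:

```python
import getpass
import os


def current_user(default="user"):
    """Return the current user name without assuming $USER is set."""
    user = os.environ.get("USER")
    if user:
        return user
    try:
        # getpass.getuser() checks LOGNAME, USER, LNAME, USERNAME, then the password database.
        return getpass.getuser()
    except Exception:
        # No environment variable and no passwd entry for this uid: fall back to a fixed name.
        return default


# Same shape as the failing line, but without the hard dependency on $USER.
tmpbase = "/tmp/" + current_user() + "_" + str(os.getpid())
```

An alternative that avoids editing the script is to set the variable in the shell before running it, for example `export USER=$(whoami)`.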
would you please guide me how solve the problem ?