Hitting mysterious hangs on a specific frame when upscaling

Open ghostis opened this issue 2 years ago • 7 comments

Thanks so much for this tool!

Some runs have been OK, some are not.

I'm running into mysterious hangs on a specific frame during some runs. The exact frame depends on the source video file, but it's always the same frame for a particular file.

Setup:

Debian 11 (also experienced on Debian 10)
RTX2060 in TB3 eGPU on Intel NUC
NVIDIA Driver 520.56.06
CUDA Version: 11.8
FFMPEG version: ffmpeg version 4.3.5-0+deb11u1 ... built with gcc 10 (Debian 10.2.1-6)

Tried -p1, -p2, and -p4 (depending on algorithm's capabilities)

Tried 515 "Production Branch" NVIDIA driver (vs 520 "New Features" driver currently installed)

First hit this on Debian 10 with 4.x kernel. Tried upgrading to Debian 11 with 5.x kernel.

It can take a while to hit with RealSR because of the compute cost per frame, but with realcugan on animation, I hit it quickly on an affected file.

Source files are mpeg2 (or are converted to), deinterlaced if needed.

Happens with both RealSR and realcugan, so I think it may be an ffmpeg issue.
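One way that suspicion could be checked is to have ffmpeg decode the whole file on its own, with no upscaler involved, and discard the result through the null muxer. The following is a minimal sketch under that assumption; `decode_check` and the `FFMPEG` override variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# FFMPEG is a hypothetical override so a different binary can be substituted.
FFMPEG="${FFMPEG:-ffmpeg}"

# decode_check FILE
# Decodes every stream of FILE and throws the output away (-f null -).
# Only error-level messages are printed; the return value is ffmpeg's
# own exit status.
decode_check() {
    "$FFMPEG" -nostdin -v error -i "$1" -f null -
}

# Example call, commented out so that sourcing this file runs nothing:
# decode_check input.mpg && echo "ffmpeg finished" || echo "ffmpeg failed"
```

ffmpeg can exit with status 0 while still printing decoder errors, so the printed messages matter as much as the exit status. If this standalone decode also stalls at the same point, that would point toward ffmpeg or the file rather than the upscaler.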

strace shows strange timeouts for the Python process:

---SNIP---
read(20, "\200\4\225\16\0\0\0\0\0\0\0\214\7#RETURN\224N\206\224.", 25) = 25
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=100000}) = 0 (Timeout)
futex(0x93a568, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=2328, tv_nsec=432330575}, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a568, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=2328, tv_nsec=433830821}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a5c4, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x93a5c8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x93a56c, FUTEX_WAKE_PRIVATE, 1) = 1
write(20, "\0\0\0+\200\4\225 \0\0\0\0\0\0\0(\214\f7f8cdf299f00\224\214"..., 47) = 47
futex(0x93a568, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=2328, tv_nsec=440655535}, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a56c, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=2328, tv_nsec=441497048}, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x93a570, FUTEX_WAKE_PRIVATE, 1) = 0
read(20, "\0\0\0\31", 4) = 4
read(20, "\200\4\225\16\0\0\0\0\0\0\0\214\7#RETURN\224N\206\224.", 25) = 25
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=100000}) = 0 (Timeout)
---SNIP---


I'm willing to troubleshoot more, but I'm not sure what to try next.

ghostis avatar Nov 17 '22 18:11 ghostis

I've noticed similar behavior. Certain files complete successfully. Other files get stuck at a certain point and will get stuck again at the same point when the process is run again. This often occurs at 99% with only a few frames left to process, although I recently tried a file which got stuck at 80%.

colino17 avatar Apr 05 '23 18:04 colino17

@colino17 any luck dealing with this? I'm having the same problem when processing a small .mp4 file with srmd.

ealtamir avatar Jul 06 '23 15:07 ealtamir

My workaround is to re-encode files using some standard settings before I upscale them. I wrote a small bash script that does this for every video file in the folder it is run from, with the outputs going to an "out" subfolder.

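#!/bin/bash
# Brace expansion below is a bash feature, so this must run under bash, not plain sh.

# FOR EVERY VIDEO FILE IN FOLDER
for i in *.{mkv,mp4,avi,mov,xvid,webm}
do
  # Skip patterns that matched no file
  [ -e "$i" ] || continue
  # CONVERT TO MP4, then move the original only if the re-encode succeeded
  ffmpeg -hide_banner -i "$i" -sn -c:v libx264 -crf 10 -preset ultrafast -movflags +faststart -c:a aac "out/${i%.*}.mp4" &&
    mv "$i" "done/$i"
done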

This encodes everything as an H.264 (libx264) MP4 file with AAC audio and the moov atom at the start of the file. The CRF is set very low to minimize quality loss, and the preset is set to ultrafast because these are temporary files used only for upscaling, so speed matters more than file size.

You'll need a "done" subfolder (your original files move here after they've been processed) and an "out" subfolder (where the re-encoded files will be stored).
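As a small illustration of how the output name is derived, the `${i%.*}` expansion removes only the last extension before `.mp4` is appended. The helper `out_path` below is a hypothetical name that just mirrors that one expression:

```shell
# out_path FILE: print the output path the script would use for FILE,
# i.e. "out/" + FILE without its last extension + ".mp4".
out_path() {
    printf 'out/%s.mp4\n' "${1%.*}"
}

# Example call, commented out so that sourcing this file prints nothing:
# out_path "my clip.part1.mkv"
```

One consequence of this naming is that two sources sharing a stem, such as `clip.mkv` and `clip.avi`, map to the same output file, so it may be worth renaming one of them first.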

colino17 avatar Jul 06 '23 15:07 colino17

Yes, this worked for me. Many thanks for the tip!

ealtamir avatar Jul 07 '23 09:07 ealtamir

In my case, my files were already libx264-encoded, but I transcoded them just to be sure. The behaviour was also different, as it always hung on the first frame.
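For anyone wanting to confirm a file's codec before transcoding, a minimal sketch using ffprobe might look like the following. It assumes ffprobe is installed; `video_codec`, `is_h264`, and the `FFPROBE` override are hypothetical names used only for illustration.

```shell
# FFPROBE is a hypothetical override so a different binary can be substituted.
FFPROBE="${FFPROBE:-ffprobe}"

# video_codec FILE: print the codec name of the first video stream.
# ffprobe reports the codec ("h264"), not the encoder ("libx264").
video_codec() {
    "$FFPROBE" -v error -select_streams v:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# is_h264 FILE: succeed if the first video stream is H.264.
is_h264() {
    [ "$(video_codec "$1")" = "h264" ]
}

# Example call, commented out so that sourcing this file runs nothing:
# is_h264 input.mp4 && echo "already H.264"
```

A matching codec does not guarantee the file is free of whatever triggers a hang, so this only answers whether the codec already matches what the re-encode would produce.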

In the end, the reason was that the example command I picked from here doesn't work if I just replace waifu2x with realsr. The example command spawns three processes with -p3, which is fine for waifu2x, which runs at ~3fps on my eight-year-old hardware (NVIDIA GeForce GTX 960).

RealSR will only run properly with -p2 or -p1 and performance is only marginally improved in -p2 mode, so I'm now running it in -p1 mode.

Ghostbird avatar Aug 29 '23 19:08 Ghostbird

Hmmm, I also just had a mysterious hang. After leaving the computer unattended for a while with two different invocations of video2x running, I came back to find both processes "hanging". I use quotes because the processes were not fully hung: the elapsed-time indicator was still moving forward, but all other indicators (including the FPS and the frame count) did not move.

arximboldi avatar Jun 27 '24 12:06 arximboldi

OK... so this hang also seems to be deterministic, and it only happens with some algorithms. Really annoying...

arximboldi avatar Jun 27 '24 14:06 arximboldi