
Lossless YUV input, is it possible?

Open Dioxaz opened this issue 8 years ago • 7 comments

First of all, congratulations for making something like waifu2x and waifu2x-caffe possible, as they are clearly the future of upscaling, especially for low-resolution sources, which are starting to look very dated and uncomfortable to work with or look at.

I recently did some tests with CUDA acceleration and am really impressed by the results. Problem: what interests me isn't upscaling still images but low-resolution videos which you can still find on streaming platforms like Youtube and Nico Nico Douga.

It is indeed possible to convert such videos to an image sequence and then pass the folder to waifu2x-caffe. Second problem: waifu2x seems to operate better in native YUV than with YUV-converted-to-RGB sources, and it creates some nasty upscaled jaggies on chroma in contrasted solid-colour areas.

I'll be using this frame (512x384) example from this video to illustrate my point: [image: sm24627927_0473]

Below is the PNG input converted from video (watch the green ribbon next to the hair, on the left notably): [image: sm246279270473 y scale width 1440]

One solution I found is passing JPEGs converted from the video. This way, the yuv420p colorspace is kept: waifu2x indeed treats them as YUV, and with the "2-D illust (Y Model)" model the nasty chroma jaggies are gone. But (third problem) I just can't seem to produce truly lossless JPEGs from ffmpeg, so my resulting pictures always have a tiny amount of degradation, which is problematic when the source video resolution is 512x384 or 640x360 (and I simply failed to encode anything in ljpeg / lossless JPEG, as I always get an error from ffmpeg).

Example of a JPEG passed to waifu2x. Notice the slight degradation, even at max settings with ffmpeg (qmin and qmax at 1 and -q:v at 0). Also, no jaggies around the green ribbon: [image: sm246279270473 y scale width 1440 jpg]
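For reference, the command I used for these JPEGs looked roughly like this (reconstructed from memory, so the exact invocation and filename pattern may have differed):

ffmpeg -i sm24627927.mp4 -c:v mjpeg -qmin 1 -qmax 1 -q:v 0 sm24627927_%04d.jpg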

So far my only working solution is creating a lossless WebP image sequence from my videos; only this way am I able to keep waifu2x working in YUV without conversion to RGB. But, as it would be too easy, WebP encoding is painfully slow (2fps average on my ageing Q6600): [image: sm246279270473 y scale width 1440 webp]
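The WebP sequence itself came from something along these lines (again reconstructed from memory; libwebp with its lossless switch):

ffmpeg -i sm24627927.mp4 -c:v libwebp -lossless 1 sm24627927_%04d.webp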

I also tried JPEG2000 and TIFF image sequences in order to keep the yuv420p colorspace, but unfortunately waifu2x converts them to RGB prior to processing and the nasty chroma jaggies are back. [image: sm246279270473 y scale width 1440 jp2]

So this leads me to my question (I hope I wasn't boring you). Is it possible to feed lossless YUV still pictures to waifu2x, other than via the really slow WebP format? Is there some particular manipulation I forgot, or is it simply not possible at the moment?

Regards, and keep up the good work.

Dioxaz avatar Dec 07 '16 19:12 Dioxaz

Is PNG not enough? It's a lossless image format with DEFLATE (gzip-style) compression, and it's 4~5x faster than WebP encoding.

ffmpeg -i video.mp4 -f image2 frames/%06d.png

nagadomi avatar Dec 07 '16 20:12 nagadomi

This is the first thing I actually tried. The second picture in my post shows how it looks upscaled.

PNG only supports RGB to my knowledge (no YUV in its specs). I'm aware caffe is not really meant for video, but I'd really love an efficient YUV upscaling chain when dealing with video (which means the yuv420p colour format in most cases) rather than still images. Ideally, when editing video, one should avoid multiple colorspace conversions as much as possible. Those chroma jaggies do look really distracting compared to what I'm used to with my nnedi3_rpow2+warpsharp+deen filter chain. Sure, I could separate chroma and luma, but that would make the process even more cumbersome.
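For what it's worth, if I ever went the separate-luma-and-chroma route, ffmpeg's extractplanes filter would be one way to split the planes into greyscale image sequences (untested sketch, filenames just illustrative):

ffmpeg -i sm24627927.mp4 -filter_complex "extractplanes=y+u+v[y][u][v]" -map "[y]" y_%04d.png -map "[u]" u_%04d.png -map "[v]" v_%04d.png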

WebP is the only format that lets me keep the upscaling chain in YUV during the whole process (except the resulting upscaled pictures, which are still RGB anyway, but that's okay at this stage). JPEG2000 and TIFF (the ones I tried), even when storing yuv420p frames, were converted back to RGB on input by caffe.

For the moment, the only existing Avisynth waifu2x filter is inefficient and doesn't support CUDA, which is why I'm here. :P (There's also the AviUtl solution, which I haven't tried yet.)

Dioxaz avatar Dec 07 '16 21:12 Dioxaz

I am guessing this problem is caused by gamma handling in waifu2x-caffe. The PNG image output by ffmpeg contains an embedded gamma parameter, and waifu2x-caffe ignores it. Could you upload the source images? (PNG and WebP)

> For the moment, the only existing Avisynth waifu2x filter is inefficient and doesn't support CUDA, which is why I'm here. :P (There's also the AviUtl solution, which I haven't tried yet.)

VapourSynth supports a waifu2x filter that uses waifu2x-caffe: https://github.com/HomeOfVapourSynthEvolution/VapourSynth-Waifu2x-caffe (but I am not familiar with this software).

nagadomi avatar Dec 07 '16 22:12 nagadomi

Here's the source image (frame 473), first as PNG: http://www.4shared.com/photo/3-cU3bTNce/sm24627927_0473.html

And WebP: http://www.4shared.com/file/A5qmuL0Eba/sm24627927_0473.html

Both are straight from ffmpeg. The PNG linked here should be identical to the first picture I linked in my first post; I'm reuploading it just in case.

As I said and noticed, the WebP produced by ffmpeg (with the "-lossless 1" option) is kept as yuv420p by default, and waifu2x-caffe never attempts to convert it to RGB prior to upscaling; only the resulting picture is RGB. Also, the PNG produced by ffmpeg (2.8.4 in my case) doesn't contain any gAMA chunk but features a pHYs one instead, which appears to be a bit useless to me.
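As a side note, the pixel format ffmpeg reports for a given file can be double-checked with ffprobe, for example:

ffprobe -v error -select_streams v:0 -show_entries stream=pix_fmt -of default=noprint_wrappers=1 sm24627927_0473.webp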

I think I need to tell ffmpeg to upscale chroma better when converting to PNG (it looks like it's using nearest neighbour to convert from yuv420p to RGB). That should "minimize the damage". Otherwise, I think I can live with it for the moment. It's not too bad if better native YUV support can't be implemented yet, as the tool is designed for pictures anyway (even if it would indeed be nicer).

For video, I agree VapourSynth is the better choice and is indeed the one I should try, as I'm also not familiar with it yet (and its waifu2x filter supports CUDA!).

EDIT: I found out how to resample chroma directly in ffmpeg:

ffmpeg -i sm24627927.mp4 -vf scale=512:384 -sws_flags accurate_rnd+full_chroma_int+full_chroma_inp sm24627927_%04d.png

Visually, this gives me results very similar to my WebP inputs, so I may proceed this way for my next attempts (note: "-vf scale=512:384" might be useless in this case). In fact, it seems waifu2x-caffe never operates in YUV internally; instead, I'm under the impression that JPG and WebP inputs have their chroma resampled prior to processing, while other formats don't. So I think I've found a workaround for the time being. Native YUV input and processing would still be very nice (even taking Avisynth's EBMP format on input), but I don't think it should be a priority, as properly chroma-resampled PNGs (from video) already give very nice results.
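An equivalent way to write the command above is to keep the sws flags on the scale filter itself (I haven't verified that both forms behave identically on every ffmpeg build, so treat this as a sketch):

ffmpeg -i sm24627927.mp4 -vf "scale=512:384:flags=accurate_rnd+full_chroma_int+full_chroma_inp" sm24627927_%04d.png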

Dioxaz avatar Dec 08 '16 18:12 Dioxaz

The yuv420p format still does not appear to be supported within caffe. TIFF files are also converted to RGB during the conversion process, causing colour loss; hence, I would like to request adding a pixel format option in a new version. We want to upscale without colour loss.

ghost avatar Jun 29 '19 04:06 ghost

I've noticed over time that depending on the denoising setting you use, you get not only stronger denoising but also chroma smoothing! If fed a properly chroma-upscaled PNG on input (using ConvertToRGB24(matrix="Rec709") in Avisynth, for instance, if the source is H264), with a denoising setting of 1 and the 2-D illust (UpRGB) model, caffe already produces jaggy-free results. That is, on animation or MMD-like material.
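For anyone preparing frames with ffmpeg rather than Avisynth, I'd expect something along these lines to do a roughly equivalent Rec.709 conversion to RGB (untested sketch, assuming a BT.709 H264 source; filenames are just examples):

ffmpeg -i input.mp4 -vf "scale=w=iw:h=ih:in_color_matrix=bt709:flags=accurate_rnd+full_chroma_int+full_chroma_inp" -pix_fmt rgb24 frames/%04d.png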

Also, there's now dandere2x, an alternative for processing video more efficiently. I haven't tried it yet, but it might give better and faster results.

Dioxaz avatar Jun 29 '19 10:06 Dioxaz

While avoiding conversions is always nice (though it's arguable whether the colorspace couldn't affect the NN design), this "night and day" difference has nothing to do with that. These were all bugs in ffmpeg.

The latest post hit https://trac.ffmpeg.org/ticket/9167, while the OP ran into https://trac.ffmpeg.org/ticket/979 too.

mirh avatar Feb 04 '22 15:02 mirh