Question: Type of algorithm

Open • HelpSeeker opened this issue 4 years ago • 13 comments

Why describe this algorithm as an upscaling algorithm and not a sharpening algorithm? From what I understand, it doesn't have anything to do with the actual scaling process and relies on conventional scaling algorithms.

HelpSeeker • Aug 15 '19 07:08

We have to define what "upscaling" is. If you see bicubic upscaling as a higher-order expansion of bilinear, this algorithm can also be seen as an enhancement to bilinear upscaling, as it can be used with it. Bilinear is so trivial now that people do not even see it as "upscaling" anymore. Since this algorithm can achieve reasonable results even when used with something as trivial as bilinear, you can say that the heavy lifting of the upscaling operation was done by Anime4K, and not the bilinear step.

The main goal is to let users watch 1080p anime on 2160p screens. The most descriptive way of saying that is to call this an upscaling algorithm. But we know that technically, it is an iterative edge refinement algorithm.

bloc97 • Aug 15 '19 17:08

> Bilinear is so trivial now that people do not even see it as "upscaling" anymore

But that is simply wrong and you don't make the situation better by motivating people to use incorrect terminology.

> you can say that the heavy lifting of the upscaling operation was done by Anime4K

I don't see the logic behind that statement. Your algorithm simply has nothing to do with the scaling process (the process of increasing the resolution of a video signal). You can chain many operations on a frame to get visually appealing results. That doesn't change the role of each operation.

> But we know that technically, it is an iterative edge refinement algorithm.

That is the problem though. Not everyone knows and this will only cause further confusion. Is it really worth it to spread misinformation, just because people tend to be more interested in upscaling than edge refinement algorithms?

HelpSeeker • Aug 15 '19 20:08

By your definition, most early neural network upscaling algorithms are not upscaling either. Before the discovery of the effectiveness of Transposed Convolutions, most upscaling algorithms such as SRCNN and VDSR took as input an already upscaled version of the low-resolution image, produced with bilinear/bicubic filtering. The original implementation of waifu2x does this too; however, I'm not sure if they still do it now that people are using Transposed Convolutions or PixelShuffle Layers everywhere.

bloc97 • Aug 15 '19 20:08
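
For readers unfamiliar with the PixelShuffle layers mentioned above, here is a minimal pure-Python sketch of the rearrangement such a layer performs: it trades channels for spatial resolution, so the network itself produces the extra pixels instead of receiving a pre-upscaled input. The function name `pixel_shuffle`, the nested-list layout, and the channel ordering (chosen to match the convention common in deep learning frameworks) are assumptions made for illustration; nothing here is taken from SRCNN, VDSR, or waifu2x.

```python
def pixel_shuffle(channels, r):
    """Rearrange nested lists of shape (C*r*r, H, W) into (C, H*r, W*r).

    Hypothetical helper for illustration; the channel ordering is an assumption.
    """
    c_in = len(channels)
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(channels[0]), len(channels[0][0])
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # Each output pixel is read from the input channel selected
                # by its position inside the r x r block.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = channels[src][y // r][x // r]
    return out


# Four 1x2 feature maps become a single 2x4 image.
low_res_features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
high_res = pixel_shuffle(low_res_features, 2)
# high_res == [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

In a pre-upsampling design the interpolation step fixes the output size before the network runs; with a rearrangement like this one, the size change happens at the end of the network.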

Only if you restrict the definition of resolution to the number of pixels. Image resolution can also refer to the detail an image holds, which is why super-resolution algorithms can fairly claim to increase the resolution of an image.

HelpSeeker • Aug 15 '19 21:08

Sorry, but I just don't see the difference between

Low Resolution -> Bicubic -> VDSR -> High Resolution

and

Low Resolution -> Bicubic -> Anime4K -> High Resolution

The "Upscaling" algorithm in these two cases is not Bicubic, as its contribution is minimal.

bloc97 • Aug 16 '19 01:08

Anime4K in no way increases the resolution (neither pixel- nor detail-wise). Anime4K alters the image in a very specific way, not with the goal of increasing the resolution, but to make the content more pleasant for a certain audience (people who like sharp edges over everything else). In fact, you destroy a lot of detail when running Anime4K.

From my perspective you are trying to sell me

Low Resolution -> Bicubic -> High Resolution -> Denoiser -> Altered High Resolution

as

Low Resolution -> Bicubic -> Denoiser -> High Resolution

HelpSeeker • Aug 16 '19 08:08

The smoothing of texture is something that can be taken care of with a better line detection algorithm, which is what we're currently working on.
Other upscalers use those better detection algorithms, and without them they would destroy texture too. Given that such a simple algorithm, one that can be described in 5 lines of pseudocode, works as well as it does, we can incorporate the techniques other algorithms use and it will only get better.

Otherwise even right now, if you take the Anime4K upscales and downscale them to 1080p, you will notice that the detail loss is minimal.

bloc97 • Aug 16 '19 22:08
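
The round-trip check suggested above (downscale the upscaled frame and compare it with the original) could be sketched as follows. This is an illustration only: the thread does not say which downscaling filter or error metric to use, so the 2x2 box average in `box_downscale_2x` and the mean absolute difference in `mean_abs_diff` are assumptions, and the tiny `upscaled` image is a made-up stand-in, not real Anime4K output.

```python
def box_downscale_2x(image):
    """Average each 2x2 block into one pixel (height and width must be even)."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two same-sized images."""
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Hypothetical grayscale data: a 2x2 original and a 4x4 upscale of it.
original = [[10, 20],
            [30, 40]]
upscaled = [[10, 10, 20, 20],
            [10, 10, 20, 20],
            [30, 30, 40, 40],
            [30, 30, 40, 40]]

round_trip = box_downscale_2x(upscaled)
error = mean_abs_diff(original, round_trip)
```

A small `error` means the upscaled frame still agrees with the source at the original resolution. It says nothing about whether the added high-resolution content looks correct, so it is a lower bar than a full quality comparison.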

The waifu2x neural network actually uses 2x2 pixel duplication for the input (nearest neighbor), so yeah, it's technically a straight filter, not an upscaler.

marcan • Aug 18 '19 19:08
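
As a sketch of what "2x2 pixel duplication" means, the following pure-Python function repeats every pixel into a 2x2 block, which is nearest-neighbor upscaling by a factor of two. The helper name `duplicate_2x2` and the nested-list image layout are assumptions for illustration; this is not code from waifu2x.

```python
def duplicate_2x2(image):
    """Nearest-neighbor 2x upscale: each pixel becomes a 2x2 block."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]  # repeat horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the widened row vertically
    return out


small = [[1, 2],
         [3, 4]]
big = duplicate_2x2(small)
# big == [[1, 1, 2, 2],
#         [1, 1, 2, 2],
#         [3, 3, 4, 4],
#         [3, 3, 4, 4]]
```

The output has four times as many pixels but no new information, which is why a network fed this kind of input can be described as a filter applied after the size change.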

> The waifu2x neural network actually uses 2x2 pixel duplication for the input (nearest neighbor)

Do you have a reference for this?

If I quickly check the paper, I see that:

[image: paper]

and that:

[image: pipeline]

I think it makes sense to call Waifu2x a Super-Resolution algorithm.


As for Anime4K, it depends on what the word encompasses. If it is the whole pipeline, then it is an upscaler. If it is just the novel part, then it is a sharpening filter, which can be used in a larger pipeline to upscale anime images with a pleasing effect.

I think the confusion stems from the fact that @HelpSeeker thought Anime4K was claiming to do Super-Resolution, which it does not do because it is not adding information to the image, e.g. by merging several views, or by bringing in information learnt on other images.

woctezuma • Aug 18 '19 20:08

Depends on your definition of Super-Resolution too. If you define SR as recovering texture detail, then no, Anime4K is not a texture SR algorithm. However, it can recover lines from blurry upscales, thus can be considered a line SR algorithm.

I think the real confusion here is between a SISR (single-image super-resolution) Algorithm and a General-Purpose SISR Algorithm.
Anime4K is not a General-Purpose SISR Algorithm.
Waifu2x is not a General-Purpose SISR Algorithm since it was only trained on Anime Art. (With the exception of the waifu2x trained on pictures, but that's basically SRCNN with modifications.)
SRCNN is a General-Purpose SISR Algorithm since it can be trained on any subset of images from our universe.
Bicubic is a General-Purpose SISR Algorithm as it is not biased towards any kind of spatial data.

If you apply SR algorithms for MRI scans to real pictures, you would get garbage, yet they are still SR algorithms. (And if you look carefully at the implementations, MRI SR algorithms are similar to denoising algorithms, yet they are called SR.)

bloc97 • Aug 18 '19 21:08

Actually, I had the same confusion after understanding this algorithm. It seems to work as a sharpening algorithm like unsharp masking (USM), but specially designed for anime, rather than as an upscaling algorithm like bicubic interpolation (which actually upscales your image to a bigger size). However, it is still a very interesting method these days, when ML-based methods get so much hype. Thank you for sharing this.

net2cn • Sep 26 '19 04:09
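
To make the comparison with USM concrete, here is a minimal one-dimensional sketch of unsharp masking: blur the signal, then add back a scaled copy of the difference between the original and the blur. The [1, 2, 1] / 4 kernel, the edge clamping, and the names `blur_1d` and `unsharp_mask_1d` are assumptions chosen for brevity. This illustrates generic sharpening and is not the Anime4K algorithm.

```python
def blur_1d(row):
    """Blur with a [1, 2, 1] / 4 kernel, clamping at the borders."""
    n = len(row)
    out = []
    for i in range(n):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, n - 1)]
        out.append((left + 2 * row[i] + right) / 4)
    return out


def unsharp_mask_1d(row, amount=1.0):
    """Sharpen by adding amount * (original - blurred) to the original."""
    blurred = blur_1d(row)
    return [p + amount * (p - b) for p, b in zip(row, blurred)]


edge = [0, 0, 0, 100, 100, 100]
sharpened = unsharp_mask_1d(edge)
# sharpened == [0.0, 0.0, -25.0, 125.0, 100.0, 100.0]
```

The output has the same number of samples as the input. The edge becomes steeper (with overshoot on both sides), but the sample count does not change, which is the distinction between sharpening and scaling drawn in the comment above.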

Re: waifu2x using a box/nearest neighbor filter, see:

https://github.com/nagadomi/waifu2x/blob/master/lib/reconstruct.lua#L192

marcan • Sep 26 '19 05:09

After some thought and some research online, I finally understood why people thought this is not upscaling. I have added a small paragraph at the end of the FAQ dedicated to those people. Years of working in this field have made me understand the words differently than most people.

De-blurring (Gaussian) and super-resolution are known to be equivalent in this domain, but this is not obvious to people less invested in image processing.

I apologize for any confusion my earlier comments might have caused.

bloc97 • Oct 10 '19 18:10