[Feature Request] ONNX support

Open FNsi opened this issue 1 year ago • 61 comments

With a compact structure (model size 256k~4M), it could run as a real-time effect based on DirectML.

Am I being too greedy? 😂

FNsi avatar Dec 08 '23 04:12 FNsi

Real-ESRGAN is too large; it's too difficult to run in real time on current computers.

cqaqlxz avatar Dec 09 '23 15:12 cqaqlxz

Real-ESRGAN is too large; it's too difficult to run in real time on current computers.

The main bottleneck is memory size: a 2K game with 2x upscaling costs about 16 GB of memory. The speed could be real-time on a 3060 (512k model) only if memory were unlimited. 😂

IMO it could work on an iGPU, though the 780M is still not good enough; maybe Qualcomm's X Elite, but that's another story...

FNsi avatar Dec 09 '23 23:12 FNsi
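
For rough intuition about where that memory goes (illustrative numbers only; the channel count below is an assumption, not taken from the thread): a single fp32 feature map with 64 channels at 4K already occupies

$$64 \times 3840 \times 2160 \times 4\,\text{B} \approx 2.1\,\text{GB},$$

and inference keeps several such activations live at once, which makes a footprint on the order of the 16 GB quoted above plausible.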

Some models can indeed run inference in real time, such as mpv-upscale-2x_animejanai. I plan to add ONNX support in the future, but there is still a lot of uncertainty.

Blinue avatar Dec 10 '23 06:12 Blinue

2x-DigitalFlim

The best ESRGAN model I've ever tried, not only for its size but also for its output (real + anime).

FNsi avatar Dec 15 '23 06:12 FNsi

The SuperUltraCompact model isn't much larger than the Anime4K UL model (around 2x, I guess), so it could plausibly be ported to HLSL format.

kato-megumi avatar Dec 15 '23 06:12 kato-megumi
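
For context, porting a network like this means hand-translating each trained layer into shader passes. Below is a minimal, hypothetical sketch of a single 3x3 convolution layer in plain HLSL; the weights, resource names, and input size are placeholders, not taken from the actual ports discussed in this thread.

```hlsl
// One 3x3 convolution layer of a compact SR network as a pixel shader.
// A real port bakes the trained weights of every layer into passes like this.
Texture2D INPUT : register(t0);
SamplerState sam : register(s0);

// Illustrative single-channel 3x3 kernel (placeholder values).
static const float w[9] = {
    0.0, 0.1, 0.0,
    0.1, 0.6, 0.1,
    0.0, 0.1, 0.0
};

float4 main(float4 pos : SV_POSITION, float2 uv : TEXCOORD) : SV_TARGET {
    const float2 texel = float2(1.0 / 1920.0, 1.0 / 1080.0); // assumed input size
    float3 acc = 0;
    [unroll]
    for (int y = -1; y <= 1; y++) {
        [unroll]
        for (int x = -1; x <= 1; x++) {
            acc += w[(y + 1) * 3 + (x + 1)]
                 * INPUT.SampleLevel(sam, uv + float2(x, y) * texel, 0).rgb;
        }
    }
    return float4(max(acc, 0.0), 1.0); // ReLU, as in SRCNN-style networks
}
```

A real network has many channels per layer, which is why even a "compact" model multiplies into dozens of texture samples and multiply-adds per output pixel.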

While porting to HLSL does indeed offer higher efficiency, the cost is also substantial unless there's an automated approach. I'm inclined to adopt ONNX Runtime, which would let us integrate any ONNX model with ease.

Blinue avatar Dec 15 '23 06:12 Blinue

I personally think this is a great idea, as animejanai does offer much better graphics sometimes. I would personally donate 20 USD if this happens. Magpie is getting better every day. Love this thing so much.

While porting to HLSL does indeed offer higher efficiency, the cost is also substantial unless there's an automated approach. I'm inclined to adopt ONNX Runtime, which would let us integrate any ONNX model with ease.

YingDoge avatar Jan 02 '24 09:01 YingDoge

I ported Animejanai V3 SuperUltraCompact and 2x-DigitalFlim to Magpie's effect format if anyone wants to try: https://gist.github.com/kato-megumi/d10c12463b97184c559734f2cba553be

kato-megumi avatar Feb 14 '24 13:02 kato-megumi

Great job! It appears that Animejanai is well-suited to scenes from old anime, as it doesn’t produce sharp lines the way Anime4K does. However, a significant issue is that it sacrifices many details. While DigitalFlim is sharper than Animejanai, it also suffers from severe detail loss. In terms of performance, they are roughly 20-25 times slower than Lanczos.

Blinue avatar Feb 14 '24 14:02 Blinue

Nothing happened after I put both files in the effects folder (I even rebooted the system).

As an experiment I also put in fakehdr.hlsl, and it works...

I don't know if I made any mistakes (version 10.05).

FNsi avatar Feb 16 '24 09:02 FNsi

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525

kato-megumi avatar Feb 16 '24 10:02 kato-megumi

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525

Thank you for your great work and help! Anyway, I still don't know how to download the build from GitHub Actions, so let me keep that surprise until the next release. 😁

However, a significant issue is that it sacrifices many details.

I think that's a common problem with ESRGAN models, owing both to the architecture (even large models can't keep much detail) and to the training datasets (animations?).

FNsi avatar Feb 17 '24 04:02 FNsi

Download from here: https://github.com/Blinue/Magpie/actions/runs/7911000525/artifacts/1246839355

Blinue avatar Feb 17 '24 05:02 Blinue

Download from here: https://github.com/Blinue/Magpie/actions/runs/7911000525/artifacts/1246839355

Thank you. After signing in again I could download it. It's weird that that kind of Actions page needs a sign-in (it shows a 404 otherwise), even though I was already signed in on the iOS client...

FNsi avatar Feb 17 '24 05:02 FNsi

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525

Can you port the SD model of animejanai, which is more aggressive in its detail reconstruction? A UC model for those of us with more computing power would also be great.

spiwar avatar Feb 19 '24 07:02 spiwar

Can you port the SD model of animejanai

@spiwar Do you have a link for it? I didn't find it on their GitHub.

kato-megumi avatar Feb 19 '24 13:02 kato-megumi

For detail restoration... 2x-Futsuu-Anime, but it's 4M... I think it's a game for the 4090.

FNsi avatar Feb 19 '24 14:02 FNsi

animejanai.zip Here are animejanai's Compact and UltraCompact models for anyone with enough power. UltraCompact runs at about 3 fps for 720p on my machine. Haven't tested Compact yet.

kato-megumi avatar Feb 19 '24 17:02 kato-megumi

animejanai.zip Here are animejanai's Compact and UltraCompact models for anyone with enough power. UltraCompact runs at about 3 fps for 720p on my machine. Haven't tested Compact yet.

Same issue, 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv. Can you port the v3 sharp models? They are in the animejanai Discord beta releases.

carycary246 avatar Feb 19 '24 19:02 carycary246

Same issue, 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv

Perhaps it's a limitation of Magpie/HLSL. I'm hopeful that integrating ONNX will improve performance. What GPU are you using?

Can you port the v3 sharp models?

Ok. https://gist.github.com/kato-megumi/d10c12463b97184c559734f2cba553be#file-animejanai_sharp_suc-hlsl

kato-megumi avatar Feb 20 '24 07:02 kato-megumi

Can you port the SD model of animejanai

@spiwar Do you have a link for it? I didn't find it on their GitHub.

You can find it in the full 1.1 GB release, but I've included it here for convenience: 2x_AnimeJaNai_SD_V1beta34_Compact.zip

spiwar avatar Feb 20 '24 07:02 spiwar

RTX 3080 Ti, upscaling from a 1080p source to 4K:

- C model runs at seconds per frame
- UC model runs at 2-3 fps
- SUC model runs at ~40 fps

If we can optimize this to run at decent speeds it would be very nice; the UC and C models look quite natural, with no oversharpening.

spiwar avatar Feb 20 '24 08:02 spiwar

The room for performance optimization is very limited, because the bottleneck is floating-point operations.

@kato-megumi I found that 16-bit floating-point numbers (min16float) are more efficient, with about a 10% performance improvement on my side. But this is still not enough to make UC usable. Further performance improvement can only be achieved by using platform-specific APIs, such as TensorRT.

Blinue avatar Feb 20 '24 13:02 Blinue

Finding data to enhance the SUC model might be the better way forward... Compared with TensorRT, DirectML is a universal solution, IMO... (but obviously it can't benefit from NV hardware acceleration). Or PyTorch compile with 8-bit?

FNsi avatar Feb 20 '24 13:02 FNsi

@Blinue Sorry, can you elaborate? I thought using FORMAT R16G16B16A16_FLOAT already meant 16-bit floating-point numbers?

kato-megumi avatar Feb 20 '24 14:02 kato-megumi

I thought using FORMAT R16G16B16A16_FLOAT already meant 16-bit floating-point numbers?

In HLSL, float is 32-bit. An R16G16B16A16_FLOAT texture stores half-precision floating-point data, but it is converted to float when sampled. You have to explicitly cast to min16float to perform half-precision operations. See https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/using-hlsl-minimum-precision

Starting with Windows 8, graphics drivers can implement minimum precision HLSL scalar data types by using any precision greater than or equal to their specified bit precision.

Blinue avatar Feb 20 '24 14:02 Blinue
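
To make the casting point above concrete, here is a minimal sketch (assumed resource names, not from an actual Magpie effect) of doing the arithmetic in half precision:

```hlsl
Texture2D INPUT : register(t0);
SamplerState sam : register(s0);

float4 main(float4 pos : SV_POSITION, float2 uv : TEXCOORD) : SV_TARGET {
    // Samples from an R16G16B16A16_FLOAT texture arrive as 32-bit float,
    // so cast explicitly before the math to allow half-precision ALUs.
    min16float3 a = (min16float3)INPUT.SampleLevel(sam, uv, 0).rgb;
    min16float3 b = (min16float3)INPUT.SampleLevel(sam, uv + 0.001, 0).rgb;
    // Keep every operand min16float: mixing in a float promotes the whole
    // expression back to 32-bit and forfeits the ~10% win mentioned above.
    min16float3 blended = a * (min16float)0.5 + b * (min16float)0.5;
    return float4((float3)blended, 1.0);
}
```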

Compared with TensorRT, DirectML is a universal solution, IMO... (but obviously it can't benefit from NV hardware acceleration)

One advantage of ONNX Runtime is that it supports multiple backends, including DML and TensorRT. TensorRT is generally the fastest backend, so it should be the preferred choice when available.

Blinue avatar Feb 20 '24 15:02 Blinue

Same issue, 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv

Perhaps it's a limitation of Magpie/HLSL. I'm hopeful that integrating ONNX will improve performance. What GPU are you using?

Tested on a 3080 and a 4090; anything more than SUC is not usable for now with HLSL. We will definitely need ONNX support so we can run with TensorRT.

carycary246 avatar Feb 20 '24 23:02 carycary246

Maybe if ↓ gets implemented, UC will be usable? https://github.com/Blinue/Magpie/discussions/610

kato-megumi avatar Feb 21 '24 08:02 kato-megumi

Maybe if ↓ gets implemented, UC will be usable? https://github.com/Blinue/Magpie/discussions/610

bloc97 makes a good point; I think #610 is hard to achieve. On the one hand, especially for complex scaling algorithms like convolutional networks, it is difficult to determine what effect a pixel change has on the output. On the other hand, duplicate-frame detection is already implemented, which effectively reduces power consumption in many situations. Going further and updating only the changed areas is of limited value, because it is hard to do and only helps in certain scenarios.

Blinue avatar Feb 22 '24 10:02 Blinue
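
As a footnote on the duplicate-frame detection mentioned above, here is a hypothetical compute-shader sketch of the idea (not Magpie's actual code; resource names and the threshold are invented): a cheap pass compares the new frame with the previous one and sets a flag that lets identical frames skip the expensive effect chain.

```hlsl
Texture2D<float4> curFrame  : register(t0);
Texture2D<float4> prevFrame : register(t1);
RWStructuredBuffer<uint> dirtyFlag : register(u0); // dirtyFlag[0] != 0 => frame changed

[numthreads(8, 8, 1)]
void main(uint3 id : SV_DispatchThreadID) {
    // Out-of-bounds texture reads return 0 in D3D, so no explicit bounds
    // check is needed when the dispatch is rounded up to ceil(size / 8).
    float4 diff = abs(curFrame[id.xy] - prevFrame[id.xy]);
    if (any(diff > 1.0 / 255.0)) { // tolerate sub-8-bit noise (invented threshold)
        InterlockedOr(dirtyFlag[0], 1u);
    }
}
```

The CPU then reads the flag back (or uses it with predication) to decide whether to re-run the upscaling passes for that frame.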