chaiNNer Tile input changes

Adds tile option to pytorch
Updates NCNN and ONNX tiling options

My only gripe with this is that we kinda have no way to explain what this is to people, and I'm afraid we'll get a lot of questions related to it. Hopefully it being "auto" by default means people won't touch it unless they're told to

Edit: does this need a migration? I feel like this needs a migration but I'm not sure how to migrate it

Aug 10 '22 21:08 joeyballentine

I have one major problem with this: whether an upscale runs out of memory depends on the tile size, not the number of tiles/split depth. This means that if Auto doesn't work for you, then you'll have to set the number of tiles on a per-image basis. This makes any option other than Auto very annoying to use. So users will likely set a huge split factor since that will work for pretty much all images, but that will make the upscale slower than it has to be for them.

So I think it would be better if we gave a "Tile Size" dropdown with the options Auto, 64px, 128px, 256px, 512px, 1024px. Why those values? Powers of two because they are nice numbers and exponentially increase. 64px because pretty much every machine should be able to process an image with only 4096 pixels total and going lower to 32px doesn't make sense because we add 16px overlap on all 4 sides. 1024px because that tile size requires roughly 8GBs of VRAM on my machine for PyTorch with an RGB image. Given that tile size is an option for when chainner can't efficiently narrow down the tile size, this should be large enough.

Aug 11 '22 09:08 RunDevelopment

Also, the auto splitting logic seems to be duplicated for all 3 architectures. We should probably refactor that similar to how I refactored the upscaling implementations (#524). However, that can be its own PR after this one. I'll probably do it.

Aug 11 '22 09:08 RunDevelopment

Also, the auto splitting logic seems to be duplicated for all 3 architectures. We should probably refactor that similar to how I refactored the upscaling implementations (#524). However, that can be its own PR after this one. I'll probably do it.

You can't easily do this, at least between ncnn and pytorch, since they use different shapes. Pytorch and onnx use bchw and ncnn uses hwc

Aug 11 '22 13:08 joeyballentine

whether an upscale runs out of memory depends on the tile size, not the number of tiles/split depth.

This is true. But keep in mind the maximum tile size that will work changes depending on what model you're using. A lighter model can upscale a higher maximum resolution than a heavier one. So your 1024 for 8GB example only works for a normal size esrgan model. If we add something like SwinIR or HAT, the maximum image size for 8GB is going to be significantly lower, while something like a regular srvggcompact model can upscale a 2k image no problem. (This is the main reason I created the automatic code)

Aug 11 '22 13:08 joeyballentine

That being said though I pretty much only did the number of tiles thing so I wouldn't have to write new splitting code (out of laziness), so I wouldn't mind actually doing that and changing it to be a normal tile size. My point is just that tile sizes aren't a "one size fits all" solution, and if we add something like a model iterator ever, tile size won't necessarily be able to work for every model and you'd end up using a much smaller tile size with models where you can use a larger one. That's still probably better than the number of tiles approach though

Aug 11 '22 13:08 joeyballentine

But keep in mind the maximum tile size that will work changes depending on what model you're using.

True. Then how about we treat the tile size not as "use this tile size" but as "this is the maximum tile size, start the auto splitting from there"?

Aug 11 '22 13:08 RunDevelopment

True. Then how about we treat the tile size not as "use this tile size" but as "this is the maximum tile size, start the auto splitting from there"?

I'm not really sure how this helps, there's so many factors that would change what a maximum tile size would be. The amount of vram you have, the amount of vram currently in use, what model you're using, the params (nf, nb, etc) of the model.

And in the ideal case, you want to use the maximum tile size you possibly can for each model. So setting one maximum doesn't seem like it helps anything

Aug 11 '22 14:08 joeyballentine

So setting one maximum doesn't seem like it helps anything

Well, yes, but it's better than a split depth/number of tiles.

Our ideal case is auto splitting just working anyway. But you are right, if auto splitting didn't work and people had to resort to manual tile sizes, then auto splitting based on that tile size probably wouldn't help.

Aug 11 '22 14:08 RunDevelopment

chaiNNer chaiNNer copied to clipboard

Tile input changes

chaiNNer
chaiNNer copied to clipboard