[enhancement]: Need an optimization guide for RTX 4090

Open ffdown opened this issue 2 years ago • 2 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Contact Details

[email protected]

What should this feature add?

The demo video for version 2.2 shows very fast generation speeds. How are they achieved? It looks like a significant acceleration of image generation. What I see now in InvokeAI is 9-10 it/s at 512x512 and 50 steps.

With some tweaking in Automatic1111 I reached up to 23 it/s at the same settings.

https://disk.yandex.ru/i/EITGEi5M4XG8uQ
One more test, at 1280: https://disk.yandex.ru/d/IXjTxBSknuS2cg
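
To put those rates in perspective, here is a rough sketch that converts it/s into time per image. It covers the denoising steps only and ignores model loading and image decoding; the function name `seconds_per_image` is made up for illustration and is not part of InvokeAI or Automatic1111.

```python
def seconds_per_image(steps: int, its_per_second: float) -> float:
    """Approximate time spent in the denoising loop (hypothetical helper).

    Assumes a constant iteration rate and ignores model load and decode time.
    """
    if its_per_second <= 0:
        raise ValueError("its_per_second must be positive")
    return steps / its_per_second


# Rates quoted in this issue, all at 512x512 and 50 steps.
quoted_rates = [
    ("InvokeAI, low end", 9.0),
    ("InvokeAI, high end", 10.0),
    ("Automatic1111, tuned", 23.0),
]

for label, rate in quoted_rates:
    print(f"{label}: {seconds_per_image(50, rate):.2f} s per image")
```

By this estimate the gap is roughly 5 seconds versus a little over 2 seconds per 50-step image.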

Alternatives

It would be great if you could add VoltaML support. It uses tensor cores and reportedly reaches as much as 30 it/s, but I have only seen it in screenshots on GitHub and could not figure out how to install it locally.

Additional Content

I watched the video where everything works well in version 2.2 and tried to replace a small 192x192 area of the image. I don't know how it works for you, but for me the downscaling behaves unexpectedly. To put it bluntly, the result is horrible.

https://disk.yandex.ru/i/pxh87YPuphyWRQ

ffdown avatar Dec 02 '22 15:12 ffdown

The demo video speeds past the generation parts to focus on the features. Maybe @hipsterusername can overlay a :fast_forward: icon or something in future videos to reduce confusion about that.

We're planning on xformers support in a later release, which will help significantly with NVIDIA RTX cards and be more competitive with the speeds you're currently seeing in A1111.

The VoltaML demos do look awesome, but I haven't yet set that up locally either. It's one of a number of similar implementations that are coming up with ways to optimize these neural networks. It's not yet clear which is the best to integrate with a project like this, but it's definitely something to keep an eye on!

keturn avatar Dec 02 '22 16:12 keturn

Ah, that explains it: xformers is not enabled, and the video is sped up. It would make sense to mention that, at least in the video description. By the way, another useful feature would be to show the speed in it/s as a number somewhere in the UI, not just the progress bar; right now you have to keep checking the console for that figure. Also, InvokeAI apparently does not run on CUDA 11.8, which is the only version that realizes the potential of the RTX 40-series cards.
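
To illustrate what I mean by a numeric it/s readout, here is a minimal sketch of a rate meter. It is not how InvokeAI measures speed; the class name `IterationRateMeter` and its methods are hypothetical, and I am only assuming that the generation loop could call something once per denoising step.

```python
import time


class IterationRateMeter:
    """Counts steps and reports iterations per second (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the meter can be tested without sleeping.
        self._clock = clock
        self._start = None
        self._steps = 0

    def start(self) -> None:
        """Reset the counter and record the start time."""
        self._start = self._clock()
        self._steps = 0

    def step(self) -> None:
        """Call once after each denoising step."""
        self._steps += 1

    def rate(self) -> float:
        """Average it/s since start(); 0.0 if not started or no time elapsed."""
        if self._start is None:
            return 0.0
        elapsed = self._clock() - self._start
        return self._steps / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    meter = IterationRateMeter()
    meter.start()
    for _ in range(5):
        time.sleep(0.01)  # stand-in for one denoising step
        meter.step()
    print(f"{meter.rate():.1f} it/s")
```

The value from `rate()` could then be shown next to the progress bar instead of only in the console.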

ffdown avatar Dec 02 '22 18:12 ffdown

No longer an issue on v3

psychedelicious avatar Aug 06 '23 04:08 psychedelicious