Steward Garcia

Results: 92 comments by Steward Garcia

Interesting, I'll see if I can give it some time to at least check whether it works. **Bad news:** To implement this, it would be necessary to create the operation...

> I gave a try but I am not sure if implemented correctly.

@bssrdf Do you mean in sd.cpp? I checked your fork, and I don't see anything related.

@bssrdf Great job. To implement that functionality in sd.cpp, it must, for obvious reasons, also be compatible with the CPU backend. I don't know much about how to implement the...

In ggml, efforts are underway to add the missing kernels in Metal. Perhaps, when this [pull request](https://github.com/ggerganov/ggml/pull/621) is merged, I will add Metal support to stable diffusion.

@paulocoutinhox you can try enabling the Metal backend with PR #104.

I think it could be solved by separating the convolution weights and the attention weights into different buffers, although that implies quite a cumbersome change due to the way tensor...

> Just for my info - is this limit specific to M1 Pro. For example, @slaren is it a different limit on the M3?

It's iOS, Apple A15 GPU, in...

I believe the bottleneck on the M1 and M3 is matrix multiplication, as stable diffusion requires very large matrix multiplications. In CUDA, these are done in batches across the number of heads,...
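For readers unfamiliar with the term, "batched across the number of heads" can be pictured as one independent matrix product per attention head. The sketch below is only an illustration in plain Python: the function names and the `[n_heads][n_tokens][head_dim]` layout are assumptions made for the example, not the actual sd.cpp or ggml code. A GPU backend would run these per-head products as one batched call rather than as a Python loop.

```python
# Illustrative sketch only: names and tensor layout are hypothetical,
# not taken from sd.cpp or ggml.

def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(r) for r in zip(*m)]


def attention_scores_batched(q, k):
    """Compute Q @ K^T independently for every head.

    q, k: nested lists shaped [n_heads][n_tokens][head_dim].
    Returns scores shaped [n_heads][n_tokens][n_tokens].
    The head index is the batch dimension: each head is its own matmul.
    """
    return [matmul(qh, transpose(kh)) for qh, kh in zip(q, k)]


# Two heads, two tokens, head_dim = 2 (toy values).
q = [[[1, 0], [0, 1]], [[1, 2], [3, 4]]]
k = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
scores = attention_scores_batched(q, k)
```

Each per-head product costs on the order of `n_tokens * n_tokens * head_dim` multiply-adds, which gives a sense of why these multiplications grow large at image-sized token counts.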

The results of LoRAs with quantized models are very poor anyway. I do not recommend it.

It is not supported yet. Maybe in the future I'll have time to investigate how embeddings work and open a pull request to support them.