Steward Garcia

109 comments by Steward Garcia

@LeonNerd You can now activate the CUDA backend with `-DSD_CUBLAS=ON`, @klosax you can close this issue.

I'm one of the people directly involved in the development of sd.cpp, and I recognize that the project has many limitations and numerous mishandled cases. With time and more assistance...

I want to shed some light on this topic. The truth is that adding Flash Attention to ggml has made the library very heavy (4MB -> 120MB) and significantly increased...

Hello, good luck trying to use the tensor cores to get Winograd working; I think it’s possible. I saw an implementation that used matrix multiplication in Tinygrad. I found a...

@JohannesGaessler I know it's out of context, but I'm compiling the latest version of stable-diffusion.cpp and now it's taking more than 25 minutes to compile the CUDA code. Before...

I suppose that by im2col you mean the entire conv2d process (im2col and gemm), and what Winograd does is the complete convolution.
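To make the im2col + gemm split concrete, here is a minimal pure-Python sketch. It is not the ggml code; it assumes a single channel, stride 1, no padding, and the cross-correlation convention used by most deep learning frameworks. The function names `im2col`, `conv2d_im2col`, and `conv2d_direct` are made up for this illustration.

```python
def im2col(image, kh, kw):
    """Unfold every kh x kw patch of a 2D image into one row (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for i in range(out_h):
        for j in range(out_w):
            # One row per output pixel, holding the flattened patch under the kernel.
            cols.append([image[i + di][j + dj] for di in range(kh) for dj in range(kw)])
    return cols, out_h, out_w


def conv2d_im2col(image, kernel):
    """conv2d as two steps: im2col, then a matrix product (the gemm step)."""
    kh, kw = len(kernel), len(kernel[0])
    cols, out_h, out_w = im2col(image, kh, kw)
    flat_kernel = [v for row in kernel for v in row]
    # gemm step: (out_h*out_w x kh*kw) matrix times a (kh*kw x 1) kernel vector.
    flat_out = [sum(a * b for a, b in zip(patch, flat_kernel)) for patch in cols]
    # Fold the flat result back into an out_h x out_w image.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]


def conv2d_direct(image, kernel):
    """Reference: the whole convolution in one pass, with no intermediate matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, -1]]
    print(conv2d_im2col(image, kernel))
    print(conv2d_direct(image, kernel))
```

Both functions return the same output. The difference is that the im2col path materializes an intermediate matrix whose size grows with the kernel area, while a fused approach such as Winograd produces the convolution result without that separate unfold step.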

@wandbrandon I am trying to compile that version on Windows, but it throws the following error, which shouldn't happen if they only modified the convolution operation in GGML. However, as...

I'm very interested in this PR; I wish I had time to test DeepCache in ComfyUI and compare the results with your PR.

What is CFG? According to my understanding, it's when we pass the `--cfg-scale` parameter. Why do they refer to it as something that's missing in this project? Or is it...

@slaren Honestly, I think Flash Attention should be an optional feature in ggml since it doesn't introduce significant performance improvements, and the binary size has increased considerably—not to mention the...