Daniele

Results 18 issues of Daniele

This PR addresses the issue https://github.com/leejet/stable-diffusion.cpp/issues/325 by adding an argument to force a specific prediction mode. Some experimental models (such as [EasyFluff](https://huggingface.co/zatochu/EasyFluff) and similar models) may introduce different prediction modes...

I think it would be interesting to add a token downsampling implementation to the project as it's a method that improves the performance by a lot with a minimal quality...

This PR allows the ROCm build process to work on Linux with a self-built ROCm stack created using [rocm_sdk_builder](https://github.com/lamikr/rocm_sdk_builder/).

I've noticed that #285 hasn't been updated for a while so I've upstreamed the PR and fixed the regex to correctly identify the BREAK token without deleting the white spaces....

I've found an archive of the original Stable Diffusion 1.5 model that doesn't require any login. The original upload made by Stability-AI has been removed so the archive made by...

I'm currently building the project for my laptop (Ryzen 4700U) but the integrated GPU is not officially supported. For now I've been able to successfully build everything up to rocBLAS but as...

This PR supersedes #11778. Here are the performance numbers on my Radeon RX 5700XT (RADV).

`Vulkan`:

```
Master:
IM2COL(type_input=f32,type_kernel=f16,dst_type=f32,ne_input=[32,32,256,1],ne_kernel=[3,3,256,1],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): 13104 runs - 95.82 us/run - 10244 kB/run - 101.96 GB/s
IM2COL(type_input=f32,type_kernel=f16,dst_type=f32,ne_input=[64,64,256,1],ne_kernel=[3,3,256,1],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1):...
```

Vulkan
ggml
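As a quick sanity check on the first IM2COL line above, the reported throughput can be recomputed from the time per run and the data moved per run. The sketch below assumes the benchmark uses binary units (kB as 1024 bytes, GB as 1024^3 bytes); that is an inference from the numbers themselves, not something the PR text states, and the helper name `throughput_gib_per_s` is made up for illustration.

```python
def throughput_gib_per_s(kib_per_run: float, us_per_run: float) -> float:
    """Memory throughput from data moved per run and time per run.

    Assumes binary units: 1 kB = 1024 bytes, 1 GB = 1024**3 bytes.
    """
    bytes_per_run = kib_per_run * 1024      # kB/run -> bytes/run
    seconds_per_run = us_per_run * 1e-6     # us/run -> s/run
    return bytes_per_run / seconds_per_run / 1024**3


# Figures taken from the first "Master" IM2COL line.
print(f"{throughput_gib_per_s(10244, 95.82):.2f} GB/s")
```

With those inputs the result agrees with the 101.96 GB/s shown in the benchmark output, which supports the binary-unit reading; with decimal units the same figures would come out near 105 GB/s.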

This PR is a continuation of the tests done in #11826 on the subgroup size in Vulkan and its effect on performance, especially on RDNA cards. For now it includes...

Vulkan
ggml