ddpasa

42 comments by ddpasa

I managed to compile ollama with the following snippet in gen_linux.sh, and it builds a Vulkan version:

```
OLLAMA_CUSTOM_CPU_DEFS="-DLLAMA_VULKAN=1 -DLLAMA_AVX=on -DLLAMA_AVX2=on -DLLAMA_AVX512=on -DLLAMA_FMA=on -DLLAMA_AVX512_VBMI=on -DLLAMA_AVX512_VNNI=on -DLLAMA_F16C=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=on" go generate...
```
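For context, a minimal sketch of the full from-source build this implies; the clone path and the `go generate ./...` / `go build .` steps follow ollama's standard source-build flow, and everything beyond the quoted defines is an assumption:

```
# Sketch only: standard ollama source build with the custom CPU defines above.
git clone https://github.com/ollama/ollama.git
cd ollama
export OLLAMA_CUSTOM_CPU_DEFS="-DLLAMA_VULKAN=1 -DLLAMA_AVX=on -DLLAMA_AVX2=on \
  -DLLAMA_AVX512=on -DLLAMA_FMA=on -DLLAMA_AVX512_VBMI=on -DLLAMA_AVX512_VNNI=on \
  -DLLAMA_F16C=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=on"
go generate ./...   # runs llm/generate/gen_linux.sh, which picks up the defines
go build .          # produces the ollama binary
```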

I was able to compile llama.cpp with the following flags and confirm that it works. However, when I try to hack [gen_common.sh](https://github.com/ollama/ollama/blob/main/llm/generate/gen_common.sh#L85), I always get empty or garbled output. I'm...
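As a point of comparison, a hedged sketch of the standalone llama.cpp Vulkan build referred to above; the exact CMake invocation is an assumption based on llama.cpp's `LLAMA_VULKAN` option of that era:

```
# Sketch: building llama.cpp itself with Vulkan enabled, using the same define.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build -DLLAMA_VULKAN=1 -DCMAKE_BUILD_TYPE=Release
cmake --build build -j   # binaries land in build/bin/
```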

@erogol I'm also having this exact same issue. All I run is `from TTS.api import TTS` and I get:

```
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1...
```

Never mind, this happens because I'm running it in the same folder the GitHub repo is cloned into. Changing the directory name fixes it.
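A short illustration of that failure mode, under the assumption that the checkout is named `TTS` and sits in the working directory, so Python resolves the repo's source tree instead of the installed package:

```
# Assumed layout: the repo was cloned as ~/work/TTS and python runs from ~/work.
cd ~/work
python -c "from TTS.api import TTS"   # fails: the TTS/ checkout in the cwd shadows the installed package
mv TTS TTS-src                        # rename the checkout out of the way
python -c "from TTS.api import TTS"   # now resolves the pip-installed TTS
```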

https://github.com/ollama/ollama/pull/2578

> Vulkan support would be wonderful. I have an AMD RX 6800 XT, and using KoboldCPP with Vulkan support made using LLMs go from slow-but-kinda-usable to near-instantaneous speeds. My desktop...

> > @ProjectMoon I have experimental vulkan support here: #2578 > > I tried it out! Unfortunately, ollama crashes when it tries to run a model, with some kind of...

Hello @dhiltgen, thanks for your quick reply and detailed explanation. As you suggested, I recompiled ollama from source (it was really easy!) with the following flags: `OLLAMA_CUSTOM_CPU_DEFS="-DLLAMA_AVX=on -DLLAMA_AVX2=on -DLLAMA_AVX512=on`...
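Before enabling defines like these, it can help to confirm the host CPU actually advertises the matching instruction sets; a small check against /proc/cpuinfo (Linux flag names assumed):

```
# List which of the relevant ISA extensions the CPU reports.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' \
  | grep -E '^(avx|avx2|fma|f16c|avx512f|avx512_vbmi|avx512_vnni)$'
```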

> That's significant! > > So just to clarify, before making this change, on your system we load the "cpu_avx2" variant, and your llava scenario took 8 minutes. With this...

I conducted all tests with the Bakllava model here: https://ollama.ai/library/bakllava, using the same fixed seed of 100. Details are below. The magic seems to be in VNNI. AVX512 helps a little, but it's...
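For reproducibility, a hedged sketch of how a fixed seed can be passed through ollama's generate API; `options.seed` is part of the documented API, while the prompt is a placeholder:

```
# Fix the seed so repeated benchmark runs are comparable.
curl http://localhost:11434/api/generate -d '{
  "model": "bakllava",
  "prompt": "Describe the attached image.",
  "options": { "seed": 100 }
}'
```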