x4080

181 comments by x4080

@junrushao Thanks for the documentation

@FSSRepo Using the latest build, I can't build the server. How do I do it? (M2 Pro)
1. I go to examples/server
2. cmake -DLLAMA_BUILD_SERVER=ON gives a warning: ``` CMake Warning:...

@FSSRepo Thanks, I'll try it; not in front of my computer now. Edit: it works, thanks. For future reference:
```
mkdir build
cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build...
```
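The snippet above is cut off after `cmake --build`. A complete sequence, sketched under the assumption that the comment refers to the standard llama.cpp CMake workflow of the time (repo URL and output path are assumptions, not from the comment), would look like:

```shell
# Hypothetical full build recipe for llama.cpp's server example.
# Assumes a clone of ggerganov/llama.cpp; -DLLAMA_BUILD_SERVER=ON is the
# flag quoted in the comment above, everything after it is assumed.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build . --config Release
# if the server target built, the binary should appear under build/bin/
```

Note that `cmake --build .` drives the underlying generator (Make or Ninja) without you having to call it directly, which is why the two-step `cmake ..` then `cmake --build` pattern works the same on macOS and elsewhere.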

Can't wait for this, @ggerganov

@philipturner Have you taken a look at https://github.com/mlc-ai/web-llm and especially https://github.com/mlc-ai/mlc-llm? It can run on the Apple silicon GPU pretty fast using WebGPU. I tested it and it seems as fast as...

@philipturner Yes, MLC LLM is as fast as llama.cpp, as I said above

It's about the same 😄 I was surprised as well, because I had already read about your work. Btw, I'm using an M2 Pro with 16GB and the latest Ventura; don't know if that...

For llama.cpp I'm now using 13B models, so it's somewhat slower than the model I use for MLC (7B). Yes, MLC doesn't display any speed, but it...

Using WebGPU, this is what I got

The strange thing is that Stable Diffusion is not "that fast"