x4080

181 comments by x4080

@junrushao Thanks for the documentation

@FSSRepo Using the latest build, I can't build the server. How do I do it? (M2 Pro)
1. I go to examples/server
2. cmake -DLLAMA_BUILD_SERVER=ON gives a warning: ``` CMake Warning:...

@FSSRepo Thanks, I'll try it; not in front of my computer now. Edit: it works, thanks. For future reference:
```
mkdir build
cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build...
```
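The snippet above is cut off after `cmake --build`. A complete sequence, sketched under the assumption that the comment refers to the standard llama.cpp CMake workflow of the time (repo URL and output path are assumptions, not from the comment), would look like:

```shell
# Hypothetical full build recipe for llama.cpp's server example.
# Assumes a clone of ggerganov/llama.cpp; -DLLAMA_BUILD_SERVER=ON is the
# flag quoted in the comment above, everything after it is assumed.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. -DLLAMA_BUILD_SERVER=ON
cmake --build . --config Release
# if the server target built, the binary should appear under build/bin/
```

Note that `cmake --build .` drives the underlying generator (Make or Ninja) without you having to call it directly, which is why the two-step `cmake ..` then `cmake --build` pattern works the same on macOS and elsewhere.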

Can't wait for this, @ggerganov

@philipturner Have you taken a look at https://github.com/mlc-ai/web-llm and especially https://github.com/mlc-ai/mlc-llm? It can run on the Apple silicon GPU pretty fast using WebGPU. I tested it and it seems as fast as...

@philipturner Yes, MLC LLM is as fast as llama.cpp, as I said above

It's about the same 😄 I was surprised as well, because I had already read about your work. Btw, I'm using an M2 Pro with 16GB and the latest Ventura; don't know if that...

For llama.cpp I'm now using 13B models, so it's somewhat slower than the model I use for MLC (7B). Yes, MLC doesn't display any speed, but it...

Using WebGPU, this is what I got

The strange thing is that Stable Diffusion is not "that fast"