llama.cpp
Support -ins for alpaca model in tcp server
Change multiple `printf` calls to `fprintf(outstream, ...)`
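A minimal sketch of the pattern this description refers to (the `emit_token` helper and the explicit flush are illustrative, not necessarily the PR's actual code):

```cpp
#include <cstdio>

// Before: output was hard-wired to stdout, e.g. printf("%s", text).
// After: callers pass the destination stream, so the same code path can
// write either to stdout or to a client socket wrapped with fdopen().
static void emit_token(FILE *outstream, const char *text) {
    fprintf(outstream, "%s", text);
    fflush(outstream);  // deliver each token to the client as it is generated
}
```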
@vonjackustc can you change the target branch to `tcp_server`?
I did the same thing, but on Windows. The socket stream here is a nice idea; I used a thread-safe queue (`ThreadSafeQueue`) to implement it instead.
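For reference, a minimal sketch of such a thread-safe queue in C++ (illustrative only, not the commenter's actual implementation): the inference thread pushes generated tokens and the connection thread pops them and writes them to the client.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

// Single-producer/single-consumer handoff between the inference thread
// (push) and the thread that writes responses back to the client (pop).
class ThreadSafeQueue {
public:
    void push(std::string item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(item));
        }
        cv_.notify_one();
    }

    // Blocks until an item is available.
    std::string pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        std::string item = std::move(queue_.front());
        queue_.pop();
        return item;
    }

private:
    std::queue<std::string>  queue_;
    std::mutex               mutex_;
    std::condition_variable  cv_;
};
```

A blocking `pop()` like this works well when one connection thread drains one inference thread; it plays the same role the socket stream plays in this PR.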
@vonjackustc I missed these new extra `printf` statements in one of the recent rebases. Just integrated your changes into the `tcp_server` branch, thanks for catching it.
You can change `LLAMA_N_PARTS` from `{ 5120, 2 }` to `{ 5120, 1 }` to support the quantized alpaca-13b-q4.bin here: https://github.com/antimatter15/alpaca.cpp#getting-started-13b But it would lose compatibility with the original LLaMA. Maybe you can make it configurable :D
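For context, the table being discussed maps the model's embedding size to the number of files the weights are split into; it looks roughly like this (reproduced from memory of that era's source, so treat the surrounding entries as illustrative):

```cpp
#include <map>

// Maps n_embd to the number of parts the model file is split into.
// The original 13B LLaMA weights ship as two parts, while the
// single-file alpaca-13b-q4.bin needs { 5120, 1 } here.
static const std::map<int, int> LLAMA_N_PARTS = {
    { 4096, 1 },
    { 5120, 2 },  // change to 1 for the single-file alpaca-13b
    { 6656, 4 },
    { 8192, 8 },
};
```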
I have no idea what these parameters mean, but isn't this what the `--n_parts` parameter does?