Suppress output that isn't from the model

Open MLTQ opened this issue 1 year ago • 2 comments

I want to integrate this into a slim chat system, so it would be nice if the app could output only the text from the model, for example via a -q ("quiet") flag on run.

MLTQ, Mar 11 '23 01:03

For testing purposes, you can comment out all the printf calls in main.cpp, run make again, and then run ./main.

tinoargentino, Mar 12 '23 08:03

Another option is to use fprintf(stderr, ...) everywhere except in the one place that prints tokens from the model. Under POSIX, stderr is not limited to errors; it can also be used for diagnostic output.

One can then run ./main ... 2>/dev/null to suppress all diagnostic output.

PR https://github.com/ggerganov/llama.cpp/pull/48 does exactly this.
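
For illustration, the stdout/stderr split described above could look roughly like the sketch below. This is not the code from the PR: log_diag, emit_token, and run_demo are hypothetical names made up for this example, and the token list stands in for real model output.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from llama.cpp): diagnostics always go to stderr.
static void log_diag(const std::string & msg) {
    std::fprintf(stderr, "%s\n", msg.c_str());
}

// Hypothetical helper: model text goes to `out` (stdout by default) and is
// flushed right away so a consuming process sees tokens as they arrive.
static void emit_token(const std::string & token, std::FILE * out = stdout) {
    std::fputs(token.c_str(), out);
    std::fflush(out);
}

// Stand-in for the generation loop: only the tokens reach `out`,
// everything else is written to stderr.
static void run_demo(const std::vector<std::string> & tokens,
                     std::FILE * out = stdout) {
    log_diag("demo: loading model ...");
    for (const auto & t : tokens) {
        emit_token(t, out);
    }
    emit_token("\n", out);
    log_diag("demo: done");
}
```

With this split, a program that calls run_demo from its main function and is started with 2>/dev/null would print only the token text, while a chat frontend reading the process's stdout would never see the diagnostic lines.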

prusnak, Mar 12 '23 13:03