llama.cpp

"Press Ctrl+C to interject at any time" is a bad choice - use something else, maybe [Esc]

Open maddes8cht opened this issue 1 year ago • 8 comments

I think Ctrl+C is a bad choice to interrupt the output, because Ctrl+C also stops the program. It has happened to me that I pressed Ctrl+C at the wrong moment and a conversation was lost as a result. It has happened to me, I think it will have happened to others, it will happen again - and that is because Ctrl+C has this dual function. A better key for interrupting would be the [Esc] key in my opinion.

maddes8cht avatar Apr 24 '23 18:04 maddes8cht

no but it's the shell

syl-00110111 avatar Apr 25 '23 08:04 syl-00110111

it's the shell ending the program on sigint. it's the developers' decision to use sigint to give control back to the user. I think it should be done with another key, which would mean not using sigint.

if it stays with sigint, meaning ctrl+c, i think sigint should also be captured before ending the program, with something like "exit ? (y/n or ctrl+c again)". at least that should be configurable with a switch.
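A minimal sketch of that "ask before exiting" idea, assuming a POSIX-style console program. This is not llama.cpp's actual handler; `SigintPolicy`, `SigintAction`, `install_sigint_handler` and `poll_sigint` are hypothetical names made up for illustration. The handler only sets a flag, and the main loop decides whether a Ctrl+C means "interject", "ask for confirmation", or "exit".

```cpp
#include <csignal>

// What the main loop should do in response to a Ctrl+C (hypothetical names).
enum class SigintAction { None, Interject, AskConfirm, Exit };

struct SigintPolicy {
    bool generating = false;       // true while the model is producing output
    bool confirm_pending = false;  // true after "exit ? (y/n or ctrl+c again)" was shown

    // Decide what a single Ctrl+C means in the current state.
    SigintAction on_sigint() {
        if (generating) {
            // First use of Ctrl+C: hand control back to the user.
            generating = false;
            confirm_pending = false;
            return SigintAction::Interject;
        }
        if (!confirm_pending) {
            // Already at the prompt: ask instead of exiting right away.
            confirm_pending = true;
            return SigintAction::AskConfirm;
        }
        // Ctrl+C again while the question is showing: really exit.
        return SigintAction::Exit;
    }

    // Any other input (including answering "n") cancels the pending exit.
    void on_user_input() { confirm_pending = false; }
};

// The handler itself only sets a flag, which is all that is safe to do there.
static volatile std::sig_atomic_t g_sigint_pending = 0;

static void handle_sigint(int) { g_sigint_pending = 1; }

void install_sigint_handler() { std::signal(SIGINT, handle_sigint); }

// Called from the main loop, e.g. once per generated token.
SigintAction poll_sigint(SigintPolicy & policy) {
    if (!g_sigint_pending) {
        return SigintAction::None;
    }
    g_sigint_pending = 0;
    return policy.on_sigint();
}
```

The main loop would call `poll_sigint` between tokens and act on the result. A command-line switch could skip the `AskConfirm` step for people who prefer the old behaviour.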

maddes8cht avatar Apr 25 '23 09:04 maddes8cht

I think most of main.cpp has become a huge spaghetti of terminal handling code. It should really be changed to some other kind of UI, like ncurses.

SlyEcho avatar Apr 25 '23 11:04 SlyEcho

@SlyEcho it's clearly never been about providing UIs to regular users though? Their focus has been on speed and optimizations, which is absolutely best! Use koboldcpp for a UI.

BarfingLemurs avatar Apr 26 '23 09:04 BarfingLemurs

Actually, in some ways I prefer the simple console UI over the web UIs out there. While koboldcpp is somehow fast in output, it always feeds back in a lot of the conversation, which makes it slow in prompt ingestion. So a really performant console frontend does have its worth and should receive some love in development.

maddes8cht avatar Apr 26 '23 11:04 maddes8cht

This has happened to me. If you hit Ctrl-C at just the right time you can get stuck waiting while the context is reset and reevaluated and then you'll start to question yourself. Maybe you didn't hold Ctrl down? Maybe you didn't press the C key hard enough? So you finally hit it again and then it exits the program.

I've submitted a couple pull requests for main that focus on the user experience, but they seem to have gone mostly ignored. I'm not sure if they're just focusing so hard on performance right now that they're taking an "if it ain't broke don't fix it" approach to main or if they really just want to keep it as simple as possible at all costs.

I have one pull request that's been up for a couple weeks now that improves the user experience through more specific console input handling. It does clean up main.cpp a bit, but at the same time it adds to the console handling code in common.cpp, which may still be undesirable. It processes one key press at a time, which is the approach you'd need to interrupt on Esc or whatever key people wanted to use for interruption. So it's a step towards that, but I've gotten very little feedback and I'm hesitant to put more time into improving the user experience if it's not the desired direction for the project.
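To make the "one key press at a time" point concrete, here is a rough POSIX-only sketch of checking for Esc without blocking. It is not the code from the pull request or from common.cpp; `RawTerminal` and `esc_pressed` are hypothetical names, and Windows would need a different mechanism. It only uses `termios`, `select` and `read`, so it adds no dependency such as ncurses.

```cpp
#include <sys/select.h>
#include <termios.h>
#include <unistd.h>

// RAII guard (hypothetical): switches a terminal to non-canonical, no-echo
// mode so single key presses can be read, and restores it on destruction.
class RawTerminal {
public:
    explicit RawTerminal(int fd = STDIN_FILENO) : fd_(fd) {
        if (!isatty(fd_) || tcgetattr(fd_, &saved_) != 0) {
            return;  // not a terminal (pipe, file): leave everything untouched
        }
        termios raw = saved_;
        raw.c_lflag &= ~static_cast<tcflag_t>(ICANON | ECHO);
        raw.c_cc[VMIN]  = 0;  // read() returns immediately,
        raw.c_cc[VTIME] = 0;  // even when no key is waiting
        active_ = tcsetattr(fd_, TCSANOW, &raw) == 0;
    }
    ~RawTerminal() {
        if (active_) {
            tcsetattr(fd_, TCSANOW, &saved_);
        }
    }
    RawTerminal(const RawTerminal &) = delete;
    RawTerminal & operator=(const RawTerminal &) = delete;

    bool active() const { return active_; }

private:
    int     fd_;
    termios saved_{};
    bool    active_ = false;
};

// Non-blocking check: drains whatever bytes are waiting on fd and reports
// whether one of them was Esc (0x1B). Other pending keys are discarded.
bool esc_pressed(int fd) {
    for (;;) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        timeval no_wait{0, 0};
        if (select(fd + 1, &readable, nullptr, nullptr, &no_wait) <= 0) {
            return false;  // nothing waiting (or an error)
        }
        unsigned char c = 0;
        if (read(fd, &c, 1) != 1) {
            return false;  // end of input
        }
        if (c == 0x1B) {
            return true;
        }
    }
}
```

A generation loop could create a `RawTerminal` before it starts printing tokens and call `esc_pressed(STDIN_FILENO)` after each one. One caveat: arrow keys and other special keys also send sequences that begin with 0x1B, so a real implementation would have to tell a lone Esc apart from those.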

In the README.md that ggerganov wrote, he states that his goal was a "Plain C/C++ implementation without dependencies". So that sounds like it rules out ncurses. Either way, I'm not even mad if the goal is just to keep main as simple as possible, but I'm hoping the real reason it's going mostly ignored is that everyone with permission to review code and accept a pull request is just busy focusing on the performance stuff.

Still, it would be nice to know if the issue is just that they want to minimize the code that goes into the main example at all costs or if it's just not the priority at the moment. I could probably even submit a pull request for a separate cli ui that has a more polished end user experience if they're just worried about breaking the program they use for testing everything, but maybe that's not even desired?

With StableLM and all these other open source LLMs being released, it might be best to just make a new project that tries to make an interface that can use any ggml supported text model. It's the 4-bit (and now 5-bit) quantization and focus on the end user devices that makes this project so exciting, but all of the work being done here seems to be pushed back to the official ggml repo (with the exception of llama.cpp itself) and should, I believe (is this right?), also benefit all models running on ggml. So perhaps such a project would be a better point to focus on for end user experience issues.

DannyDaemonic avatar Apr 29 '23 07:04 DannyDaemonic

Thanks for the insights. Yes, it would be really nice to have a console-based interface that could support any ggml-supported LLMs. I would be glad to see something like this happen.

maddes8cht avatar Apr 29 '23 16:04 maddes8cht

Yes

I agree, using the [Esc] key is a better option

vaishnavi-bhogi19 avatar Jun 04 '23 05:06 vaishnavi-bhogi19

Is there a way anymore to interrupt generation without killing the program? You used to be able to hit ctrl-c and make a new request without restarting the program and losing context, but sometime in the last month or two that was taken away, and ctrl-c only kills the program now.

I understand not wanting ctrl-c to be the chosen key, but having the ability at all is pretty important.

tjkirch avatar Apr 01 '24 04:04 tjkirch

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar May 18 '24 01:05 github-actions[bot]

Not stale. Bad bot.

tjkirch avatar May 18 '24 02:05 tjkirch