
Show the total number of tokens and generation speed in chat UI (#2243)

kha84 opened this issue


What's been done:

Two small changes:

  1. one to modules/text_generation.py, to expose the number of tokens identified as "context" and "generated output" in the last call of generate_reply_HF / generate_reply_custom
  2. another to modules/ui_chat.py, to show those token counts in a similar way to how they are shown on the "Default" tab (a rough sketch of the idea follows below)
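
For illustration, here is a minimal sketch of that idea; all names below are hypothetical stand-ins, not the PR's actual code. The generation module records the counts from its last call, and the chat UI module formats them for display:

```python
# Illustrative sketch only -- hypothetical names, not the PR's actual code.

# modules/text_generation.py: remember the counts from the last generation call.
last_token_stats = {"context": 0, "generated": 0}

def record_token_stats(context_tokens: int, generated_tokens: int) -> None:
    # Would be called at the end of generate_reply_HF / generate_reply_custom.
    last_token_stats["context"] = context_tokens
    last_token_stats["generated"] = generated_tokens

# modules/ui_chat.py: format the counts for display in the chat tab,
# similar to the counter already shown on the "Default" tab.
def format_token_stats() -> str:
    c, g = last_token_stats["context"], last_token_stats["generated"]
    return f"Context: {c} tokens | Output: {g} tokens"
```

In the actual change, the formatted string would be wired to a small Gradio component in the chat tab.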

kha84 avatar Jan 23 '24 15:01 kha84

[screenshot of the token counts shown in the chat UI]

kha84 avatar Jan 23 '24 22:01 kha84

Added tokens per second to the display; still WIP. [Screenshot from 2024-01-29 00-01-49]

kha84 avatar Jan 28 '24 21:01 kha84

Need to polish the Gradio part; I don't like the place where it is currently displayed.

kha84 avatar Jan 28 '24 21:01 kha84

Why tho? It's already displayed in the cmd window. Adding it to the UI would be unnecessary unless you're using it through the API, in which case it would be helpful. But other than that, I don't see why people would want to know the speed and token counts if they are just interested in the output text itself. [screenshot]

However, if you are determined to add it, I think the best placement would be somewhere here. The current placement is in the way of the user's vision and might obscure the text they want to read. [screenshot]

YakuzaSuske avatar Feb 09 '24 19:02 YakuzaSuske

Yeah, I was thinking about that placement as well, thanks!

Answering your "why" question:

  1. having this crucial info hidden somewhere in the logs isn't very handy. To see these figures you have to jump back and forth between the web UI and the CLI:
  • For instance, one might want to benchmark different quants or formats of the same model to see how fast they are compared to one another. With these figures exposed in the UI, you could do everything in one place.
  • I also see a lot of people publicly sharing their "tps" figures as a wild guess, without even knowing where to look them up. This should help them as well.
  2. the amount of context used is quite an important runtime parameter, even for casual chat-only users. Right now you have to blind-guess whether your chats have already run out of the context window, or jump to the CLI to check the logs. Having it displayed at all times tells you much more about whether you should still expect coherent answers from your model (see the sketch below).
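
To make both points concrete, here is a hedged sketch of the arithmetic involved (a hypothetical helper, not code from the PR): tokens per second is the number of generated tokens over elapsed wall-clock time, and context usage is the prompt length measured against the model's context window:

```python
import time

def generation_stats(generated_tokens: int, start: float, end: float,
                     context_tokens: int, max_context: int) -> str:
    # Hypothetical helper: combines tokens/sec with context-window usage.
    elapsed = max(end - start, 1e-6)  # guard against division by zero
    tps = generated_tokens / elapsed
    used_pct = 100.0 * context_tokens / max_context
    return (f"{tps:.2f} tokens/s, {generated_tokens} tokens generated, "
            f"context {context_tokens}/{max_context} ({used_pct:.0f}% used)")

# Example: 120 tokens in 4 seconds with a 3000-token prompt and a 4096-token window.
start = time.time()
print(generation_stats(120, start, start + 4.0, 3000, 4096))
```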

kha84 avatar Feb 09 '24 20:02 kha84

I think for me, in my case, it just seems kind of odd, but... tbh I always have the command window open and off to the side, so I guess that's why I asked the question. I thought people always did that too, but 🤷‍♂️ I guess I'm the only one who does, which is why I found it particularly odd: I can see the log, tokens per second, context, etc. at all times just by glancing to my left real quick.

As seen below: [screenshot of the command window open beside the UI]

YakuzaSuske avatar Feb 10 '24 07:02 YakuzaSuske

Why not put it under or next to the {{char}} icon like SillyTavern does?

It shows the number of tokens and the duration.

I think this is a good addition to the UI as I don't have the command window showing.

biship avatar Feb 12 '24 13:02 biship

@YakuzaSuske not everyone interacts with it on the same computer it's running on. I also use it on my phone; sometimes I'm curious, and I'd rather not open a terminal, ssh in, find the program, watch the logs, and then swap between programs.

I think this is a great inclusion and would love to see it implemented

bartowski1182 avatar Feb 15 '24 17:02 bartowski1182

> @YakuzaSuske not everyone interacts with it on the same computer it's running on. I also use it on my phone; sometimes I'm curious, and I'd rather not open a terminal, ssh in, find the program, watch the logs, and then swap between programs.
>
> I think this is a great inclusion and would love to see it implemented

This is the correct answer to "why". There are many setups (remote access, runpod) where you don't always get to see the terminal.

FartyPants avatar Apr 21 '24 14:04 FartyPants