(Studio) Llama2 UI tweaks
- [x] Currently, the chatbot UI flashes aggressively when generating responses. Disable this in gradio.
- [x] Also, the chatbot UI does not fit the window height/width -- this makes long responses a pain to read.
- [ ] Most chatbot users don't actually need the past key values and use the same "session" for unrelated topics, so add a `one-shot` checkbox that, when enabled, clears the history when you submit a new prompt. This reduces the chance that people unintentionally load in a very large set of past key values when it's unnecessary, which hurts performance.
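The one-shot behaviour could be sketched as a plain submit handler along these lines (all names are assumptions, and `generate` is a stub standing in for the real Llama2 call -- not the app's actual API):

```python
# Hypothetical sketch of a one-shot submit handler. `generate` is a
# placeholder for the actual model call, which would reuse past key
# values derived from `history`.
def generate(prompt, history):
    return f"(response to {prompt!r} with {len(history)} prior turns)"

def submit(prompt, history, one_shot):
    # When the one-shot checkbox is ticked, drop the accumulated history
    # so the model doesn't load a large, irrelevant set of past key values.
    if one_shot:
        history = []
    return history + [(prompt, generate(prompt, history))]
```

In gradio terms, the checkbox would simply be passed as an extra input to whatever event handles submission.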
For anyone who picks this up, if I don't get to it at some point:
- `show_progress="none"` is probably what you'll want to set on whichever gradio events are triggering this. I'm guessing that would be whatever is attached to the submit button, and maybe enter on the box where you type stuff in. See the event for the generate buttons on the SD tabs.
- The way this is done for the output gallery is to have a CSS class that sets `min-height:` in viewport height (vh) units on the appropriate control. But there are two problems you'll need to solve if you take that approach. First, finding the appropriate CSS selectors to get to whatever the appropriate control is. Given the way gradio works, this probably isn't as simple as setting `elem_class=<your_classname>` and defining `<your_classname>` to add the height directly at that node; the output gallery has to drill down from there to some descendant node. Second, unlike the output gallery, the chatbot has stuff above and below it, which can change size vertically as things wrap and unwrap based on the browser's horizontal sizing. This means a fixed min-height may not work, because the percentage of the viewport height you want may not be stable.
- I don't know how the chatbot works currently, but I assume the user expectation would be that it works like ChatGPT, with a selectable list of past conversations and the ability to start new ones. That's probably out of scope for this, though.
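For the height fix, a minimal CSS sketch of the approach described above might look like this (the class name and the `70vh` figure are assumptions, and per the second caveat the value may need tuning, or the approach may need replacing entirely):

```css
/* Hypothetical class attached to the chatbot component; given how gradio
   renders, a descendant selector may be needed to reach the node that
   actually scrolls. */
.chatbot-fill {
    min-height: 70vh; /* assumed value; unstable if surrounding controls rewrap */
}
```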
Thanks for offering some ideas. I've merged 1b11c82c9d98e10172a7fbd988cef157493768e9, which accomplishes the first two points. As for the third, a docu-chat/"save conversation" option would probably be best -- if we have an option for one-shot inference, it would be good to support the other end of the spectrum, where we have lots of context we want the model to use.