one server, multiple sessions (users) ((feature request))

Open ThatCoffeeGuy opened this issue 1 year ago • 9 comments

Hey guys,

First of all, thanks a bunch for working on this project, I'm following closely. Hoping I can contribute with something one day.

My question is: is there a way to run non-shared sessions on a single instance? When I shared my instance on the local network, the same chat history showed up on multiple computers at one point; both were using the same character.

I suppose the most obvious technical challenge would be the fast-switch of context and history especially if two users are talking to two different characters.

Thank you!

ThatCoffeeGuy avatar Feb 05 '23 18:02 ThatCoffeeGuy

This is related to #13 (whose author asked for the opposite). As it is now, the state implementation is indeed poor, as it neither completely remembers nor completely forgets the state when the web UI is opened in another tab.

oobabooga avatar Feb 08 '23 04:02 oobabooga

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, you can reopen it (if you are the author) or leave a comment below.

github-actions[bot] avatar Mar 13 '23 23:03 github-actions[bot]

I believe it is still a valid enhancement request. I am aware that I could try building my own platform and use textgen webui only for inference, but I'd love to see it as a built-in feature as well.

Can @oobabooga reopen it? I might be blind but I can't see where to do it.

ThatCoffeeGuy avatar Mar 14 '23 18:03 ThatCoffeeGuy

FWIW, I'm not able to run multiple instances on different ports; when I try, the second instance is killed with no error message:

(with one instance already running)

$ python server.py --cai-chat --gptq-bits 4 --model llama-13b --listen-port 8002 --verbose
Loading llama-13b...
Loading model ...
Killed

Could it be as easy as changing some command line parameters, or somehow isolating the instances in Docker containers? (See #174)

tensiondriven avatar Mar 19 '23 19:03 tensiondriven

I am perfectly able to run several instances using docker, so that is something you can try too.

MarlinMr avatar Mar 26 '23 09:03 MarlinMr

> I am perfectly able to run several instances using docker, so that is something you can try too.

But wouldn't that result in multiple instances being loaded at the same time, therefore requiring more resources? What I'm asking about here is the possibility for multiple users to interact with the same loaded model.

ThatCoffeeGuy avatar Mar 30 '23 07:03 ThatCoffeeGuy

Yes, it's not efficient for a multi-user setup. I was just commenting on the general idea from the guy above me.

MarlinMr avatar Mar 30 '23 07:03 MarlinMr

I'm also interested in this enhancement. According to the gradio documentation, it seems we need to apply session state to the gradio web UI:

  • discussion on multiple user support from gradio: https://github.com/gradio-app/gradio/issues/3541
  • documentation: https://gradio.app/interface-state/#session-state

ktl014 avatar May 01 '23 17:05 ktl014

I think this would also involve sending incoming generate_reply calls (requests for text generation) to a queue to be processed by a background worker, which can then schedule them for execution so that multiple simultaneous requests don't conflict. Unless I'm reading the code wrong and it already works like that.
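
The queue-plus-worker idea above could look roughly like the sketch below. It is only an illustration under assumptions: `fake_generate_reply` is a hypothetical stand-in for the project's real generation function, and `GenerationWorker` is a made-up name, not something from the codebase. It uses only the Python standard library.

```python
import queue
import threading


def fake_generate_reply(prompt):
    # Hypothetical stand-in for the real text-generation call.
    return f"reply to: {prompt}"


class GenerationWorker:
    """Runs generation requests one at a time on a single background thread."""

    def __init__(self, generate_fn):
        self._generate_fn = generate_fn
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Requests are taken in FIFO order, so only one generation runs at a time.
        while True:
            prompt, result_box, done = self._requests.get()
            try:
                result_box["reply"] = self._generate_fn(prompt)
            except Exception as exc:
                result_box["error"] = exc
            finally:
                done.set()

    def submit(self, prompt):
        """Queue a prompt and block the caller until its reply is ready."""
        result_box = {}
        done = threading.Event()
        self._requests.put((prompt, result_box, done))
        done.wait()
        if "error" in result_box:
            raise result_box["error"]
        return result_box["reply"]


worker = GenerationWorker(fake_generate_reply)
print(worker.submit("hello"))
```

Each caller still waits for its own reply, but the model only ever sees one request at a time, which is the "no conflicts" property described above.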

flurb18 avatar May 11 '23 04:05 flurb18

Forgive my boldness. Is it possible to just change shared.history to something like shared.history[user_id] to support this, so that the histories are separate?
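
As a rough sketch of that idea, assuming some `user_id` is available for each request (the thread does not say where it would come from), a history keyed by user could look like this. The names `add_turn` and `get_history` are hypothetical helpers, not part of the project.

```python
from collections import defaultdict

# Hypothetical replacement for a single global history list:
# one independent list of (user_message, bot_reply) pairs per user_id.
history = defaultdict(list)


def add_turn(user_id, user_message, bot_reply):
    history[user_id].append((user_message, bot_reply))


def get_history(user_id):
    # Return a copy so callers cannot mutate the stored history by accident.
    # .get() avoids creating an empty entry for users who never chatted.
    return list(history.get(user_id, []))


add_turn("alice", "Hi", "Hello Alice")
add_turn("bob", "Hey", "Hello Bob")
print(get_history("alice"))
```

This only separates the stored history; any other code that reads shared state would still need the same treatment.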

jinluyang avatar May 17 '23 11:05 jinluyang

I am about to write an advanced API to handle multiple users (not the UI, though). Will report back.

thusinh1969 avatar May 19 '23 07:05 thusinh1969

Since the code uses gradio Blocks, what we want is this: https://gradio.app/state-in-blocks/

I'm taking a crack at it.

flurb18 avatar May 24 '23 17:05 flurb18

Hi All,

I would like to know if there is any progress on this request. Would love to know if we have any parameter available for state management

mohdimran043 avatar May 25 '23 04:05 mohdimran043

+1

surak avatar May 27 '23 12:05 surak

> Since the code uses gradio Blocks, what we want is this: https://gradio.app/state-in-blocks/
>
> I'm taking a crack at it.

@flurb18 good luck! I think the difficulty here is that the shared parameters are all over the place. If you want to support multiple users, you need to give each of them a gradio State object for the history, and make sure that for each button press (generate, submit, w/e) you 1. pass your state to the callback and 2. follow the callback all the way down to check that the shared one is not used at any later point.

I tried something like this, with partial success and it is quite messy.. I'm hoping there's a simpler solution out there..
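
The pattern described above (state goes into the callback and comes back out, with nothing read from a shared global) can be sketched without any UI code. This is a simplified illustration, not the project's actual callbacks: `on_submit` and `fake_generate_reply` are hypothetical names. In a gradio Blocks app, the `history` value would presumably live in a `gr.State` component listed in both the inputs and outputs of the event handler.

```python
def fake_generate_reply(message, history):
    # Hypothetical stand-in for the model call; it reads only its arguments.
    return f"echo {len(history) + 1}: {message}"


def on_submit(message, history):
    """Callback that takes the session's history and returns the updated one.

    No module-level state is read or written, so two sessions cannot
    see each other's turns.
    """
    reply = fake_generate_reply(message, history)
    # Build a new list instead of mutating the one that was passed in.
    return history + [(message, reply)]


# Two simulated sessions, each holding its own history value.
session_a = on_submit("hi", [])
session_b = on_submit("hello", [])
session_a = on_submit("again", session_a)
print(session_a)
print(session_b)
```

The messy part mentioned above is step 2: every function reached from the callback has to accept the history as an argument the way `fake_generate_reply` does here, rather than reaching for a shared object.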

OrenRele avatar May 30 '23 06:05 OrenRele

Could we put a little more priority on this feature? CC @oobabooga

ktl014 avatar Jun 02 '23 03:06 ktl014

Hi, I made a draft PR here. I agree with @OrenRele's and @jinluyang's points, because the project relies on a shared object. https://github.com/oobabooga/text-generation-webui/pull/2573

HyeongminMoon avatar Jun 08 '23 05:06 HyeongminMoon

See here https://github.com/oobabooga/text-generation-webui/pull/2573#issuecomment-1620863925, it seems to me that this should already work, but it would be good to have someone actually test it with multiple users

Just launch the ui in chat mode with --multi-user flag and see if anything weird happens

oobabooga avatar Jul 05 '23 00:07 oobabooga

Can anyone confirm it works yet?

sandwichdoge avatar Jul 26 '23 02:07 sandwichdoge

WebUI works, API works but only one user at a time due to no batched inferencing. Everybody has to queue.

sandwichdoge avatar Jul 28 '23 04:07 sandwichdoge

> See here #2573 (comment), it seems to me that this should already work, but it would be good to have someone actually test it with multiple users
>
> Just launch the ui in chat mode with --multi-user flag and see if anything weird happens

This doesn't save the history, right?

muhammad-asn avatar Aug 01 '23 04:08 muhammad-asn

> WebUI works, API works but only one user at a time due to no batched inferencing. Everybody has to queue.

How does it work? I tried it in an incognito window and in a normal one, and the Web UI still shows the same prompt in both.

muhammad-asn avatar Aug 01 '23 06:08 muhammad-asn

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

github-actions[bot] avatar Dec 08 '23 23:12 github-actions[bot]

Chiming in that I'd be really interested to see this, even in a basic form so two or three people can use the service but only see their own prompts.

Kosyne avatar Feb 12 '24 10:02 Kosyne