How do I load a Character using the API mode?

Open Subarasheese opened this issue 1 year ago • 9 comments

Description

Hello,

I made an instructional character to do a certain task, but I cannot find anything in the documentation about how to use characters in API mode (the api-example.py and api-example-stream.py files).

I really need it, because when I enter my entire character context as an input message, it is very slow and if I don't trim it I run into CUDA out of memory errors.

Additional Context

I want to know if there is a parameter to make the textgen app load the model and the character when running through the "API" mode.

Apr 22 '23 11:04 Subarasheese

use another front end like tavern or sillytavern, etc.

Apr 22 '23 12:04 Ph0rk0z

use another front end like tavern or sillytavern, etc.

Is there a guide that tells me how to load a character from CLI parameters? Because I am building a Python script to automate a task for the LLM to do (I am not chatting with a "waifu"). The LLM in question is a 4bit quantized Vicuna.

Apr 22 '23 15:04 Subarasheese

Maybe consider langchain? You can just load the character as part of the prompt. Put on verbose mode when you do it through the UI to see the structure of the prompt.
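
A minimal sketch of loading the character as part of the prompt on the client side. The function name `build_character_prompt` and the layout it produces (context, then past turns, then the new message) are assumptions for illustration; the verbose output from the UI shows the structure the app really uses.

```python
def build_character_prompt(context, name1, name2, history, user_message):
    """Assemble a chat-style prompt from character fields.

    history is a list of (user_text, bot_text) pairs.
    """
    parts = [context.strip(), ""]
    for user_text, bot_text in history:
        parts.append(f"{name1}: {user_text}")
        parts.append(f"{name2}: {bot_text}")
    parts.append(f"{name1}: {user_message}")
    # End with the bot's name so the model continues as the character.
    parts.append(f"{name2}:")
    return "\n".join(parts)


prompt = build_character_prompt(
    context="Reverse prompter is an assistant that only replies with short requests.",
    name1="### Human",
    name2="### Assistant",
    history=[('"cyborg hyena, concept art"',
              "I want a prompt for a cyborg hyena concept art.")],
    user_message='"masterpiece, woman, cybernetic world"',
)
print(prompt)
```

The resulting string can be sent as the plain input prompt, so nothing character-specific has to be understood by the server.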

Apr 22 '23 16:04 Ph0rk0z

For api-stream, you can hit the fn_index of the load_character function (I think it's 38?) and pass it data (character, name1, name2).

Apr 22 '23 16:04 brandonj60

On the API code, I am doing something like this:

# Generation parameters
# Reference: https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig
params = {
    'max_new_tokens': 1025, 'seed': -1.0, 'temperature': 0.7, 'top_p': 0.5, 'top_k': 40, 'typical_p': 1,
    'repetition_penalty': 1.2, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'min_length': 0,
    'do_sample': True, 'penalty_alpha': 0, 'num_beams': 1, 'length_penalty': 1, 'early_stopping': False,
    'add_bos_token': True, 'ban_eos_token': False, 'truncation_length': 2048, 'custom_stopping_strings': [],
    'name1': '### Human:',
    'name2': '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:',
    'greeting': 'Please send a Stable Diffusion prompt so I can write down the reverse question that generated it.',
    # Adjacent string literals are joined into a single context string.
    'context': (
        'Stable Diffusion reverse prompter is a virtual assistant that does not chat with the user and only send reverse prompts instead, targeting a main subject of the Stable Diffusion image prompt. '
        'It is never too descriptive about the image, it only pays attention to the main subject. '
        'For example, in a response for a prompt "photography, blond woman in a gold suit riding a bike,4k,trending on instagram, intricate, cyberpunk,outdoors setting", it would generate a question like: "Can you make me a Stable Diffusion prompt for a blond woman in a gold suit riding a bike?". '
        'Other prompts like "a portrait of a girl skull face, marilyn monroe, in the style of artgerm, charlie bowater, atey ghailan and mike mignola, vibrant colors and hard shadows and strong rim light, plain background, comic cover art, trending on artstation" would result in a response like "Write a Stable Diffusion prompt for a portrait of a girl skull face inspired by marilyn monroe.". '
        'This assistant provides replies with short requests focusing on a core subject without being too descriptive. '
        'It will only ever mention the subject and art medium.\n\n'
        '### Human:: "totem aztek tribal deepdream intricate, elegant, sharp focus, illustration, highly detailed, digital painting, concept art, matte, art by WLOP and Artgerm and Greg Rutkowski and Alphonse Mucha, masterpiece"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Can you give me a Stable Diffusion prompt for a totem aztek tribal?\n'
        '### Human:: "a portrait of a beautiful willa holland as a 1950s rockabilly greaser, art by lois van baarle and loish and ross tran and rossdraws and sam yang and samdoesarts and artgerm, digital art, highly detailed, intricate, sharp focus, trending on artstation hq, deviantart, unreal engine 5, 4k uhd image"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Write a Stable Diffusion prompt of a portrait of a beautiful willa holland as a 1950s rockabilly greaser.\n'
        '### Human:: "isometric chubby 3d game cannon, with detailed, clean, cartoon, octane render, unreal engine, artgerm, artstation"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Write a Stable Diffusion prompt of an isometric chubby 3d game cannon.\n'
        '### Human:: "the legendary island sized lion snake, made by Stanley Artgerm Lau, WLOP, Rossdraws, ArtStation, CGSociety, concept art, cgsociety, octane render, trending on artstation, artstationHD, artstationHQ, unreal engine, 4k, 8k,"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Create a Stable Diffusion prompt for The Legendary Island Sized Lion Snake.\n'
        '### Human:: "a comic potrait of a female necromamcer with big and cute eyes, fine - face, realistic shaded perfect face, fine details. night setting. very anime style. realistic shaded lighting poster by ilya kuvshinov katsuhiro, magali villeneuve, artgerm, jeremy lipkin and michael garmash, rob rey and kentaro miura style, trending on art station"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Write a Stable Diffusion prompt for a female necromamcer with big and cute eyes, on the style of comics.\n'
        '### Human:: "a simple micro-service deployed to a public cloud, security, attack vector, trending on Artstation, painting by Jules Julien, Leslie David and Lisa Frank, muted colors with minimalism"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Write a Stable Diffusion image prompt of a painting depicting simple micro-service being deployed to a public cloud.\n'
        '### Human:: "cybernetically enhanced cyborg hyena, realistic cyberpunk 2077 concept art"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: I want a Stable Diffusion prompt for a cyborg hyena concept art.\n'
        '### Human:: "beautiful, young woman, cybernetic, cyberpunk, detailed gorgeous face, flowing hair, vaporwave aesthetic, synthwave , digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Please create a Stable Diffusion prompt for a beautiful young woman with a cybernetic appearance in a cyberpunk theme, illustrated in a digital painting style.\n'
        '### Human:: "a fat ugly man, in the style of artgerm, gerald brom, atey ghailan and mike mignola, vibrant colors and hard shadows and strong rim light, plain background, comic cover art, trending on artstation"\n'
        '### Assistant: Sure! Here is my request with a very short image description, mentioning only the main subject and medium:: Write a Stable Diffusion image prompt that generates an image of a fat ugly man, on the style of comic art.\n'
    ),
    'end_of_turn': '', 'chat_prompt_size': 2048, 'chat_generation_attempts': 1, 'stop_at_newline': False,
    'mode': 'cai-chat',
}


# Input prompt
prompt = "masterpiece, woman, cybernetic world, intricate, cyberpunk, illustration, concept art"

This uses the parameters from Character mode, such as context, name1, name2, and greeting.

However, it seems to be completely ignoring the context.

I simulated dialog with this:

# Input prompt
prompt = '''
### Human: "What did I ask previously? What do you do?"
### Assistant:'''

And here is what it responded:

### Assistant:You asked me to describe what a data warehouse is and how it differs from a database.

That had nothing to do with what I asked, so I assume it is ignoring the context in the payload.

How do I get it to stop ignoring the "context" parameter?
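
One possible workaround, sketched under the assumption that this endpoint reads only the prompt text and the sampling parameters and does not read chat-only keys (the behaviour above suggests this, but it is not confirmed against the server code): take the chat fields out of the payload and prepend the context to the prompt on the client. `CHAT_KEYS`, `split_chat_fields`, and `prompt_with_context` are hypothetical names, and the small `params` dict stands in for the full payload.

```python
# Keys assumed to matter only to the chat UI, taken from the payload above.
CHAT_KEYS = ("name1", "name2", "greeting", "context", "end_of_turn",
             "chat_prompt_size", "chat_generation_attempts", "mode")


def split_chat_fields(params):
    """Return (generation_params, chat_fields) without mutating params."""
    chat = {k: v for k, v in params.items() if k in CHAT_KEYS}
    generation = {k: v for k, v in params.items() if k not in CHAT_KEYS}
    return generation, chat


def prompt_with_context(chat, user_message):
    """Put the character context in front of the new message.

    With name1 = '### Human:' this yields '### Human:: ...', the same
    shape as the example turns inside the context above.
    """
    return (f"{chat['context']}{chat['name1']}: {user_message}\n"
            f"{chat['name2']}:")


params = {
    "max_new_tokens": 200,
    "temperature": 0.7,
    "name1": "### Human:",
    "name2": "### Assistant:",
    "context": "This assistant only replies with short requests.\n",
}
generation_params, chat = split_chat_fields(params)
prompt = prompt_with_context(chat, '"masterpiece, woman, cyberpunk"')
print(prompt)
```

This does not remove the cost of processing a long context on every request; it only makes sure the model actually sees it.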

Apr 22 '23 16:04 Subarasheese

This would be best put in Discussions instead of Issues.

Apr 22 '23 21:04 TFWol

@TFWol I actually think there's a really useful feature request hiding in here, which is to be able to specify a Character programmatically. It could be with a command-line arg, or a settings json; anything that would allow someone to stand up an instance of text-generation-webui without having to physically click through the gradio interface.

Apr 23 '23 01:04 tensiondriven

@tensiondriven I agree it'd be a useful feature (if it's not possible somehow), but if that's the case the title wording should probably be tweaked a bit to reflect a request. I think it would help make things easier for the Dev to triage, but that's mostly my broader point of view from looking at the daunting number of Issues.

Then again, it is labeled as an enhancement, so I'm probably just being nit-picky.

Apr 23 '23 15:04 TFWol

Yes, the title of the thread is a question because I did not know for sure whether this feature was possible. It seems it isn't, so I think it's valid to have a discussion about it. This would be a very important feature to have: even GPT-3.5, which is tailored to be a chatbot model, has an API where you can define context and add "personality" to it, and characters from the Ooba GUI follow the same principle.

Apr 23 '23 16:04 Subarasheese

I agree, this could be closed in favor of something more descriptive.

Apr 23 '23 18:04 tensiondriven

@tensiondriven Before closing, can you create an issue suggesting the feature? It is counterproductive to close issues that bring valid discussion points just because they don't fit an expected issue pattern.

Apr 23 '23 22:04 Subarasheese

@Subarasheese I was just suggesting you could edit the title. Is it locked?

Apr 23 '23 23:04 TFWol

I've already tried to get changes like these into the main branch, but I think it's kind of low priority.

I've started splitting out smaller sections to try to get those merged. If you want to look at example code for some of what you're trying to do, PR #976 is my most recent attempt at adding the ability to load a character from the command line, and PR #1250 is my most recent update to the existing Gradio API to allow using characters with their context, logs, and so on intact.

Unfortunately a lot of the base has changed since I made these, so I recently closed both PRs to pursue my new strategy, but they still might be of help to you as you try to figure stuff out.

Apr 24 '23 00:04 bmoconno

@bmoconno Perhaps it's fortunate, if those underlying changes are making the codebase more modular and decoupled. I really think the codebase is suffering from growing pains; Gradio is coupled to the models such that it can't exit cleanly when something crashes, the server.py file mixes concerns with model loading (those two are kind of the same thing), the API hooks are incomplete (in my opinion, of course) and overall it's difficult to know where a code change should live. I don't mean to be all-negatives here, but this is something I wanted to share.

Getting back to the issue at hand: it seems to me that all state should be saved to settings, including Character, and that state should be reloaded on browser refreshes as one would expect with a web app. Any setting would/could be settable/gettable via API, CLI/settings file, and/or web UI.

I don't think this would be too difficult to implement for the most commonly used settings, and once we had some kind of get/set settings api, new features could use it going forward. Then, old/existing features could be switched over opportunistically.
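
To make the idea concrete, a get/set settings layer could look roughly like the sketch below. Nothing here exists in text-generation-webui; `SettingsStore`, the `character` key, and the JSON file layout are hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


class SettingsStore:
    """Hypothetical single source of truth for UI, CLI, and API settings."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.values = json.load(f)

    def get(self, key, default=None):
        return self.values.get(key, default)

    def set(self, key, value):
        self.values[key] = value
        # Persist on every change so a browser refresh or restart sees it.
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.values, f, indent=2)


settings_path = os.path.join(tempfile.mkdtemp(), "settings.json")
store = SettingsStore(settings_path)
store.set("character", "Example")

# A second instance stands in for a page reload or a new API request.
reloaded = SettingsStore(settings_path)
print(reloaded.get("character"))
```

The web UI, a command-line flag, and an API endpoint would then all call the same `get` and `set`, which is what would let new features adopt it going forward.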

Anyhoo, I'm armchair-architecting now, so I'll stop. I did have a branch which mostly decoupled the view (Gradio) from the model code, but getting it PR-ready would have resulted in a pretty invasive PR and I was concerned that it wouldn't get merged, or that I wouldn't have time/energy to stay on top of change requests needed to get it merged.

I imagine Stable Diffusion went through similar growing pains early on. I wonder what we could learn from @automatic1111. Overall, I think moving TGWUI to a more modular/pluggable architecture would go a long way to making it less fragile and enable faster iteration, as changes could be reviewed in isolation. Ok, I'm really done now :)

@Subarasheese I don't have any special authority to close issues, though I would be happy to create a separate issue to track setting/persisting character.

If the API supports loading a settings file (which it does via command line, not sure about API), then we could support setting character via settings, and you could provide a settings file name in your API request; I think that would solve this for you too.

Apr 24 '23 01:04 tensiondriven

@tensiondriven

Perhaps it's fortunate, if those underlying changes are making the codebase more modular and decoupled.

I definitely agree, I don't envy all the work Oobabooga is having to do trying to keep up with all these changes. It seems like there's a new "big thing" in LLMs every other day, it's gotta be impossible to know which way you're going or how to get there.

the API hooks are incomplete (in my opinion, of course) and overall it's difficult to know where a code change should live.

There was just a pretty big update to the non-Gradio API #990, maybe that will be a good framework for adding additional endpoints to change settings like you are suggesting here.

Apr 24 '23 01:04 bmoconno

Created https://github.com/oobabooga/text-generation-webui/issues/1508

I also learned in the process that I requested a similar feature about 1000 issues ago (30 days)! D'oh!

Apr 24 '23 01:04 tensiondriven

@bmoconno I agree, and I'll go one step further and say I don't think it makes sense for @oobabooga to be expected to manage the whole project alone. If I were in his shoes, I'd rather be working on sexy features than managing an issue queue and asking people to squash their commits before merging.

I offered to help with the administrative stuff in an issue a few weeks ago, but it was closed with no response. I can respect that: it's his project, he's BDFL, and I can always fork if I have a problem with it. It still leaves open the question of how the project can grow and respond efficiently when everything is moving so quickly.

Apr 24 '23 01:04 tensiondriven

Closing due to pointless discussion

Apr 24 '23 02:04 oobabooga