text-generation-webui
generate_reply not running normally
In server.py why I am not able to call generate_reply() normally
I have checked modules/text_generation.py: the function generate_reply is defined there, taking two required parameters, question and state; the other two parameters are optional.
I have defined a variable:
tmp_state = {'max_new_tokens': 200, 'seed': -1.0, 'temperature': 0.7, 'top_p': 0.5, 'top_k': 40, 'typical_p': 1, 'repetition_penalty': 1.2, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'min_length': 0, 'do_sample': True, 'penalty_alpha': 0, 'num_beams': 1, 'length_penalty': 1, 'early_stopping': False, 'add_bos_token': True, 'ban_eos_token': False, 'truncation_length': 2048, 'custom_stopping_strings': '', 'skip_special_tokens': True, 'preset_menu': 'Default', 'cpu_memory': '20200', 'auto_devices': False, 'disk': False, 'cpu': False, 'bf16': False, 'load_in_8bit': False, 'wbits': 4, 'groupsize': 128, 'model_type': 'llama', 'pre_layer': 0, 'gpu_memory_0': 6140}
Now, in server.py, when I run generate_reply("My Question Here?", tmp_state) with these two parameters, nothing happens; the function does not even seem to get called.
How to call generate_reply normally?
The generate_reply() function is defined in the modules/text_generation.py file, so to call it from server.py, you need to import the function first.
At the beginning of the server.py file, add the following import statement:
from modules.text_generation import generate_reply
Now you can call the generate_reply() function using the two required parameters, question and state. However, the state parameter should be a string, not a dictionary. To fix this, you can convert the tmp_state dictionary into a JSON string:
import json
tmp_state_str = json.dumps(tmp_state)
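For illustration, this is what that conversion looks like with a small subset of the settings above (the keys shown are just examples):

```python
import json

# A small subset of the sampler settings, used only for illustration
tmp_state = {'max_new_tokens': 200, 'temperature': 0.7, 'top_p': 0.5}

tmp_state_str = json.dumps(tmp_state)
print(tmp_state_str)  # a plain JSON string

# json.loads recovers the original dictionary if the callee needs it back
assert json.loads(tmp_state_str) == tmp_state
```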
Then, you can call the generate_reply() function:
response = generate_reply("My Question Here?", tmp_state_str)
Make sure you've already loaded the model and tokenizer before calling the generate_reply() function. If the model and tokenizer are not loaded, you won't get any response.
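A minimal sketch of that check, using a stand-in namespace instead of the real modules.shared (the attribute names shared.model and shared.tokenizer are assumptions about the project's shared module):

```python
class Shared:
    # Stand-in for modules.shared: holds the loaded model and tokenizer
    model = None
    tokenizer = None

def model_is_loaded(shared):
    # True only once both the model and the tokenizer are present
    return shared.model is not None and shared.tokenizer is not None

shared = Shared()
print(model_is_loaded(shared))              # before loading: False
shared.model, shared.tokenizer = object(), object()
print(model_is_loaded(shared))              # after loading: True
```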
Here's the complete example:
from modules.text_generation import generate_reply
from modules.models import load_model
from modules import shared
import json
# Load the model and tokenizer
shared.model, shared.tokenizer = load_model(shared.model_name)
# Define the state as a JSON string
tmp_state = {'max_new_tokens': 200, 'seed': -1.0, 'temperature': 0.7, 'top_p': 0.5, 'top_k': 40, 'typical_p': 1, 'repetition_penalty': 1.2, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'min_length': 0, 'do_sample': True, 'penalty_alpha': 0, 'num_beams': 1, 'length_penalty': 1, 'early_stopping': False, 'add_bos_token': True, 'ban_eos_token': False, 'truncation_length': 2048, 'custom_stopping_strings': '', 'skip_special_tokens': True, 'preset_menu': 'Default', 'cpu_memory': '20200', 'auto_devices': False, 'disk': False, 'cpu': False, 'bf16': False, 'load_in_8bit': False, 'wbits': 4, 'groupsize': 128, 'model_type': 'llama', 'pre_layer': 0, 'gpu_memory_0': 6140}
tmp_state_str = json.dumps(tmp_state)
# Call the generate_reply() function
response = generate_reply("My Question Here?", tmp_state_str)
# Print the generated response
print(response)
This example assumes you have the shared object with the model_name attribute set. If you're testing this outside the main server code, you may need to modify the example to load the model and tokenizer correctly.
Now, when you call generate_reply(), it should return the generated text based on the input question and the specified state.
I tried this, but I am still not able to call generate_reply from server.py.
I have added a print statement at the start of the generate_reply function, so that I would know when we enter it. But when I call the function, nothing happens; instead, the response I got is this value: "<generator object generate_reply at 0x0000020478664270>"
It seems like the generate_reply() function is a generator function, which is why you're getting a generator object when you call it. To get the actual response from the generator, you need to use the next() function. Here's how you can modify the code:
Replace this line:
response = generate_reply("My Question Here?", tmp_state_str)
With:
response_generator = generate_reply("My Question Here?", tmp_state_str)
response = next(response_generator)
Now, when you call generate_reply(), it should return the generated text based on the input question and the specified state.
Keep in mind that the generator might raise a StopIteration exception if it doesn't have any more items to return. You can handle this exception if needed, but in this case, it shouldn't be an issue since you're only calling next() once.
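One caveat: if generate_reply streams its output by yielding progressively longer partial replies (a common pattern for streaming text generators), a single next() returns only the first partial, not the finished text. A sketch of draining the whole generator and keeping the last value, using a stand-in generator since the real function needs a loaded model:

```python
def fake_generate_reply(question, state):
    # Stand-in for generate_reply(): yields progressively longer
    # partial outputs, the way a streaming generator typically does.
    reply = "Hello world"
    for i in range(5, len(reply) + 1, 3):
        yield reply[:i]

response = ""
for partial in fake_generate_reply("My Question Here?", {}):
    response = partial  # keep only the most recent partial

print(response)  # the final, complete reply
```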
As @avatarproject123 said, they added a print statement, but I don't think we are even reaching the function, since the print statement didn't get triggered.
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.