
generate_reply not running normally

Open avatarproject123 opened this issue 1 year ago • 5 comments

Why am I not able to call generate_reply() normally from server.py?

I have checked modules/text_generation.py, where the function generate_reply is defined. It takes two required parameters, question and state, and two optional ones.

I have defined a variable:

tmp_state = {'max_new_tokens': 200, 'seed': -1.0, 'temperature': 0.7, 'top_p': 0.5, 'top_k': 40, 'typical_p': 1, 'repetition_penalty': 1.2, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'min_length': 0, 'do_sample': True, 'penalty_alpha': 0, 'num_beams': 1, 'length_penalty': 1, 'early_stopping': False, 'add_bos_token': True, 'ban_eos_token': False, 'truncation_length': 2048, 'custom_stopping_strings': '', 'skip_special_tokens': True, 'preset_menu': 'Default', 'cpu_memory': '20200', 'auto_devices': False, 'disk': False, 'cpu': False, 'bf16': False, 'load_in_8bit': False, 'wbits': 4, 'groupsize': 128, 'model_type': 'llama', 'pre_layer': 0, 'gpu_memory_0': 6140}

Now, in server.py, when I run generate_reply("My Question Here?", tmp_state) with these two parameters, nothing happens; the function does not even seem to get called.

How to call generate_reply normally?

avatarproject123 avatar May 04 '23 11:05 avatarproject123

The generate_reply() function is defined in the modules/text_generation.py file, so to call it from server.py, you need to import the function first.

At the beginning of the server.py file, add the following import statement:

from modules.text_generation import generate_reply

Now you can call the generate_reply() function using the two required parameters, question and state. The state parameter should be the settings dictionary itself, and your tmp_state is already in that form, so you can pass it directly:

response = generate_reply("My Question Here?", tmp_state)

Make sure you've already loaded the model and tokenizer before calling the generate_reply() function. If the model and tokenizer are not loaded, you won't get any response.

Here's the complete example:

from modules import shared
from modules.models import load_model
from modules.text_generation import generate_reply

# Load the model and tokenizer
shared.model, shared.tokenizer = load_model(shared.model_name)

# Define the generation settings as a dictionary
tmp_state = {'max_new_tokens': 200, 'seed': -1.0, 'temperature': 0.7, 'top_p': 0.5, 'top_k': 40, 'typical_p': 1, 'repetition_penalty': 1.2, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'min_length': 0, 'do_sample': True, 'penalty_alpha': 0, 'num_beams': 1, 'length_penalty': 1, 'early_stopping': False, 'add_bos_token': True, 'ban_eos_token': False, 'truncation_length': 2048, 'custom_stopping_strings': '', 'skip_special_tokens': True, 'preset_menu': 'Default', 'cpu_memory': '20200', 'auto_devices': False, 'disk': False, 'cpu': False, 'bf16': False, 'load_in_8bit': False, 'wbits': 4, 'groupsize': 128, 'model_type': 'llama', 'pre_layer': 0, 'gpu_memory_0': 6140}

# Call the generate_reply() function
response = generate_reply("My Question Here?", tmp_state)

# Print the generated response
print(response)

This example assumes the shared module already has its model_name attribute set. If you're testing this outside the main server code, you may need to modify the example to load the model and tokenizer correctly.

Now, when you call generate_reply(), it should return the generated text based on the input question and the specified state.

mironkraft avatar May 04 '23 13:05 mironkraft


I tried this, but I am still not able to call generate_reply from server.py.

avatarproject123 avatar May 05 '23 06:05 avatarproject123

I added a print statement at the start of the generate_reply function so I could confirm that we entered it. But when I call the function, nothing happens; instead, the call returns this value: "<generator object generate_reply at 0x0000020478664270>"

avatarproject123 avatar May 05 '23 07:05 avatarproject123


generate_reply() is a generator function, which is why you get a generator object back when you call it. Calling a generator function does not execute any of its body; the body (including your print statement) only runs once you start iterating over the generator. To get the actual response, you need to use the next() function. Here's how you can modify the code:

Replace this line:

response = generate_reply("My Question Here?", tmp_state)

With:

response_generator = generate_reply("My Question Here?", tmp_state)
response = next(response_generator)

Now, when you call generate_reply(), it should return the generated text based on the input question and the specified state.

Keep in mind that the generator raises StopIteration once it has no more items to yield. Also, a streaming generator like this typically yields progressively longer partial outputs, so a single next() call only gives you the first chunk; to get the complete reply, iterate the generator until it is exhausted and keep the last value it yielded.
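The next()-versus-full-iteration difference can be sketched with a stand-in generator (fake_generate_reply below is a made-up illustration, not the real function) that yields progressively longer strings the way a streaming generator does:

```python
def fake_generate_reply(question, state):
    # Stand-in for a streaming generator: each yield is a longer
    # partial version of the final reply.
    partial = ""
    for token in ["Hello", " world", "!"]:
        partial += token
        yield partial

gen = fake_generate_reply("My Question Here?", {"max_new_tokens": 200})

# next() returns only the first, partial chunk...
first_chunk = next(gen)
print(first_chunk)  # Hello

# ...so iterate the generator to exhaustion and keep the last value
# to get the complete reply.
reply = first_chunk
for reply in gen:
    pass
print(reply)  # Hello world!
```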

mironkraft avatar May 05 '23 09:05 mironkraft


As @avatarproject123 said, they added a print statement, but I don't think we are even entering the function, since the print statement never got triggered.

ashmitsharma avatar May 05 '23 10:05 ashmitsharma
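For reference, what is described above is expected behavior for a generator function: calling it creates a generator object without executing any of the body, so a print at the top of generate_reply will not fire until the generator is actually iterated. A minimal standalone sketch:

```python
def gen_with_print():
    # This line does NOT run when gen_with_print() is called;
    # it only runs once the generator is first iterated.
    print("entered the function")
    yield "result"

g = gen_with_print()   # nothing is printed here
value = next(g)        # prints "entered the function"
print(value)           # prints "result"
```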

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

github-actions[bot] avatar Jun 04 '23 23:06 github-actions[bot]