Add text generation benchmark for the new `samplers` API
We currently have a script for benchmarking our text generation utility functions: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/benchmarks/text_generation.py. However, since we intend to move to the samplers API, a benchmarking script for it, i.e., for the `gpt2_lm.generate(...)` method, will be useful: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/gpt2/gpt2_causal_lm.py#L337.
I would like to take this up if it is available. Thanks a lot!
We'll need inputs from @chenmoneygithub here!
@atharvapurdue Thanks for your interest! Feel free to create a Colab to get familiar with the sampler APIs and accumulate some stats. We are still making a few changes to the sampler API, so you may also want to wait a while before starting work on this.
@chenmoneygithub , Sure! Thanks!
@abheesht17 Here's a simple benchmarking script that you can use to measure the performance of the `gpt2_lm.generate()` method:

```python
import time

import keras_nlp

# Set up the GPT-2 model.
model = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")

# Set up the inputs.
input_text = "The quick brown fox jumps over the lazy dog"

# Set up the parameters.
num_steps = 1000
batch_size = 1

# Generate the text. `generate()` takes a string or a list of strings,
# so batching is done by passing `batch_size` copies of the prompt.
start_time = time.time()
for _ in range(num_steps):
    model.generate([input_text] * batch_size)
end_time = time.time()

# Calculate the average time per step.
avg_time_per_step = (end_time - start_time) / num_steps

# Print the result.
print(f"Avg time per step: {avg_time_per_step:.6f} seconds")
```
This script sets up the GPT-2 model and input text, and then generates text using the gpt2_lm.generate() method. It repeats this process for a specified number of steps and measures the total time taken. Finally, it calculates the average time per step and prints the result.
You can adjust the num_steps and batch_size parameters to suit your needs. A larger num_steps value will give you a more accurate measurement of the performance, but will also take longer to run. Similarly, a larger batch_size value will generate more text in each step, but will also consume more memory.
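One caveat with a plain wall-clock loop like the one above: the first `generate()` call typically pays one-off costs (graph tracing/compilation, cache setup), which can skew the average. A minimal, library-agnostic sketch of a timing harness with warm-up runs — here `generate_fn` is a stand-in placeholder, not a keras-nlp API; in practice you would pass `gpt2_lm.generate`:

```python
import time


def benchmark(generate_fn, prompt, num_steps=100, warmup_steps=2):
    """Return the average seconds per call, excluding warm-up runs."""
    # Warm-up: absorb one-off costs such as tracing or compilation.
    for _ in range(warmup_steps):
        generate_fn(prompt)

    start = time.perf_counter()
    for _ in range(num_steps):
        generate_fn(prompt)
    end = time.perf_counter()

    return (end - start) / num_steps


# Stand-in generator for demonstration only; replace with gpt2_lm.generate.
def fake_generate(prompt):
    return prompt + " ..."


avg = benchmark(fake_generate, "The quick brown fox", num_steps=10)
print(f"Avg time per step: {avg:.6f} seconds")
```

Using `time.perf_counter()` rather than `time.time()` avoids clock-adjustment artifacts on short intervals.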
Is this issue open for contribution? If yes, I would like to contribute.
This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for more than 1 year.