
Add text generation benchmark for the new `samplers` API

Open abheesht17 opened this issue 2 years ago • 7 comments

We currently have a script for benchmarking our text generation utility functions: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/benchmarks/text_generation.py. However, since we intend to move to the `samplers` API, a benchmarking script for it, i.e. for the `gpt2_lm.generate(...)` method, would be useful: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/gpt2/gpt2_causal_lm.py#L337.
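As a rough illustration of what such a benchmark might compare, here is a minimal, dependency-free timing harness. The helper names (`time_generate`, `generate_fn`) are hypothetical, not part of KerasNLP; in a real benchmark, `generate_fn` would wrap `gpt2_lm.generate(...)` after compiling the model with each sampler under test.

```python
import time


def time_generate(generate_fn, prompt, num_runs=10):
    """Return average wall-clock seconds per call to a generation callable.

    `generate_fn` is any callable taking a prompt string; in practice it
    would wrap `gpt2_lm.generate(...)`.
    """
    start = time.perf_counter()
    for _ in range(num_runs):
        generate_fn(prompt)
    return (time.perf_counter() - start) / num_runs


# Hypothetical usage, assuming the model is compiled with each sampler
# before timing (exact sampler identifiers depend on the final API):
#
#   for sampler in ["greedy", "top_k", "beam"]:
#       gpt2_lm.compile(sampler=sampler)
#       avg = time_generate(lambda p: gpt2_lm.generate(p), prompt)
#       print(f"{sampler}: {avg:.4f} s/call")
```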

abheesht17 avatar Feb 27 '23 19:02 abheesht17

I would like to take this up if it is available. Thanks a lot!

atharvapurdue avatar Feb 27 '23 19:02 atharvapurdue

We'll need inputs from @chenmoneygithub here!

abheesht17 avatar Feb 27 '23 19:02 abheesht17

@atharvapurdue Thanks for your interest! Feel free to create a colab to get familiar with the sampler APIs and accumulate some stats. We are still making a few changes to the sampler API, so you may also want to wait a while before starting work on this.

chenmoneygithub avatar Feb 28 '23 03:02 chenmoneygithub

@chenmoneygithub , Sure! Thanks!

atharvapurdue avatar Feb 28 '23 03:02 atharvapurdue

@abheesht17 Here's a simple benchmarking script that you can use to measure the performance of the `gpt2_lm.generate()` method:

```python
import time

from keras_nlp.models.gpt2 import gpt2_causal_lm

# Set up the GPT-2 model from a preset
model = gpt2_causal_lm.GPT2CausalLM.from_preset("gpt2_base_en")

# Set up the inputs
input_text = "The quick brown fox jumps over the lazy dog"

# Set up the parameters
num_steps = 1000
batch_size = 1

# Generate the text, passing a batch of `batch_size` prompts per step
start_time = time.time()
for _ in range(num_steps):
    model.generate([input_text] * batch_size)
end_time = time.time()

# Calculate the average time per step
avg_time_per_step = (end_time - start_time) / num_steps

# Print the result
print(f"Avg time per step: {avg_time_per_step:.6f} seconds")
```

This script sets up the GPT-2 model and input text, and then generates text using the `model.generate()` method. It repeats this process for a specified number of steps and measures the total time taken. Finally, it calculates the average time per step and prints the result.

You can adjust the `num_steps` and `batch_size` parameters to suit your needs. A larger `num_steps` value will give you a more accurate measurement of the performance, but will also take longer to run. Similarly, a larger `batch_size` value will generate more text in each step, but will also consume more memory.
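One caveat worth adding (an assumption about how compiled generation typically behaves, not something stated above): the first call to `generate()` often includes one-time tracing/compilation cost, which inflates the average. A warm-up call before the timed loop gives steadier numbers. A minimal sketch with a hypothetical `time_with_warmup` helper:

```python
import time


def time_with_warmup(fn, num_steps, warmup=1):
    """Average wall-clock seconds per call, excluding warm-up calls.

    The warm-up calls absorb one-time costs such as graph tracing or
    compilation, which would otherwise skew the timed average.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(num_steps):
        fn()
    return (time.perf_counter() - start) / num_steps


# Hypothetical usage with the script above:
#   avg = time_with_warmup(lambda: model.generate(input_text), num_steps=100)
```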

NiharJani2002 avatar Mar 10 '23 12:03 NiharJani2002

Is this issue open for contribution? If yes, I would like to contribute.

muditgaur-1009 avatar Jun 11 '23 10:06 muditgaur-1009

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Mar 12 '24 01:03 github-actions[bot]

This issue was closed because it has been inactive for more than 1 year.

github-actions[bot] avatar Mar 12 '25 02:03 github-actions[bot]