DeepSpeed
Increasing the token length based on available memory for GPT models
Running the new PR with `queries = ["cat " * 2000] * 4`:

- `max_new_tokens = 10` → `generated_tokens = [10, 10, 10, 10]`
- `max_new_tokens = 100` → `generated_tokens = [100, 100, 100, 99]`
- `max_new_tokens = 300` → `generated_tokens = [299, 300, 299, 297]`
- `max_new_tokens = 1500` → `generated_tokens = [1500, 1276, 1500, 369]`
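For reference, a minimal sketch of how a test like the one above can be reproduced with DeepSpeed inference. The checkpoint name (`gpt2` is a placeholder; the report likely used a larger long-context GPT-family model), the batching setup, and the token-counting logic are assumptions for illustration, not the exact script from this PR:

```python
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint; substitute the model under test
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # GPT tokenizers ship no pad token
tokenizer.padding_side = "left"             # left-pad so generation appends cleanly

model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the model in DeepSpeed's inference engine with kernel injection.
engine = deepspeed.init_inference(model, mp_size=1, dtype=torch.float16,
                                  replace_with_kernel_inject=True)
model = engine.module

queries = ["cat " * 2000] * 4  # the long prompts from the report above

for max_new_tokens in (10, 100, 300, 1500):
    inputs = tokenizer(queries, return_tensors="pt", padding=True).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    prompt_len = inputs["input_ids"].shape[1]
    # Count non-pad tokens produced after the prompt for each sequence
    # (an approximation: a genuinely generated EOS is stripped as well).
    generated = [int((seq[prompt_len:] != tokenizer.pad_token_id).sum())
                 for seq in outputs]
    print(f"max_new_tokens = {max_new_tokens} -> generated_tokens = {generated}")
```

If the counts fall short of `max_new_tokens` (as in the 300 and 1500 runs above), generation is being cut off early, which is the behavior under discussion here.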
Hi, any update on the above, @RezaYazdaniAminabadi? ^^ Were you able to find the error?
Hi @mayank31398,
Looking into it right now; let me first merge this into another PR. I will let you know. Thanks, Reza