text-generation-inference Explain the generation parameters in the benchmarking utility

Feature request

Improved README.md for the benchmarking utility that explains the different command line arguments.

Motivation

The benchmarking tool is awesome, I would just like to have some additional information about the command line parameters, specifically sequence-length and decode-length. I think that I know what these parameters mean in the context of this project, but it would be helpful to have a description so that people can relate the benchmark performance numbers to real-world usage. More detailed descriptions of the values displayed by the benchmarking utility would also be helpful, especially the difference between prefill and decode performance.

Your contribution

I would be happy to help write some documentation if someone could provide more detailed descriptions of these parameters and how they relate to model inference.

Jun 06 '23 18:06 Blair-Johnson

Adding real rustdoc to the clap args here: https://github.com/huggingface/text-generation-inference/blob/main/benchmark/src/main.rs#L16

That should be plenty of documentation. The doc comments automatically document both the CLI itself ( -h ) and the rustdoc, and in the README we could simply tell users to run -h for advanced usage.

Jun 06 '23 21:06 Narsil

That would be a great start. It would also be nice to document the actual metrics that are captured in the output of the benchmark. What is actually being measured, etc.

Jun 07 '23 15:06 Blair-Johnson

~~Accidental close~~

Jun 07 '23 15:06 Blair-Johnson

Are you willing to open a PR for it ?

Jun 08 '23 08:06 Narsil

Yes, I just need some explanation of the different measurements and parameters.

Jun 09 '23 17:06 Blair-Johnson

Hi @Blair-Johnson, could you please explain sequence-length and decode-length for me? I find them confusing. Can I roughly consider them to be the input length and the output length? Thanks!

Jun 15 '23 11:06 Tracin

Sorry, I missed your reaction @Blair-Johnson. I created a PR with a first draft, feel free to ask questions so we can make the descriptions even clearer.

Jun 15 '23 14:06 Narsil

@Narsil No worries, thank you for your work! I'll update your PR with some questions. @Tracin Take a look at the files changed in the PR for Narsil's descriptions of those parameters. I think your description is probably accurate, but I have a few follow-up questions that I'll ask there.

Jun 15 '23 15:06 Blair-Johnson
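
For readers landing here, a rough sketch of the interpretation discussed above: it assumes sequence-length is the number of prompt (input) tokens handled in the prefill phase and decode-length is the number of tokens generated afterwards (output). The `RunShape` struct, its `batch_size` field, and the method names are hypothetical and only illustrative; they are not taken from the benchmark's source, so check the PR for the authoritative descriptions.

```rust
/// Hypothetical description of one benchmark run.
/// These names are illustrative and are NOT the benchmark tool's real types.
struct RunShape {
    /// Assumed: number of requests processed together.
    batch_size: u32,
    /// Assumed: prompt (input) tokens per request.
    sequence_length: u32,
    /// Assumed: generated (output) tokens per request.
    decode_length: u32,
}

impl RunShape {
    /// Tokens consumed during prefill, where each whole prompt is processed at once.
    fn prefill_tokens(&self) -> u32 {
        self.batch_size * self.sequence_length
    }

    /// Tokens produced during decode, where each request gains one token per step.
    fn decode_tokens(&self) -> u32 {
        self.batch_size * self.decode_length
    }
}

fn main() {
    let run = RunShape {
        batch_size: 4,
        sequence_length: 512,
        decode_length: 64,
    };
    println!("prefill tokens: {}", run.prefill_tokens());
    println!("decode tokens:  {}", run.decode_tokens());
}
```

Under this reading, prefill performance reflects how quickly the prompt is processed, while decode performance reflects how quickly new tokens are generated one step at a time.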
