TensorRT-LLM
benchmarking: docs reference steps that don't exist
System Info
- System independent. This issue is about the docs.
- In the benchmarking page there are multiple references to build.py scripts that don't exist, as far as I can tell:
  - python examples/llama/build.py
  - python examples/falcon/build.py
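For anyone who wants to confirm this against their own checkout, here is a minimal sketch that scans a docs page for `python <path>.py` commands and reports the ones that don't exist on disk. It uses only the Python standard library. The function name `find_missing_scripts` and the regex are my own illustration and not part of TensorRT-LLM, and the location of the docs page in the usage note is an assumption.

```python
import re
from pathlib import Path

# Matches commands such as "python examples/llama/build.py" in documentation text.
SCRIPT_REF = re.compile(r"python3?\s+([\w./-]+\.py)")


def find_missing_scripts(doc_text, repo_root):
    """Return script paths referenced in doc_text that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel_path in SCRIPT_REF.findall(doc_text):
        # Report each missing path once, in order of first appearance.
        if not (root / rel_path).is_file() and rel_path not in missing:
            missing.append(rel_path)
    return missing
```

Usage would look something like `find_missing_scripts(Path(doc_path).read_text(), ".")` run from the repository root, where `doc_path` points at wherever performance.md lives in your checkout (I have not hard-coded it because I'm not certain of the path).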
Context is that I'm trying to replicate your benchmarking results for Llama-2 on various machines.
I could instead follow the instructions in this readme https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/llama/README.md#llama but there are 2 potential issues with this:
- It's not clear how the args to the examples/llama/build.py script (e.g. --fp8_kv_cache) map to those in the convert_checkpoint.py flow (if at all), meaning I can't be certain I'll be replicating your results.
- It's possible the flow is different when building from source. I'm building from source in this case, as recommended, in order to run the benchmarks.
Who can help?
@Shixiaowei02 who made recent changes on that docs page: https://github.com/NVIDIA/TensorRT-LLM/commit/e093e484598655c4d7c6c91fab4fb0e55c56bbd3 @kaiyux re performance
Reproduction
n/a; info is above
Expected behavior
The instructions on the performance page should enable replication of the benchmarks.
actual behavior
as described
additional notes
edit: this has the label bug but it should be documentation
I've just seen the following pages, which suggest we can run some form of benchmarking (perhaps not exactly what you ran in the performance.md page) by using the pre-built version (i.e. not from source):
- https://github.com/NVIDIA/TensorRT-LLM/tree/main/benchmarks/python
- https://github.com/NVIDIA/TensorRT-LLM/blob/main/benchmarks/cpp/README.md
Is this the recommended way to replicate benchmarks now?
Perhaps this statement in the performance.md is out of date:
Additionally, the development container needs a copy of the source code to build the wheel and the benchmarking script
I am also trying to benchmark, and all I can see is lots of outdated files that are not functional: wheels, scripts, installs, ... What works is the README, though; the README you provided helps me. Thanks!