
Reference implementations of MLPerf™ inference benchmarks

331 inference issues (sorted by recently updated)

Hello. I have been trying to run at least one test for a long time, and I constantly get errors. Please record a video or give me a link so...

Check https://docs.mlcommons.org/inference/benchmarks but found: @ashwin @jdduke @codyaustun @badenh @koichishirahata

I'm getting this new error from the TensorRT BERT inference implementation on Grace Hopper 200. Here is the output:

```sh
Apptainer> cm run script "get tensorrt _dev" --tar_file=/my/path/TensorRT-10.1.0.27.Ubuntu-22.04.aarch64-gnu.cuda-12.4.tar.gz
INFO:root:* cm run...
```

Hello MLCommons team, I want to run the "Automated command to run the benchmark via MLCommons CM" (from the example: https://github.com/mlcommons/inference/tree/master/language/llama2-70b) with a different batch size, but I am getting...

I am running MLPerf Inference datacenter suite on a CPU only device following the instructions on the [documentation](https://docs.mlcommons.org/inference/benchmarks/language/llama2-70b/). The suggested sample size/query counts seem to take a very long time...
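The long runtimes usually come from LoadGen's minimum duration and minimum query counts, which for a quick local functional check can typically be lowered through a `user.conf` override. A minimal sketch, assuming the benchmark forwards `user.conf` to LoadGen and registers the model as `llama2-70b` (the keys follow the `model.scenario.key` format of `mlperf.conf`; the values here are illustrative and not valid for submission):

```
# user.conf -- illustrative overrides for a short local run (not compliant settings)
llama2-70b.Offline.min_query_count = 10
llama2-70b.Server.min_query_count = 10
*.*.min_duration = 60000   # milliseconds
```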

What is the difference between the `test` and `valid` modes listed in the [documentation](https://docs.mlcommons.org/inference/benchmarks/language/llama2-70b/)? Since this is intended to be a reproducible benchmark suite, would it be possible to add...

We need 4.1 seeds for loadgen and stable diffusion. The main inference group already has them; let's reuse their seeds. https://github.com/mlcommons/inference/pull/1736

The Loadgen Python API does not expose `server_num_issue_query_threads` in `mlperf::TestSettings`, so users cannot set this attribute directly from Python. I will create a PR to add it to the pybind module.
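Exposing the field is typically a one-line `def_readwrite` on the existing `TestSettings` class binding. A hedged sketch of what such a PR might add, assuming the binding lives in LoadGen's pybind module (the file layout and surrounding code are illustrative; only the field name comes from `mlperf::TestSettings`):

```cpp
// Sketch only: the real binding lives in LoadGen's python bindings source,
// where a py::class_<mlperf::TestSettings> already exists with other fields.
#include <pybind11/pybind11.h>
#include "test_settings.h"  // declares mlperf::TestSettings

namespace py = pybind11;

PYBIND11_MODULE(mlperf_loadgen, m) {
  py::class_<mlperf::TestSettings>(m, "TestSettings")
      .def(py::init<>())
      // New: allow Python callers to set the number of server
      // issue-query threads, mirroring the C++ struct field.
      .def_readwrite("server_num_issue_query_threads",
                     &mlperf::TestSettings::server_num_issue_query_threads);
}
```

With this in place, Python code could simply assign `settings.server_num_issue_query_threads = 4` on a `TestSettings` instance before starting the test.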

So, I now have 4 solid test scenarios, thanks to everyone's help here. They have all been tested in CPU mode. I am now switching to NVIDIA and the docker...

```sh
python3 -u main.py --scenario Offline --model-path ${CHECKPOINT_PATH} \
  --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 24576 \
  --dataset-path ${DATASET_PATH} --output-log-dir offline-logs \
  --dtype float32 --device cuda:0 2>&1 | tee offline_performance_log.log
```

Traceback (most recent call last):...