BERT Benchmark unable to execute successfully.

Open willamloo3192 opened this issue 1 year ago • 27 comments

Command:
cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=bert-99 --implementation=reference --backend=pytorch --device=cuda --scenario=Offline --adr.compiler.tags=gcc --target_qps=1 --category=edge --division=open --env.CM_VERIFY_SSL=false

OS Version: Ubuntu 22.04 with kernel 6.5.0
CUDA Version: 12.0
PyTorch version: 2.2.1

Error Message:
Loading BERT configs...
Loading PyTorch model...
Traceback (most recent call last):
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/run.py", line 150, in <module>
    main()
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/run.py", line 75, in main
    sut = get_pytorch_sut(args)
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/pytorch_SUT.py", line 111, in get_pytorch_sut
    return BERT_PyTorch_SUT(args)
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/pytorch_SUT.py", line 60, in __init__
    self.model.load_state_dict(torch.load(model_file), strict=False)
  File "/home/user/cm/lib/python3.10/site-packages/torch/serialization.py", line 1040, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/user/cm/lib/python3.10/site-packages/torch/serialization.py", line 1258, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x0a'.
Finished destroying SUT.

Traceback (most recent call last):
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/accuracy-squad.py", line 449, in <module>
    main()
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/accuracy-squad.py", line 433, in main
    results = load_loadgen_log(
  File "/home/user/CM/repos/local/cache/55013b57c45543c7/inference/language/bert/accuracy-squad.py", line 346, in load_loadgen_log
    with open(log_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/CM/repos/local/cache/06800ef908814fab/test_results/default-reference-gpu-pytorch-v2.2.1-default_config/bert-99/offline/accuracy/mlperf_log_accuracy.json'

CM error: Portable CM script failed (name = process-mlperf-accuracy, return code = 256)
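
A possible first diagnostic, offered as a sketch rather than a confirmed fix: the load key '\x0a' is a newline byte, which suggests the model file may begin with text (for example an error page or an incomplete download) instead of a PyTorch checkpoint. The helper names `classify_head` and `sniff_checkpoint` below are hypothetical and not part of CM or the MLPerf reference code. The check is a heuristic that only looks at the first bytes of the file.

```python
def classify_head(head: bytes) -> str:
    """Guess what kind of file these leading bytes belong to (heuristic only)."""
    if not head:
        return "empty"
    if head.startswith(b"PK\x03\x04"):
        # Zip archive: the container torch.save writes by default since PyTorch 1.6.
        return "zip"
    if head[:1] == b"\x80":
        # Binary pickle header (protocol 2+), as used by the legacy torch.save format.
        return "pickle"
    if head.lstrip()[:1] == b"<":
        # Looks like HTML/XML, e.g. an error page saved in place of the model.
        return "markup"
    try:
        head.decode("ascii")
        return "text"
    except UnicodeDecodeError:
        return "unknown"


def sniff_checkpoint(path: str, n: int = 64) -> str:
    """Read only the first n bytes of the file so large checkpoints stay cheap to inspect."""
    with open(path, "rb") as f:
        return classify_head(f.read(n))
```

Calling `sniff_checkpoint` on the model file that `pytorch_SUT.py` passes to `torch.load` (its path is not shown in the log above) would be expected to return "zip" or "pickle" for a usable checkpoint. A result of "markup", "text", or "empty" would point to a failed download. The second traceback looks like a consequence of the first: if the run aborted while loading the model, `mlperf_log_accuracy.json` would never have been written.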

willamloo3192, Feb 27 '24 18:02