
Submission checker calls warning as error

Open rakshithvasudev opened this issue 3 years ago • 1 comments

Hello There!

I noticed that the power checker classifies the power warnings as an error.

Here is the output of running the power checker on its own. It passes with no errors:

[rakshith@node081 Dell]$ python3 build/power-dev/compliance/check.py results/XR12_A2x1_TRT_MaxQ/resnet50/Server/performance/
[x] Check client sources checksum
[x] Check server sources checksum
[x] Check PTD commands and replies
[x] Check UUID
[x] Check session name
[x] Check time difference
[x] Check client server messages
[x] Check results checksum
[x] Check errors and warnings from PTD logs
        '07-28-2022 12:37:47.000: ERROR: USB.' in ptd_log.txt during testing stage but it is accepted. Treated as WARNING

[x] Check PTD configuration
[x] Check debug is disabled on server-side

All checks passed. Warnings encountered, check for audit!

[rakshith@node081 v2.1]$ python3.8 closed/Dell/build/inference/tools/submission/submission-checker.py --more-power-check --input . --submitter Dell
[2022-09-13 19:51:55,418 log_parser.py:50 INFO] Sucessfully loaded MLPerf log from closed/Dell/results/XR12_A2x1_TRT_MaxQ/resnet50/Server/accuracy/mlperf_log_detail.txt.
[2022-09-13 19:51:55,419 submission-checker.py:1619 INFO] Detected power logs for closed/Dell/results/XR12_A2x1_TRT_MaxQ/resnet50/Server
[2022-09-13 19:51:55,420 log_parser.py:50 INFO] Sucessfully loaded MLPerf log from closed/Dell/results/XR12_A2x1_TRT_MaxQ/resnet50/Server/performance/run_1/mlperf_log_detail.txt.
[2022-09-13 19:51:55,421 log_parser.py:50 INFO] Sucessfully loaded MLPerf log from closed/Dell/results/XR12_A2x1_TRT_MaxQ/resnet50/Server/performance/run_1/mlperf_log_detail.txt.
[2022-09-13 19:51:55,421 submission-checker.py:1153 INFO] Target latency: 15000000, Early Stopping Latency: 15000000, Scenario: Server
[2022-09-13 19:51:55,422 log_parser.py:50 INFO] Sucessfully loaded MLPerf log from closed/Dell/results/XR12_A2x1_TRT_MaxQ/resnet50/Server/performance/run_1/mlperf_log_detail.txt.
[ ] Check client sources checksum
        {'__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'client.py': '33ca4f26368777ac06e01f9567b714a4b8063886', 'lib/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'lib/client.py': '4c2b78fb4849a7e5b584ef792d82aaed20b17f57', 'lib/common.py': '624d0c0acc7c39aaff3674f0b99d6a09da53d1dc', 'lib/external/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'lib/external/ntplib.py': '4da8f970656505a40483206ef2b5d3dd5e81711d', 'lib/server.py': 'cda0cdfaee9bfa1249c64a7dd4f89b7bf1b279f0', 'lib/source_hashes.py': '60a2e02193209e8d392803326208d5466342da18', 'lib/summary.py': 'aa92f0a3f975eecd44d3c0cd0236342ccc9f941d', 'lib/time_sync.py': '3210db56eb0ff0df57bf4293dc4d4b03fffd46f1', 'server.py': 'c3f90f2f7eeb4db30727556d0c815ebc89b3d28b', 'tests/unit/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'tests/unit/test_server.py': '99ae15aef722f2000ee6ed1ae1523637bf1ae42b', 'tests/unit/test_source_hashes.py': '00468a2907583c593e6574a1f6b404e4651c221a'} do not exist in 'sources_checksums.json'

[x] Check server sources checksum
[x] Check PTD commands and replies
[x] Check UUID
[x] Check session name
[x] Check time difference
[x] Check client server messages
[x] Check results checksum
[x] Check errors and warnings from PTD logs
        '07-28-2022 12:37:47.000: ERROR: USB.' in ptd_log.txt during testing stage but it is accepted. Treated as WARNING

[x] Check PTD configuration
[x] Check debug is disabled on server-side

ERROR: Not all checks passed. Warnings encountered, check for audit!

Here are the branches and commits of my source files:

Inference repo: branch r2.1

power-dev repo: commit 4993e22bda0bca15ba32d5ef12d98db6a72a618a (this is the latest commit as of now)

This looks like a file hash issue, but I'm not sure what I'm missing. Any help here would be appreciated.
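For anyone comparing hashes by hand: the digests in the output above are 40 hex characters, which is consistent with SHA-1, and da39a3ee5e6b4b0d3255bfef95601890afd80709 is the SHA-1 of empty input (so the `__init__.py` entries look like empty files). Below is a minimal sketch for hashing a local source tree so the result can be compared against the dictionary printed by the checker. The function name `sha1_tree` is hypothetical, and it assumes the checker hashes raw file contents of `.py` files; the actual power-dev implementation may differ.

```python
import hashlib
from pathlib import Path


def sha1_tree(root):
    """Return {relative_posix_path: sha1_hex} for every .py file under root.

    Assumption: hashes are plain SHA-1 over the raw bytes of each file.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha1(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.py"))
    }


# SHA-1 of empty input; matches the '__init__.py' entries in the output above.
EMPTY_SHA1 = hashlib.sha1(b"").hexdigest()
```

Calling `sha1_tree` on the directory that holds `client.py` and `server.py` should give a dictionary of the same shape as the one in the failing check, if the assumption above holds.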

FYI: I also switched power-dev to branch r2.1, but it still didn't help.

rakshithvasudev avatar Sep 13 '22 20:09 rakshithvasudev

@rakshithvasudev The error you get is not the submission checker classifying a warning as an error. Notice that when the submission checker calls the power check, the following check doesn't pass:

[ ] Check client sources checksum
        {'__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'client.py': '33ca4f26368777ac06e01f9567b714a4b8063886', 'lib/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'lib/client.py': '4c2b78fb4849a7e5b584ef792d82aaed20b17f57', 'lib/common.py': '624d0c0acc7c39aaff3674f0b99d6a09da53d1dc', 'lib/external/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'lib/external/ntplib.py': '4da8f970656505a40483206ef2b5d3dd5e81711d', 'lib/server.py': 'cda0cdfaee9bfa1249c64a7dd4f89b7bf1b279f0', 'lib/source_hashes.py': '60a2e02193209e8d392803326208d5466342da18', 'lib/summary.py': 'aa92f0a3f975eecd44d3c0cd0236342ccc9f941d', 'lib/time_sync.py': '3210db56eb0ff0df57bf4293dc4d4b03fffd46f1', 'server.py': 'c3f90f2f7eeb4db30727556d0c815ebc89b3d28b', 'tests/unit/__init__.py': 'da39a3ee5e6b4b0d3255bfef95601890afd80709', 'tests/unit/test_server.py': '99ae15aef722f2000ee6ed1ae1523637bf1ae42b', 'tests/unit/test_source_hashes.py': '00468a2907583c593e6574a1f6b404e4651c221a'} do not exist in 'sources_checksums.json'

Can you check that the file compliance/sources_checksums.json contains the entry "lib/server.py": "cda0cdfaee9bfa1249c64a7dd4f89b7bf1b279f0" (the most recent change made to this file)? It should be there if you are using the inference r2.1 branch.
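If it helps, here is a rough sketch for checking that entry without opening the file by hand. It makes no assumption about how sources_checksums.json is laid out and simply walks the parsed JSON looking for a dictionary that holds the given key and value. The helper names `contains_pair` and `file_has_pair` are made up for this example and are not part of the power-dev tooling.

```python
import json


def contains_pair(node, key, value):
    """Return True if any dict nested inside parsed JSON has node[key] == value."""
    if isinstance(node, dict):
        if node.get(key) == value:
            return True
        return any(contains_pair(v, key, value) for v in node.values())
    if isinstance(node, list):
        return any(contains_pair(v, key, value) for v in node)
    return False


def file_has_pair(path, key, value):
    """Load a JSON file and search it for the given key/value pair."""
    with open(path, encoding="utf-8") as f:
        return contains_pair(json.load(f), key, value)
```

For example, `file_has_pair("compliance/sources_checksums.json", "lib/server.py", "cda0cdfaee9bfa1249c64a7dd4f89b7bf1b279f0")` returning False would suggest the checkout does not include the latest checksums.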

You can also try cloning the repository again with the following command:

git clone --recurse-submodules https://github.com/mlcommons/inference.git --branch r2.1 --depth 1

pgmpablo157321 avatar Sep 21 '22 21:09 pgmpablo157321