cmssw
`hltIntegrationTests` tests failing randomly in IBs
In recent IBs, there have been seemingly-random failures of the HLT-Validation tests, e.g.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-04-09-1100/slc7_amd64_gcc10/HLT_Integration_PIon_MC.log
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_3_X_2022-04-11-1100/slc7_amd64_gcc10/HLT_Integration_PIon_MC.log
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-04-11-2300/slc7_amd64_gcc10/HLT_Integration_PIon_MC.log
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-04-16-1100/slc7_amd64_gcc10/HLT_Integration_PIon_MC.log
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-04-16-1100/slc7_amd64_gcc10/HLT_Integration_PRef_MC.log
First occurrences of the issues were briefly discussed in
https://github.com/cms-sw/cmssw/pull/37304#issuecomment-1078234322
https://github.com/cms-sw/cmssw/pull/37524#issuecomment-1096751922
The cause of the issue is unclear. It does not seem to be reproducible locally, and it shows up in IBs at seemingly random times. TSG also routinely runs these executables manually (i.e. not via IBs) during development, and I have yet to encounter this issue locally.
The error messages point to a failure to correctly download the HLT config file from the database, via the `hltListPaths` call here and/or the `hltGetConfiguration` call here, as part of the executable `hltIntegrationTests`.
Examples:
- this error [1] suggests that the `hlt.py` dumped via `hltGetConfiguration` was not a valid python config;
- this error [2] suggests that downloading the menu inside `hltListPaths` failed, and then the ensuing call to `hltGetConfiguration` failed as well, causing an error from `hltCompareResults` (which read as input the invalid python config returned by `hltGetConfiguration`).
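As an illustration of the failure mode in [1] and [2], where `hlt.py` uses `cms` without ever defining it, here is a minimal sketch of a sanity check that could be run on a dumped configuration before handing it to other tools. The function name `defines_cms_alias` is hypothetical and not part of any HLT tool; the check only assumes that a complete dump binds the name `cms` through a top-level import (as in `import FWCore.ParameterSet.Config as cms`).

```python
import ast


def defines_cms_alias(source: str) -> bool:
    """Return True if `source` parses as Python and a top-level import binds the name `cms`.

    Hypothetical helper: it does not prove the configuration is complete,
    it only catches truncated dumps that lack the usual import header.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b as cms` binds `cms`; `import a.b` binds `a`.
                bound_name = alias.asname or alias.name.split(".")[0]
                if bound_name == "cms":
                    return True
    return False
```

A check of this kind would turn the late `NameError: name 'cms' is not defined` into an early, explicit "the download returned an incomplete menu" error.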
To my knowledge, the issue started to appear after the integration of #37283 (and its backport to `12_3_X`) [3]. That PR updated `hltListPaths`, making it maybe a bit slower; on the other hand, it did not update `hltGetConfiguration` in any way. Curiously, the error has so far shown up only for the PIon and PRef HLT menus, which are the two smallest menus being tested (so, their download from the database is generally much quicker compared to other menus).
Given its non-reproducibility, it's unclear (to me) how to tackle this.
Could this be somehow an issue related to how these tests are run in IBs? (and/or how the database is queried in that case? are there any timeouts of any kind?)
[1]
stty: standard input: Inappropriate ioctl for device
Will run 6 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_3_0/PIon/V67 --full --offline --mc --input file:../RelVal_Raw_PIon_MC.root --unprescale --process TEST20220416171904 --max-events 100 --globaltag=auto:run3_mc_PIon --type=PIon
Traceback (most recent call last):
File "/pool/condor/dir_150973/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-04-16-1100/bin/slc7_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02728/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-04-15-1100/bin/slc7_amd64_gcc10/edmConfigDump", line 25, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
Status_OnCPU
Status_OnGPU
HLTriggerFirstPath
HLT_Physics_v7
make: *** [.makefile:23: Status_OnGPU.done] Error 90
HLT_Random_v3
make: *** [.makefile:23: Status_OnCPU.done] Error 90
HLT_ZeroBias_v6
make: *** [.makefile:23: HLT_Physics_v7.done] Error 90
make: Target 'Status_OnCPU' not remade because of errors.
make: Target 'Status_OnGPU' not remade because of errors.
make: Target 'HLT_Physics_v7' not remade because of errors.
make: *** [.makefile:23: HLTriggerFirstPath.done] Error 90
make: Target 'HLTriggerFirstPath' not remade because of errors.
make: *** [.makefile:23: HLT_Random_v3.done] Error 90
make: Target 'HLT_Random_v3' not remade because of errors.
make: *** [.makefile:23: HLT_ZeroBias_v6.done] Error 90
make: Target 'HLT_ZeroBias_v6' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
[2]
stty: standard input: Inappropriate ioctl for device
Traceback (most recent call last):
File "/pool/condor/dir_150973/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-04-16-1100/bin/slc7_amd64_gcc10/hltListPaths", line 190, in <module>
paths = getPathList(config)
File "/pool/condor/dir_150973/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-04-16-1100/bin/slc7_amd64_gcc10/hltListPaths", line 32, in getPathList
raise Exception(f'query did not return a valid HLT menu:\n query="{cmdline}"')
Exception: query did not return a valid HLT menu:
query="hltConfigFromDB --run3 --v3 --configName /dev/CMSSW_12_3_0/PRef/V67 --noedsources --noes --noservices"
Will run 0 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_3_0/PRef/V67 --full --offline --mc --input file:../RelVal_Raw_PRef_MC.root --unprescale --process TEST20220416171920 --max-events 100 --globaltag=auto:run3_mc_PRef --type=PRef
Traceback (most recent call last):
File "/pool/condor/dir_150973/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-04-16-1100/bin/slc7_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02728/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-04-15-1100/bin/slc7_amd64_gcc10/edmConfigDump", line 25, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
make: Target 'all' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
[3] Reverting #37283 in full is not a good option, because that PR introduced functionalities needed to test the latest HLT menus.
A new Issue was created by @missirol Marino Missiroli.
@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign core, hlt
New categories assigned: core,hlt
@missirol,@Dr15Jones,@smuzaffar,@makortel,@Martin-Grunewald you have been requested to review this Pull request/Issue and eventually sign? Thanks
I remember seeing this kind of error (`NameError: name 'cms' is not defined`) recently in other tests too (though I was unable to find those now). I wonder if this could be e.g. a CVMFS issue on a worker node?
Just noting here another occurrence of the issue in `CMSSW_12_3_X_2022-04-25-2300`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_3_X_2022-04-25-2300/slc7_amd64_gcc10/runIB.log
02:28:48 hltIntegrationTests /dev/CMSSW_12_3_0/HIon/V72 -d HLT_Integration_HIon_MC -i file:../RelVal_Raw_HIon_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_HIon -x --type=HIon >& HLT_Integration_HIon_MC.log
2.097u 1.186s 0:05.99 54.5% 0+0k 2373016+976io 49958pf+0w
02:28:54 exit status: 1
02:28:54 hltIntegrationTests /dev/CMSSW_12_3_0/PIon/V72 -d HLT_Integration_PIon_MC -i file:../RelVal_Raw_PIon_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_PIon -x --type=PIon >& HLT_Integration_PIon_MC.log
2.131u 1.264s 0:10.79 31.4% 0+0k 4129304+1000io 21167pf+0w
02:29:05 exit status: 1
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_3_X_2022-04-25-2300/slc7_amd64_gcc10/HLT_Integration_HIon_MC.log
stty: standard input: Inappropriate ioctl for device
Traceback (most recent call last):
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltListPaths", line 190, in <module>
paths = getPathList(config)
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltListPaths", line 32, in getPathList
raise Exception(f'query did not return a valid HLT menu:\n query="{cmdline}"')
Exception: query did not return a valid HLT menu:
query="hltConfigFromDB --run3 --v3 --configName /dev/CMSSW_12_3_0/HIon/V72 --noedsources --noes --noservices"
Will run 0 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_3_0/HIon/V72 --full --offline --mc --input file:../RelVal_Raw_HIon_MC.root --unprescale --process TEST20220426022851 --max-events 100 --globaltag=auto:run3_mc_HIon --type=HIon
Traceback (most recent call last):
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02730/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_X_2022-04-24-0000/bin/slc7_amd64_gcc10/edmConfigDump", line 25, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
make: Target 'all' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_3_X_2022-04-25-2300/slc7_amd64_gcc10/HLT_Integration_PIon_MC.log
stty: standard input: Inappropriate ioctl for device
Traceback (most recent call last):
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltListPaths", line 190, in <module>
paths = getPathList(config)
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltListPaths", line 32, in getPathList
raise Exception(f'query did not return a valid HLT menu:\n query="{cmdline}"')
Exception: query did not return a valid HLT menu:
query="hltConfigFromDB --run3 --v3 --configName /dev/CMSSW_12_3_0/PIon/V72 --noedsources --noes --noservices"
Will run 0 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_3_0/PIon/V72 --full --offline --mc --input file:../RelVal_Raw_PIon_MC.root --unprescale --process TEST20220426022856 --max-events 100 --globaltag=auto:run3_mc_PIon --type=PIon
Traceback (most recent call last):
File "/pool/condor/dir_222511/jenkins/workspace/ib-run-HLT/CMSSW_12_3_X_2022-04-25-2300/bin/slc7_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02730/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_X_2022-04-24-0000/bin/slc7_amd64_gcc10/edmConfigDump", line 25, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
make: Target 'all' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
Another occurrence of the issue in `CMSSW_12_3_X_2022-04-27-1100`.
Errors are similar to https://github.com/cms-sw/cmssw/issues/37598#issuecomment-1110265799. Example:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_3_X_2022-04-27-1100/slc7_amd64_gcc10/runIB.log
18:27:12 hltIntegrationTests /dev/CMSSW_12_3_0/PIon/V72 -d HLT_Integration_PIon_MC -i file:../RelVal_Raw_PIon_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_PIon -x --type=PIon >& HLT_Integration_PIon_MC.log
11.007u 2.286s 0:38.04 34.9% 0+0k 1986120+920io 52369pf+0w
18:27:50 exit status: 1
18:27:50 hltIntegrationTests /dev/CMSSW_12_3_0/PRef/V72 -d HLT_Integration_PRef_MC -i file:../RelVal_Raw_PRef_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_PRef -x --type=PRef >& HLT_Integration_PRef_MC.log
1.996u 0.622s 0:03.61 72.2% 0+0k 524288+944io 1778pf+0w
18:27:54 exit status: 1
Another occurrence of the issue in `CMSSW_12_3_X_2022-05-02-2300`. Errors are similar to https://github.com/cms-sw/cmssw/issues/37598#issuecomment-1110265799.
In the last 10 days, the issue has continued to appear in `12_3_X` IBs, but not in `12_4_X` IBs (maybe it is just a coincidence). The HLT menus in those releases are the same. Is there anything different in how IBs run for `12_3_X` and `12_4_X`? (generic question, but I'm trying to figure out if something could explain the apparent lack of issues in recent `12_4_X` IBs)
Another occurrence of the issue in `CMSSW_12_4_X_2022-05-13-2300`. Errors are similar to https://github.com/cms-sw/cmssw/issues/37598#issuecomment-1110265799, but this time only for the `HIon` menu.
Is there anything different in how IBs run for `12_3_X` and `12_4_X`?
This latest failure was in `12_4_X` (`master`), suggesting that there might be no differences between `12_3_X` IBs and `12_4_X` IBs as far as this particular problem is concerned.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-05-13-2300/slc7_amd64_gcc10/runIB.log
[..]
03:01:32 hltIntegrationTests /dev/CMSSW_12_3_0/GRun/V79 -d HLT_Integration_GRun_MC -i file:../RelVal_Raw_GRun_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_GRun -x --type=GRun >& HLT_Integration_GRun_MC.log
25416.330u 6401.524s 3:37:51.43 243.4% 0+0k 1954842544+1449568io 8562759pf+0w
06:39:23 exit status: 0
06:39:23 hltIntegrationTests /dev/CMSSW_12_3_0/HIon/V79 -d HLT_Integration_HIon_MC -i file:../RelVal_Raw_HIon_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_HIon -x --type=HIon >& HLT_Integration_HIon_MC.log
109.351u 51.588s 2:45.39 97.3% 0+0k 30597312+12816io 315924pf+0w
06:42:09 exit status: 1
06:42:09 hltIntegrationTests /dev/CMSSW_12_3_0/PIon/V79 -d HLT_Integration_PIon_MC -i file:../RelVal_Raw_PIon_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_PIon -x --type=PIon >& HLT_Integration_PIon_MC.log
105.738u 15.689s 1:35.85 126.6% 0+0k 13181608+73944io 74766pf+0w
06:43:45 exit status: 0
06:43:45 hltIntegrationTests /dev/CMSSW_12_3_0/PRef/V79 -d HLT_Integration_PRef_MC -i file:../RelVal_Raw_PRef_MC.root -n 100 -j 4 --mc -x --globaltag=auto:run3_mc_PRef -x --type=PRef >& HLT_Integration_PRef_MC.log
2582.033u 907.994s 51:46.90 112.3% 0+0k 97426000+429728io 462013pf+0w
07:35:32 exit status: 0
[..]
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-05-13-2300/slc7_amd64_gcc10/HLT_Integration_HIon_MC.log
stty: standard input: Inappropriate ioctl for device
Will run 429 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_3_0/HIon/V79 --full --offline --mc --input file:../RelVal_Raw_HIon_MC.root --unprescale --process TEST20220514064003 --max-events 100 --globaltag=auto:run3_mc_HIon --type=HIon
Traceback (most recent call last):
File "/pool/condor/dir_18534/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-05-13-2300/bin/slc7_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc10/cms/cmssw-patch/CMSSW_12_4_X_2022-05-13-2300/bin/slc7_amd64_gcc10/edmConfigDump", line 26, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
[..]
Another occurrence of the issue in `CMSSW_12_4_X_2022-05-17-2300`.
This intermittent issue keeps appearing, so it might be useful to start thinking about a way to solve it via software (e.g. retrying the query).
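To make the "retrying the query" idea concrete, here is a minimal sketch of a retry wrapper around an external command. Everything in it is an assumption for illustration: the name `run_with_retries`, its parameters, and the choice of exponential backoff are hypothetical and are not taken from `hltListPaths` or `hltGetConfiguration`.

```python
import subprocess
import time


def run_with_retries(cmd, is_valid, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Run `cmd` until it exits with status 0 and `is_valid(stdout)` is true.

    Hypothetical helper: `is_valid` is a caller-supplied check on the output,
    and `sleep` is injectable so the waiting can be stubbed out in tests.
    """
    last_output = ""
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        last_output = proc.stdout
        if proc.returncode == 0 and is_valid(last_output):
            return last_output
        if attempt < attempts:
            # Wait longer after each failed attempt.
            sleep(delay)
            delay *= backoff
    raise RuntimeError(
        f"command did not return valid output after {attempts} attempts: {cmd!r}"
    )
```

With a wrapper like this, a query such as the `hltConfigFromDB` command shown in the logs above could be passed as `cmd`, with `is_valid` checking that the output actually looks like an HLT menu. Whether retries would help depends on the root cause, which is still unknown.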
Other occurrences of this issue in
- `CMSSW_12_5_X_2022-05-23-2300`
- `CMSSW_12_5_X_2022-05-27-1100`
The problem hasn't shown up in the IBs of the last ten days, or so.
I don't know why; I just wonder if anything related to the DB (and/or the queries to it) has changed.
As far as I can see, this problem has not re-appeared, so something must have improved. :)
The issue re-appeared in `CMSSW_12_4_X_2022-08-12-1100`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-08-12-1100/el8_amd64_gcc10/HLT_Integration_PIon_DATA.log
stty: 'standard input': Inappropriate ioctl for device
Will run 6 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_4_0/PIon/V94 --full --offline --data --input file:../RelVal_Raw_PIon_DATA.root --unprescale --process TEST20220812172737 --max-events 100 --globaltag=auto:run3_hlt_PIon --type=PIon
Traceback (most recent call last):
File "/pool/condor/dir_39524/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-08-12-1100/bin/el8_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02745/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-08-11-2300/bin/el8_amd64_gcc10/edmConfigDump", line 26, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
HLTriggerFirstPath
Status_OnGPU
HLT_Physics_v8
Status_OnCPU
make: *** [.makefile:23: HLT_Physics_v8.done] Error 90
make: *** [.makefile:23: HLTriggerFirstPath.done] Error 90
HLT_Random_v3
HLT_ZeroBias_v7
make: *** [.makefile:23: Status_OnCPU.done] Error 90
make: Target 'HLTriggerFirstPath' not remade because of errors.
make: Target 'Status_OnCPU' not remade because of errors.
make: Target 'HLT_Physics_v8' not remade because of errors.
make: *** [.makefile:23: Status_OnGPU.done] Error 90
make: Target 'Status_OnGPU' not remade because of errors.
make: *** [.makefile:23: HLT_Random_v3.done] Error 90
make: *** [.makefile:23: HLT_ZeroBias_v7.done] Error 90
make: Target 'HLT_Random_v3' not remade because of errors.
make: Target 'HLT_ZeroBias_v7' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-08-12-1100/el8_amd64_gcc10/HLT_Integration_PRef_DATA.log
stty: 'standard input': Inappropriate ioctl for device
Traceback (most recent call last):
File "/pool/condor/dir_39524/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-08-12-1100/bin/el8_amd64_gcc10/hltListPaths", line 190, in <module>
paths = getPathList(config)
File "/pool/condor/dir_39524/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-08-12-1100/bin/el8_amd64_gcc10/hltListPaths", line 32, in getPathList
raise Exception(f'query did not return a valid HLT menu:\n query="{cmdline}"')
Exception: query did not return a valid HLT menu:
query="hltConfigFromDB --run3 --v3 --configName /dev/CMSSW_12_4_0/PRef/V94 --noedsources --noes --noservices"
Will run 0 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_4_0/PRef/V94 --full --offline --data --input file:../RelVal_Raw_PRef_DATA.root --unprescale --process TEST20220812172745 --max-events 100 --globaltag=auto:run3_hlt_PRef --type=PRef
Traceback (most recent call last):
File "/pool/condor/dir_39524/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-08-12-1100/bin/el8_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 10, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02745/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-08-11-2300/bin/el8_amd64_gcc10/edmConfigDump", line 26, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 10, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
make: Target 'all' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
Another instance of the issue was in `CMSSW_12_5_X_2022-08-17-1100`. The errors are virtually identical to https://github.com/cms-sw/cmssw/issues/37598#issuecomment-1217514871.
Another instance of the issue was in `CMSSW_12_5_X_2022-08-24-2300`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_5_X_2022-08-24-2300/el8_amd64_gcc10/HLT_Integration_HIon_DATA.log
stty: 'standard input': Inappropriate ioctl for device
Traceback (most recent call last):
File "/pool/condor/dir_60795/jenkins/workspace/ib-run-HLT/CMSSW_12_5_X_2022-08-24-2300/bin/el8_amd64_gcc10/hltListPaths", line 190, in <module>
paths = getPathList(config)
File "/pool/condor/dir_60795/jenkins/workspace/ib-run-HLT/CMSSW_12_5_X_2022-08-24-2300/bin/el8_amd64_gcc10/hltListPaths", line 32, in getPathList
raise Exception(f'query did not return a valid HLT menu:\n query="{cmdline}"')
Exception: query did not return a valid HLT menu:
query="hltConfigFromDB --run3 --v3 --configName /dev/CMSSW_12_4_0/HIon/V110 --noedsources --noes --noservices"
Will run 0 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
HLT menu: hltGetConfiguration /dev/CMSSW_12_4_0/HIon/V110 --full --offline --data --input file:../RelVal_Raw_HIon_DATA.root --unprescale --process TEST20220825045934 --max-events 100 --globaltag=auto:run3_hlt_HIon --type=HIon
Traceback (most recent call last):
File "/pool/condor/dir_60795/jenkins/workspace/ib-run-HLT/CMSSW_12_5_X_2022-08-24-2300/bin/el8_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 5, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/week1/el8_amd64_gcc10/cms/cmssw/CMSSW_12_5_X_2022-08-24-2300/bin/el8_amd64_gcc10/edmConfigDump", line 26, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 5, in <module>
process.source = cms.Source( "PoolSource",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
make: Target 'all' not remade because of errors.
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
Another instance of the issue was in `CMSSW_12_5_X_2022-08-30-2300`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_5_X_2022-08-30-2300/el8_amd64_gcc10/HLT_Integration_PRef_MC.log
Another instance of this issue was in `CMSSW_12_4_X_2022-10-07-1100`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-10-07-1100/el8_amd64_gcc10/HLT_Integration_PIon_MC.log
Another instance of this issue was in `CMSSW_12_4_X_2022-10-11-1100`, albeit with a somewhat new error message [*].
(I know I sound like a broken record; I just mean to highlight that the issue persists; when there are fewer urgent matters, I will try to come up with a solution, e.g. https://github.com/cms-sw/cmssw/issues/39345#issuecomment-1244964477; ETA: EOY).
[*] https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-10-11-1100/el8_amd64_gcc10/HLT_Integration_GRun_MC.log
stty: 'standard input': Inappropriate ioctl for device
Will run 674 HLT paths over 100 events, with 4 jobs in parallel
Extracting full menu dump
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 1349, in getresponse
response.begin()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 285, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/util/retry.py", line 532, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-urllib3/1.26.6-504ee060441080cce4ff715292ff47ca/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 1349, in getresponse
response.begin()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 316, in begin
version, status, reason = self._read_status()
File "/cvmfs/cms-ib.cern.ch/week0/el8_amd64_gcc10/external/python3/3.9.6-67e5cf5b4952101922f1d4c8474baa39/lib/python3.9/http/client.py", line 285, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/pool/condor/dir_35162/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-10-11-1100/bin/el8_amd64_gcc10/hltGetConfiguration", line 251, in <module>
print(confdb.HLTProcess(config).dump())
File "/pool/condor/dir_35162/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-10-11-1100/python/HLTrigger/Configuration/Tools/confdb.py", line 53, in __init__
self.converter = OfflineConverter(version = self.config.menu.version, database = self.config.menu.database, proxy = self.config.proxy, proxyHost = self.config.proxy_host, proxyPort = self.config.proxy_port)
File "/pool/condor/dir_35162/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-10-11-1100/python/HLTrigger/Configuration/Tools/confdbOfflineConverter.py", line 131, in __init__
version_website = requests.get(self.baseUrl+"/../confdb.version").text
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/external/py3-requests/2.26.0-0d6433445dfa3a94b84d1ce98b51f46e/lib/python3.9/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
HLT menu: hltGetConfiguration /dev/CMSSW_12_4_0/GRun/V145 --full --offline --mc --input file:../RelVal_Raw_GRun_MC.root --unprescale --process TEST20221011111238 --max-events 100 --globaltag=auto:run3_mc_GRun --type=GRun
Traceback (most recent call last):
File "/pool/condor/dir_35162/jenkins/workspace/ib-run-HLT/CMSSW_12_4_X_2022-10-11-1100/bin/el8_amd64_gcc10/hltCheckPrescaleModules", line 25, in <module>
exec(open(name).read(), globals(), menu.__dict__)
File "<string>", line 4, in <module>
NameError: name 'cms' is not defined
Traceback (most recent call last):
File "/cvmfs/cms-ib.cern.ch/nweek-02754/el8_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-10-09-0000/bin/el8_amd64_gcc10/edmConfigDump", line 26, in <module>
loader.exec_module(mod)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "hlt.py", line 4, in <module>
process.hltTriggerSummaryAOD = cms.EDProducer( "TriggerSummaryProducerAOD",
NameError: name 'cms' is not defined
Preparing single-path configurations
Running...
full menu dump
make: *** [.makefile:17: hlt.done] Error 90
HLT_AK8PFJet360_TrimMass30_v20
Status_OnGPU
Status_OnCPU
HLTriggerFirstPath
make: *** [.makefile:23: Status_OnCPU.done] Error 90
HLT_AK8PFJet380_TrimMass30_v13
[..]
Comparing the results of running each path by itself with those from the full menu
ERROR: Execution of the full HLT menu failed.
Please check the contents of 'hlt.log' for details.
exit status: 1
done
Another instance of this issue was in `CMSSW_12_4_X_2022-11-01-1100`.
https://cmssdt.cern.ch/SDT/jenkins-artifacts/HLT-Validation/CMSSW_12_4_X_2022-11-01-1100/el8_amd64_gcc10/HLT_Integration_PIon_MC.log
+hlt
I will try to come up with a solution, e.g. https://github.com/cms-sw/cmssw/issues/39345#issuecomment-1244964477; ETA: EOY
#40004 and its backports have removed queries to `ConfDB` in IB tests. This should, by construction, remove occurrences of this issue for `12_4_X` and higher, so I'm signing this.
Having said that, the root cause of these failures (see also #39345) still escapes me. The symptom is a failure in downloading configurations from ConfDB (only some of them usually, during the same IB), which leads to invalid cfg files. The nodes running tests in IB don't have `/afs` access, so the ConfDB `.jar` files are downloaded locally, but it's unclear (to me) whether or not this is part of the issue. The code seems to account for the fact that multiple downloads can happen simultaneously, but I didn't try to stress-test this.
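For reference, one common way to make simultaneous downloads of the same file safe is to write to a temporary file in the target directory and then rename it into place, so that no reader ever sees a partially written file. The sketch below shows that generic pattern only; `install_atomically` and its `fetch` argument are hypothetical names, and this is not a description of what the ConfDB converter code actually does.

```python
import os
import tempfile


def install_atomically(dest, fetch):
    """Write the bytes returned by `fetch()` to `dest` via a temp file and a rename.

    Hypothetical helper: `fetch` is any callable returning the file content
    as bytes (e.g. a download). `os.replace` is atomic when source and
    destination are on the same filesystem, hence the temp file is created
    in the destination directory.
    """
    if os.path.exists(dest):
        return dest  # another process already installed the file
    directory = os.path.dirname(os.path.abspath(dest))
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(fetch())
        os.replace(tmp, dest)
    except BaseException:
        # Do not leave a partial file behind if the download fails.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return dest
```

If two processes race, both may download, but each rename installs a complete file, so a concurrent reader should not pick up a truncated one.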
+core
Although in the end there wasn't much (anything?) for core.
@cmsbuild, please close
This issue is fully signed and ready to be closed.