Todor Ivanov
Hi @amaltaro @anpicci @vkuznet, with our latest commits to https://github.com/dmwm/CMSKubernetes/pull/1466 we (me, @LinaresToine and @germanfgv) managed to initialize a T0 agent alma9 machine properly and test this deployment process...
We are in the process of final tests here. I gave more details in my comment on the PR, in which I asked @amaltaro and @anpicci for a final review: https://github.com/dmwm/CMSKubernetes/pull/1466#issuecomment-2107090214
@vkuznet @amaltaro I want to make one really important remark, which we should always keep in mind while working on this. * Many of these logs contain **sensitive information**....
Hi @germanfgv @jhonatanamado @amaltaro, Let me see if I can grasp the goal of this issue correctly. Here is one T0 `PromptReco` request `PromptReco_Run349840_Cosmics_Tier0_REPLAY_2022_ID220511165314_v429_220511_1703`, which has failed jobs in it....
Hi [German](@germanfgv), while working on that and trying to observe the issue with an agent in production, I found that this feature seems to be working well in the production system....
Hi @germanfgv, while working with the above-mentioned workflow, I saw that it is indeed missing the lumi lists for broken jobs in `t0reqmon`: [1] But it seems to have them...
Here follow a few more observations and one helpful document added to the troubleshooting wiki of WMCore: [1] While working with the T0 workflows I also checked the Production Validation and I...
> Exactly, most of the time we get no lumis info in WMStats, but sometimes we get these lists of [0] that don't offer much info. I am starting to...
Just for logging purposes: I have double checked all `couch views` and `couchapps` in order to prove there is no problem with how we fetch the information related to job...