WMCore
WMCore training plans for 2024-Q1/Q2
Impact of the new feature
WMCore in general
Is your feature request related to a problem? Please describe.
Outreach and knowledge sharing, so that the team is uniformly capable of working across all areas of the workload management realm. This is aimed not only at the team, but also at any CompOps members and graduate enthusiasts!
Describe the solution you'd like
The solution is to provide documentation on each of the topics we have identified and present it over Zoom to the core team and anyone else interested in CMS Workload Management. If the documentation already exists, we only need to review it and make sure it is good enough. If it does not exist, we should create it and base the presentation on it (no need to create separate slides). The presentations are to be recorded to facilitate future training.
For the moment, we have identified the following 12 topics that deserve a dedicated training session:
- [x] Data flow/acquisition in WM
- [x] Services and job logs
- [x] Monitoring used - directly and indirectly - in WM
- [ ] WMAgent deployment and draining
- [ ] Standard WM debugging
- [ ] Interactive grid job execution
- [x] Microservices
- [ ] Request Manager and its CherryPy threads
- [ ] WMStats and its CherryPy threads
- [ ] Global WorkQueue and its CherryPy threads
- [ ] Advanced Debugging I (topics TBD)
- [ ] Advanced Debugging II (topics TBD)
Describe alternatives you've considered
None
Additional context
We plan to allocate two people to each of these subjects, one of them being considerably new to the project. That said, it would potentially be a mix of Alan/Todor/Kenyi with Dennis/Valentin/Andrea.
Considering that @vkuznet mentioned CouchDB and MongoDB debugging in the Q1/2024 straw-person plan, it may be worth having a training session on interacting with and using CouchDB and MongoDB, if that is useful for this part of the debugging.
Copied from @vkuznet's comment in the Google Doc.
We may easily add the following:
- MS debugging
- low/high load in the system, i.e. why there is a low number of workflows and how to address it; whether the system is overstressed by a large number of workflows and how to ease the situation
- CouchDB debugging
- MariaDB debugging and its queries
- WMA error codes: explanation and how to navigate them
- How to submit a test workflow and debug its execution
- WMA state transition debugging
This list can be expanded further by looking into the issues that data-ops/PnR posted on MM and how they were resolved. I only posted a few that came up last year.
@vkuznet and all: I think the above is a good list, but I'm worried that it's too fine-grained to dedicate one session to each bullet point. As a compromise, I've added two "Advanced Debugging" sessions to the list above, topics TBD later. We can always expand further, but I think it would be nice to set the scope at something achievable, complete it, and then evaluate the next target.
Another suggestion would be to decide which tool set to use for the tutorials. We have had so many wrappers around the plain HTTP protocol that many people believe they cannot work with service APIs without them. A typical example is DBS, where many users rely on dbsClient although it is much simpler to use curl and avoid complicated dependencies. That said, may I suggest a dedicated topic on tools and their usage. My vote goes to using curl for almost everything in the CMS WEB/Services universe and only complementing it with wrapper tools. This suggestion also applies to working with CouchDB (and its views) and possibly MongoDB (if we run the server in HTTP mode). It will also teach users to understand the protocol and headers, and to work across many OSes rather than with a specific tool on a specific OS.
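To illustrate how little is needed to talk to an HTTP service without wrapper tools, here is a minimal sketch using only the Python standard library. The base URL, the `datasets` path and the `dataset` parameter are hypothetical placeholders, not a real CMS endpoint or API, and the optional client certificate handling in `send` is an assumption about how a protected service might be reached. The `as_curl` helper prints the equivalent curl command, so the same request can be replayed from any shell.

```python
import json
import shlex
import ssl
import urllib.parse
import urllib.request

# Hypothetical endpoint, for illustration only; not a real CMS service URL.
BASE_URL = "https://example.invalid/service"


def build_get_request(base_url, path, params=None):
    """Build (without sending) a GET request that asks for a JSON reply."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if params:
        # urlencode percent-encodes reserved characters such as "/".
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )


def as_curl(request):
    """Render the request as an equivalent, shell-quoted curl command."""
    parts = ["curl", "-X", request.get_method()]
    for name, value in request.header_items():
        parts += ["-H", f"{name}: {value}"]
    parts.append(request.full_url)
    return " ".join(shlex.quote(part) for part in parts)


def send(request, cert=None, key=None, timeout=30):
    """Send the request and return (status, headers, decoded JSON body).

    cert/key are optional paths to a client certificate and key
    (an assumption: only needed if the service requires one).
    """
    context = ssl.create_default_context()
    if cert:
        context.load_cert_chain(certfile=cert, keyfile=key)
    with urllib.request.urlopen(request, context=context, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
        return response.status, dict(response.headers.items()), body


if __name__ == "__main__":
    # Only builds and prints the request; nothing is sent over the network.
    req = build_get_request(BASE_URL, "datasets", {"dataset": "/A/B/C"})
    print(as_curl(req))
```

Building the request separately from sending it keeps the URL and headers visible for inspection, which is the point of working at the protocol level.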
Even though I agree with using curl as a general HTTP tool for interacting with (external) services, I think this should be left to the discretion of the person carrying out the training. There is no single right tool or right way of debugging things, and each person can and should be free to do it as they see fit.
I will soon start creating WMCore issues for each of those 12 tasks/topics that we have identified so far.
Alan, the main difference is the choice of OS and toolset. Using the CMS software stack is not trivial: e.g. even if python is installed on a system, that does not mean you can easily use CMS tools with it, since they require additional, quite complex, setup. With CMS tools you are always stuck with the same tools, which a priori are not required for working with any HTTP service; with curl you are always free in your choice of OS and do not need CMS software per se. As such, you can pick any OS and use curl, wget, or any other HTTP-based tool to debug CMS services. And, as I said, you gain a better understanding of the protocol, headers, etc. But I agree it is the choice of the person who will carry out the training. I only provided my feedback rather than asking to enforce it.
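As a concrete case of debugging with nothing but plain HTTP, a CouchDB view can be queried through its documented URL layout, `/{db}/_design/{ddoc}/_view/{view}`, where query options such as `key`, `limit` and `reduce` are passed as JSON-encoded values. The sketch below only builds the URL and parses a reply, so it runs on any OS with a stock Python; the server address, database, design document and view names are hypothetical and do not refer to an actual WMCore database.

```python
import json
import urllib.parse


def couch_view_url(server, db, design, view, **options):
    """Build the URL for querying a CouchDB view.

    View options (key, limit, reduce, ...) must be JSON-encoded,
    e.g. a string key is sent with its surrounding double quotes.
    """
    segments = (db, "_design", design, "_view", view)
    path = "/".join(urllib.parse.quote(segment, safe="") for segment in segments)
    url = f"{server.rstrip('/')}/{path}"
    # Sort the options so the resulting URL is reproducible.
    query = urllib.parse.urlencode(
        {name: json.dumps(value) for name, value in sorted(options.items())}
    )
    return f"{url}?{query}" if query else url


def rows_from_response(body):
    """Extract (key, value) pairs from the JSON text of a view reply."""
    return [(row["key"], row["value"]) for row in json.loads(body).get("rows", [])]


if __name__ == "__main__":
    # Hypothetical server, database, design document and view names.
    url = couch_view_url(
        "http://localhost:5984", "mydb", "MyDesign", "byStatus",
        key="running", limit=5, reduce=False,
    )
    print(url)  # can be passed as-is to curl, wget or any other HTTP client

    # Hand-written sample in the shape of a CouchDB view reply.
    sample = '{"total_rows": 2, "offset": 0, "rows": [{"id": "a", "key": "running", "value": 1}]}'
    print(rows_from_response(sample))
```

The printed URL can be passed to curl, wget or a browser unchanged, which makes it easy to compare what a wrapper tool does with what actually goes over the wire.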