Reducing the memory consumption of the `PushJobAgent`
The objective is to enhance the exploitation of HPCs with no external connectivity in DIRAC. The current workflow is limited:
- The `PushJobAgent` only works if you use the `dirac-jobexec` executable.
- The `PushJobAgent` supports a very limited number of jobs in parallel (~150 jobs would consume ~50GB of memory on your DIRAC server).
I would like to greatly reduce these limitations by deploying a series of PRs:
- [x] split the `JobWrapper.execute()` method into 3 sub-methods (`preProcess`, `process`, `postProcess`) to better isolate operations involving communications with the external (DIRAC) services from the payload itself. (https://github.com/DIRACGrid/DIRAC/pull/7460)
- [x] introduce the `JobWrapperOfflineTemplate` that solely executes the `process` method. (https://github.com/DIRACGrid/DIRAC/pull/7529)
- [ ] adapt the `PushJobAgent` to these changes so that it submits a `JobWrapperOfflineTemplate` directly to a remote CE. For a while, both the "traditional" and the new approaches will be supported. (https://github.com/DIRACGrid/DIRAC/pull/7587)
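
To illustrate the idea behind the split, here is a minimal sketch in plain Python. It is not the actual DIRAC code: `SketchJobWrapper`, `run_offline`, and the data passed between the steps are hypothetical names made up for this example. Only the method names `preProcess`, `process`, `postProcess`, and `execute` come from the list above. The point is that `process` touches nothing but the payload, so it could run on a node with no external connectivity while the other two steps stay on the server.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class SketchJobWrapper:
    """Hypothetical stand-in used for illustration; not the DIRAC JobWrapper."""

    payload: Callable[..., Any]
    log: List[str] = field(default_factory=list)

    def preProcess(self) -> Dict[str, Any]:
        # Step that would need connectivity to external services
        # (assumption: e.g. resolving inputs, reporting status).
        self.log.append("preProcess")
        return {"args": (2, 3)}

    def process(self, prepared: Dict[str, Any]) -> Dict[str, Any]:
        # Runs the payload only; no calls to external services here.
        self.log.append("process")
        return {"result": self.payload(*prepared["args"])}

    def postProcess(self, outcome: Dict[str, Any]) -> Any:
        # Step that would need connectivity again
        # (assumption: e.g. uploading outputs, final status report).
        self.log.append("postProcess")
        return outcome["result"]

    def execute(self) -> Any:
        # "Traditional" mode: all three steps run in the same place.
        return self.postProcess(self.process(self.preProcess()))


def run_offline(wrapper: SketchJobWrapper) -> Any:
    # "Offline" mode, simulated in one process for the sketch:
    prepared = wrapper.preProcess()      # would run on the DIRAC server
    outcome = wrapper.process(prepared)  # would run on the remote CE
    return wrapper.postProcess(outcome)  # would run on the DIRAC server
```

Both paths give the same result for the same payload, which is what makes it possible to support the "traditional" and the new approach side by side for a while.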
https://github.com/DIRACGrid/DIRAC/pull/7422 contains an overview of the full picture, if you are interested (it will not be merged).