aldbr

Results: 22 comments by aldbr

https://mattermost.web.cern.ch/cernvm/pl/kgtpiz81htd7pk45xa7jjq4nie

I ran an analysis a few months ago with the Jupyter notebook I created here (I needed to tweak it a bit): https://zenodo.org/records/5647834. I got the following results: ![image](https://github.com/DIRACGrid/DIRAC/assets/19813740/4a8ae814-1e42-4d9c-a417-4333dedced6c) By...

Well, that would be much simpler, I have to admit. I will give it a try. If this works, we can go for that for `HTCondorComputingElement` and `LocalCE + Condor`....

Another solution for the `LocalCE+ParallelLibrary` could consist of embedding the content of the executable in the parallel library executable. For instance, the `srun` wrapper (a `ParallelLibrary` class) could contain: ```...

Update: IIRC, the main "issue" is still the usage of the local scheduler in `HTCondorCE` here.
- 3 years ago, I proposed to stop using it in https://github.com/DIRACGrid/DIRAC/pull/5137 but then...

> (C)PUTimeLeft is the real time allowed in the queue. unit: real seconds. In case there's only one payload (so, no filling mode) then this is equivalent to WallClockTime (see...

So, if I try to sum up what we said: to get the right job to run on a Worker Node, we need:

**Time**: Time left estimation in real seconds...

**Update** The original issue was related to the use of Slurm, which is a batch system only based on wallclock time, in multi-core environments. We proposed a `SlurmResourceUsage` (https://github.com/DIRACGrid/DIRAC/pull/4482, https://github.com/DIRACGrid/DIRAC/pull/4673)...
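
To illustrate why a wallclock-only batch system needs care in multi-core allocations, here is a minimal sketch. It is not the `SlurmResourceUsage` code from the PRs above: the function names are hypothetical, and the idea that the CPU budget equals the remaining wallclock time multiplied by the number of allocated cores is an assumption made for illustration only.

```python
def wallclock_time_left(time_limit_s: int, time_used_s: int) -> int:
    """Remaining wallclock seconds in the allocation (never negative)."""
    return max(time_limit_s - time_used_s, 0)


def cpu_time_left(time_limit_s: int, time_used_s: int, n_cores: int) -> int:
    """Aggregate CPU-second budget for a multi-core allocation.

    Assumption (hypothetical): every allocated core stays usable until the
    wallclock limit, so the budget is remaining wallclock time * cores.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return wallclock_time_left(time_limit_s, time_used_s) * n_cores


if __name__ == "__main__":
    # A 1-hour allocation on 4 cores, 10 minutes already consumed.
    print(cpu_time_left(time_limit_s=3600, time_used_s=600, n_cores=4))
```

With a single core the two values coincide, which may be why the distinction only surfaces in multi-core environments.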

I am closing this issue as we are going to move to DiracX (with CWL at some point). We will leverage Pydantic models and this will "force" us to clearly...