Improving regression parallelism
Except for the bootstraps, most directories have phases where only a few low-memory jobs are running, and those phases could usefully be folded together. However, we still want to be able to limit oversubscription of bootstraps and to keep the output in a consistent order for easy comparison, which suggests changing the Holmake logic. (Do we want to try to upstream those changes, fork Holmake, or upstream a refactoring that turns the actual build logic into a library and fork only the scheduler/display code?)
I'm thinking something along the lines of:
- each job annotated with a memory limit, which is enforced automatically with --maxheap
- scheduler only runs jobs simultaneously when the sum of their memory limits fits within a global memory limit
- the regression.cgi page displays a fixed preorder traversal of all Holmake build rules, possibly grouped by directory, and for each rule shows the current status (waiting for dependencies, waiting for CPU/MEM slots, running for X time and memory, succeeded/failed after X time and memory)
- nice to have: create cgroups for more accurate memory accounting
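
To make the admission rule concrete, here is a rough sketch in Python (not Holmake's actual code or data structures). The `Job` record, the `startable` and `status` helpers, and the job names and sizes in the usage note are all made up for illustration. It assumes each job carries a single memory limit in MB and that jobs are kept in one fixed list order, which would also serve as the display order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    mem_mb: int        # hypothetical per-job limit, e.g. what --maxheap would be set to
    deps: tuple = ()   # names of jobs that must finish first


def startable(jobs, done, running, global_mem_mb, max_jobs):
    """Return names of jobs to start now, scanning in the fixed list order."""
    by_name = {j.name: j for j in jobs}
    used = sum(by_name[n].mem_mb for n in running)   # memory already reserved
    slots = max_jobs - len(running)                  # free CPU slots
    picked = []
    for job in jobs:
        if slots <= 0:
            break
        if job.name in done or job.name in running:
            continue
        if not all(d in done for d in job.deps):
            continue                                 # waiting for dependencies
        if used + job.mem_mb > global_mem_mb:
            continue                                 # waiting for a MEM slot
        picked.append(job.name)
        used += job.mem_mb
        slots -= 1
    return picked


def status(job, done, running):
    """Status string for the display (failure handling omitted)."""
    if job.name in done:
        return "succeeded"
    if job.name in running:
        return "running"
    if not all(d in done for d in job.deps):
        return "waiting for dependencies"
    return "waiting for CPU/MEM slots"
```

For example, with hypothetical jobs `small_a` (2000 MB), `small_b` (1000 MB), `bootstrap` (12000 MB, depends on `small_a`) and `tests` (1000 MB, depends on `small_b`), a 16000 MB global limit lets `tests` start alongside a running `bootstrap`, while a 12500 MB limit leaves it waiting for a slot. Note that a smaller job later in the list can be started ahead of a larger one that does not currently fit; whether that is desirable, or whether large jobs should get priority to avoid starvation, is an open design choice.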