flux-sched
Need elastic job aware match policy and scoring
The scheduler must be able to match resources allocated by elastic jobs. However, the scoring mechanism should prefer free resources to those allocated to elastic jobs. This new functionality will need metadata proposed in issue #551.
This issue needs to be addressed in the same PR as issue #564.
A `dom_finish_vtx` must be written so that, when the vertex is matched and is allocated to an elastic job, the vertex is added to a new data structure in `scoring_api_t` that associates ~~vertices to elastic job IDs~~ elastic job IDs to vertices.
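As a rough illustration of the bookkeeping described above, here is a minimal sketch of a container that maps elastic job IDs to matched vertices. Everything in it is an assumption made for illustration: `elastic_matches_t`, `vtx_t`, `jobid_t`, and the method names are hypothetical and are not taken from the flux-sched code base, where the real vertex and job ID types differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins; the real flux-sched types differ.
using vtx_t = std::uint64_t;    // vertex descriptor
using jobid_t = std::int64_t;   // elastic job ID

// Sketch of the bookkeeping that could live alongside scoring_api_t.
class elastic_matches_t {
public:
    // Intended to be called from a dom_finish_vtx-like hook when a matched
    // vertex is currently allocated to an elastic job.
    void add (jobid_t job, vtx_t v)
    {
        m_by_job[job].push_back (v);
    }

    // Vertices recorded so far for one elastic job (empty if none).
    const std::vector<vtx_t> &vertices (jobid_t job) const
    {
        static const std::vector<vtx_t> empty;
        auto it = m_by_job.find (job);
        return it == m_by_job.end () ? empty : it->second;
    }

    // Number of distinct elastic jobs with at least one recorded vertex.
    std::size_t job_count () const
    {
        return m_by_job.size ();
    }

private:
    std::map<jobid_t, std::vector<vtx_t>> m_by_job;
};
```

Keying by job ID rather than by vertex means a later pass can walk each elastic job once and collect all of its matched vertices together.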
A `dom_finish_graph` is needed that will submit each matched and high-scoring subgraph corresponding to part of an allocated elastic job to its corresponding child as a "shrink" job. Error handling and scoring will be tricky, especially in the future where we want to allow the children to only partially honor a parent's "shrink" request.
An important and open question is how to score a subgraph corresponding to part of an allocated elastic job lower than if it were free. Also, we want to strongly prefer free resources to those allocated to an elastic job. This is because the shrink demand/request may take a while to return from the child.
@dongahn do you have any comments or suggestions along these lines?
You should be able to give a higher number if the resource vertex doesn't have an allocation.
Maybe you can use a large constant if the vertex is free and add it to the other policy score. This way, free resources will be selected first and elastically allocated resources will be considered next, yet other policies (e.g., preferring resources with high IDs) can still take effect as well.
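A minimal sketch of this suggestion, assuming the base policy produces non-negative integer scores with a known upper bound. `FREE_BONUS`, `elastic_aware_score`, and the bound itself are hypothetical names and values chosen for illustration, not part of the flux-sched API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: must exceed any score the base policy can produce.
constexpr std::int64_t FREE_BONUS = 1000000;

// base_score: score from the existing policy (for example, a policy that
//             prefers high IDs might use the resource ID).
// is_free:    true if the vertex has no allocation at all.
inline std::int64_t elastic_aware_score (std::int64_t base_score, bool is_free)
{
    // The bonus only dominates if the base score stays below it.
    assert (base_score >= 0 && base_score < FREE_BONUS);
    return base_score + (is_free ? FREE_BONUS : 0);
}
```

Under these assumptions every free vertex outscores every elastically allocated one, while ties within each group are still broken by the base policy score.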
@tpatki has hands-on experience and may be able to help you with this.