Add execution order controls

Open jwhite242 opened this issue 1 year ago • 5 comments

Add machinery for controlling execution order/priority of study steps

  • Adds weights to the graph to enable selecting between depth first and breadth first (current production mode) execution order.
  • Adds a new execution block that exposes execution-order controls and provides hooks for future use of step metadata to control step weights
  • Changes internal machinery to use a PriorityQueue to store the ready steps
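As a rough illustration of the idea in the list above (not Maestro's actual implementation), the sketch below keeps ready steps in a `PriorityQueue` and switches between breadth-first and depth-first order purely by changing the weight. The toy `DAG`, the `traversal_order` function, and the weight rule are all assumptions made for this example; it also ignores the real readiness rule that a step must wait for all of its parents to finish.

```python
import itertools
from queue import PriorityQueue

# Hypothetical toy DAG (step name -> child steps); not taken from Maestro.
DAG = {
    "root": ["a", "b"],
    "a": ["a1"],
    "b": ["b1"],
    "a1": [],
    "b1": [],
}


def traversal_order(dag, start, mode="breadth-first"):
    """Return the order in which steps leave the ready queue."""
    counter = itertools.count()  # tie-breaker: equal weights keep insertion order
    ready = PriorityQueue()
    ready.put((0, next(counter), start, 0))
    seen = {start}
    order = []
    while not ready.empty():
        _, _, step, depth = ready.get()
        order.append(step)
        for child in dag[step]:
            if child in seen:
                continue
            seen.add(child)
            child_depth = depth + 1
            # Lowest weight is served first: shallow steps win in
            # breadth-first mode, deep steps win in depth-first mode.
            weight = child_depth if mode == "breadth-first" else -child_depth
            ready.put((weight, next(counter), child, child_depth))
    return order


print(traversal_order(DAG, "root", "breadth-first"))
print(traversal_order(DAG, "root", "depth-first"))
```

The only difference between the two modes is the sign of the weight, which is what makes a weighted queue a convenient place to hang other priority rules later.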

jwhite242 avatar Jul 18 '23 17:07 jwhite242

@jwhite242 -- Is this a dead end at this point? We're assessing Maestro and it's looking like DFS would be good on our end too. Wondering just in case we need to revisit.

FrankD412 avatar Nov 13 '23 22:11 FrankD412

No, I just got derailed by other things for a bit and am ramping back up on this now. The more general expression-based priorities are going to be pretty helpful. The major use case is getting big/long-running variants of steps running sooner, so smaller ones can be churned through within the throttle limit alongside them for improved throughput.
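A minimal sketch of that use case, under stated assumptions: `ReadyStep`, its `procs` and `walltime` fields, and `next_to_launch` are hypothetical names for this example, and `procs * walltime` stands in for whatever expression a user would actually supply.

```python
import heapq
from dataclasses import dataclass


@dataclass
class ReadyStep:
    # Hypothetical record; field names are for illustration only.
    name: str
    procs: int
    walltime: float  # hours


def next_to_launch(ready, free_slots):
    """Pick up to free_slots steps, largest procs * walltime first."""
    # heapq is a min-heap, so negate the cost; the index breaks ties
    # in favor of the step that became ready first.
    heap = [(-(s.procs * s.walltime), i, s.name) for i, s in enumerate(ready)]
    heapq.heapify(heap)
    picked = []
    while heap and len(picked) < free_slots:
        _, _, name = heapq.heappop(heap)
        picked.append(name)
    return picked


ready = [
    ReadyStep("small-1", procs=1, walltime=0.5),
    ReadyStep("big", procs=64, walltime=12.0),
    ReadyStep("small-2", procs=1, walltime=0.5),
]
print(next_to_launch(ready, free_slots=2))
```

With two free slots under the throttle, the big step is submitted first and one small step fills the remaining slot.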

Thinking more on the protocol question: I'm on the fence about whether we should just use abstract base classes and tie info to those, i.e. per-step overrides of expressions. But I'll play with both and see how they feel.

jwhite242 avatar Feb 17 '24 01:02 jwhite242

@FrankD412, @bgunnar5, @jsemler I think this is finally ready for another pass/real review. One interesting question left (beyond any implementation issues/comments) is what to do about the spec. I refactored it to be a list so it's more clearly ordered for users, but maybe it'd make sense to contain this list in a subkey (priority_expressions or something) instead of at the root of the execution block? I don't have anything else in mind for this block yet, but the key would be more future-proof in case we do think of something. (I know the docs are slightly out of sync, pending this subkey-or-not question.)

jwhite242 avatar Mar 14 '24 03:03 jwhite242

Actually, just had another thought that might fit nicer: expand it and make the value more of a 'oneOf' type, so each entry takes either value or expression. That makes it clearer that there are two types and avoids greedy parsing on our end to figure out which one it is.

execution:
  priority:
    - name:
      description: # optional, but encouraged... can make built-ins dump the code's internal description in the reserialized spec
      value: # use this for built-ins with string keys to select (e.g. current 'step-order')
    - name:
      description:
      expression: # use this for the eventual string based expression compilation
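One way the 'oneOf' rule above could be checked after the spec is loaded, as a sketch only: `classify_priority_entry` is a hypothetical helper, and treating `name` as required is an assumption of this example rather than something settled in the thread.

```python
def classify_priority_entry(entry):
    """Return 'value' or 'expression' for one priority entry.

    Raises ValueError unless exactly one of the two keys is present.
    """
    # Assumption for this sketch: every entry carries a non-empty name.
    if not entry.get("name"):
        raise ValueError("priority entry needs a name")
    present = [key for key in ("value", "expression") if key in entry]
    if len(present) != 1:
        raise ValueError(
            f"priority entry {entry['name']!r} must set exactly one of "
            f"'value' or 'expression', found {present}"
        )
    return present[0]


print(classify_priority_entry({"name": "order", "value": "step-order"}))
print(classify_priority_entry({"name": "cost", "expression": "step.procs*step.walltime"}))
```

Because the key itself says which kind of entry it is, no guessing from the string contents is needed.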

jwhite242 avatar Mar 14 '24 04:03 jwhite242

Continued tweaking/iteration, with an eye toward making this amenable to a mix of built-in and user-provided things (think plugins for reusable expressions, checkable via the dependencies machinery):

execution:
  priority:
    - prioritizer_id: step_order  # built-in dag traversal order method
      args:
        - step_order: 'depth-first'
        - ...
    - prioritizer_id: expression  # the built-in expression prioritizer
      expression: step.procs*step.walltime ....

Think of the prioritizer_id (or similar name) as akin to the key used to identify script adapters: a standardized place to register these functions, so we can tag plugin-installed things in a way that makes error messages helpful. When a spec is shared, Maestro can tell the recipient that they're missing a plugin somebody was using. I still like the idea of descriptions too, though I'm not sure about making them mandatory, given that the built-in ones can have that set internally and just serialized to the spec in the workspace.
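A hedged sketch of what such a registry might look like. Everything here (`PRIORITIZERS`, `register_prioritizer`, `resolve_prioritizers`, and the two stand-in built-ins) is hypothetical and only illustrates the lookup-by-id pattern and the missing-plugin message; it is not Maestro's API.

```python
# Hypothetical registry mapping prioritizer_id -> factory function.
PRIORITIZERS = {}


def register_prioritizer(prioritizer_id):
    """Decorator that files a factory under its prioritizer_id."""
    def decorator(factory):
        PRIORITIZERS[prioritizer_id] = factory
        return factory
    return decorator


@register_prioritizer("step_order")
def make_step_order(entry):
    # Stand-in built-in: just echoes the spec entry it was given.
    return ("step_order", entry.get("args"))


@register_prioritizer("expression")
def make_expression(entry):
    # Stand-in built-in: a real one would compile the expression string.
    return ("expression", entry.get("expression"))


def resolve_prioritizers(entries):
    """Build prioritizers for spec entries, naming any that are not installed."""
    missing = [e["prioritizer_id"] for e in entries
               if e["prioritizer_id"] not in PRIORITIZERS]
    if missing:
        raise LookupError(
            "spec uses prioritizers that are not installed: " + ", ".join(missing)
        )
    return [PRIORITIZERS[e["prioritizer_id"]](e) for e in entries]


spec_entries = [
    {"prioritizer_id": "step_order", "args": [{"step_order": "depth-first"}]},
    {"prioritizer_id": "expression", "expression": "step.procs*step.walltime"},
]
print(resolve_prioritizers(spec_entries))
```

A recipient who lacks a plugin gets a single error listing every unknown id, instead of a failure deep inside expression parsing.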

jwhite242 avatar Mar 28 '24 02:03 jwhite242