prefect Improve scheduling parameter values on a deployment

Opened from the Prefect Public Slack Community

lochristopherhy: Hello Prefecters. A question about Prefect 2.0 deployment. I have one scheduler flow that takes a subflow_key. This subflow_key is passed into a curried function that dynamically creates a new Prefect flow. There are about 30 (and growing) different subflows that can be created dynamically.

My problem: I would like a single-flow, many-deployments setup, where each deployment is associated with a different type of subflow.

Constraints: Because subflows are generated dynamically by the curried function, I can't separate them into parent flows and run prefect deployment build for each flow.

What I've tried: 30 different deployment YAML files for 1 scheduler flow, with 30 different combinations of subflow_keys and schedules.

My question: Is there a DRYer way to achieve the same setup? Each deployment file is identical except for 2 lines (parameters and schedule). What is Prefect engineering's current take on this single-flow many-deployments paradigm? Will this be achievable via a single Prefect CLI command in the future (maybe with arrays of parameters / schedule flags passed into prefect deployment build)?

jeff923: Good questions, Chris.

Deployments are getting some improvements. We are planning to work on options for more efficient multi-flow deployment builds soon. I would keep an eye on the PRs (https://github.com/PrefectHQ/prefect/pulls) and make sure that will meet your needs.

I’ll also pass along this scenario.

I don’t know of a DRYer way at the moment.

jeff923: I heard back from Chris and I think he’s spot on in that your situation sounds like something we should consider addressing within the scheduler / parametrization side of things: “it’s hard to schedule changing parameter values on a deployment”. I’ll open an issue on GitHub so we can track it.

jeff923: <@ULVA73B9P> open “Improve changing parameter values on a deployment”

Original thread can be found here.

Aug 08 '22 15:08 marvin-robot

[Image: Screen Shot 2022-08-08 at 11 46 55 AM]

Aug 08 '22 15:08 discdiver

Hello there, a small code example of what I mean by the "curried function that dynamically creates a new subflow".

What I like about this design pattern:

Assume I have hundreds of ML models with a sklearn-like API (so each estimator has different init params). I don't have to "pre-deploy" each model as its own flow.
I just deploy run_fit_predict_flow, and Prefect Orion "picks up" my individual ML model fit_predict flows every time a new "model_key" gets passed into run_fit_predict_flow.

What could make the Prefect experience better:

The ability to build many deployments for one flow (in this case, many different model_keys for the single run_fit_predict_flow flow) using a single-line CLI command.

Possible CLI signatures:

prefect deployment build flow.py:flow --param $param1 --param $param2 --cron $schedule1 --cron $schedule2
prefect deployment build flow.py:flow --file $path-to-some-yaml-file

params:
- {"model_key": "linear_regression"}
- {"model_key": "logistic_regression"}
cron:
- 1 0 * * *
- 45 23 * * 6
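
One way to read the file format above is that each entry in params is paired with the cron entry at the same position. A minimal sketch of that expansion, assuming positional pairing; expand_deployments and the keys of the resulting dicts are hypothetical names for illustration, not part of Prefect:

```python
from typing import Any, Dict, List


def expand_deployments(
    flow_name: str,
    params: List[Dict[str, Any]],
    cron: List[str],
) -> List[Dict[str, Any]]:
    """Pair params[i] with cron[i] and return one spec dict per deployment."""
    if len(params) != len(cron):
        raise ValueError("params and cron must have the same number of entries")
    specs = []
    for p, c in zip(params, cron):
        # Derive a deployment name from the parameter values (hypothetical convention).
        suffix = "-".join(str(v) for v in p.values())
        specs.append(
            {
                "name": f"{flow_name}-{suffix}",
                "parameters": p,
                "schedule": {"cron": c},
            }
        )
    return specs


specs = expand_deployments(
    "run_fit_predict_flow",
    params=[{"model_key": "linear_regression"}, {"model_key": "logistic_regression"}],
    cron=["1 0 * * *", "45 23 * * 6"],
)
for spec in specs:
    print(spec["name"], spec["parameters"], spec["schedule"]["cron"])
```

How each spec is then turned into an actual deployment is left open here; the point is only that one small table of (parameters, schedule) pairs could replace 30 near-identical files.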

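from typing import Type

import pandas as pd
from prefect import flow
from pydantic import BaseModel

# MLModel and MODEL_CALLABLES (model_key -> model class) are assumed to be defined elsewhere.

def build_flow(name: str, model_cls: Type[MLModel], default_params: Type[BaseModel]):
    @flow(name=f"fit_predict:{name}")
    def fit_predict(X: pd.DataFrame, y: pd.Series, params: default_params = default_params()) -> pd.Series:
        y_pred = model_cls(params).fit_predict(X, y)
        return y_pred
    return fit_predict

@flow(name="run_fit_predict_flow")
def run_fit_predict_flow(model_key: str, X: pd.DataFrame, y: pd.Series) -> pd.Series:
    model_cls = MODEL_CALLABLES.get(model_key)
    default_params: Type[BaseModel] = model_cls.get_param()  # A classmethod that returns the ML model's init params as a BaseModel
    fit_predict = build_flow(model_key, model_cls, default_params)  # Create the subflow for this model
    y_pred = fit_predict(X, y)  # Run the subflow; build_flow alone only returns the flow object
    return y_pred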

Aug 09 '22 06:08 topher-lo

FYI #6697 introduces --param and --params CLI options to prefect deployment build
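
With per-call parameter flags, the 30 YAML files could in principle be replaced by a small loop that emits one build command per model. A rough sketch, under several assumptions that should be checked against your Prefect version: that --param takes key=value, that -n and --cron are accepted by prefect deployment build, and that the entrypoint is flow.py:run_fit_predict_flow. build_commands is a hypothetical helper, and the script only prints the commands rather than running them:

```python
import shlex
from typing import List, Tuple


def build_commands(entrypoint: str, jobs: List[Tuple[str, str]]) -> List[List[str]]:
    """Return one `prefect deployment build` argv list per (model_key, cron) pair."""
    commands = []
    for model_key, cron in jobs:
        commands.append(
            [
                "prefect", "deployment", "build", entrypoint,
                "-n", model_key,                      # assumed: deployment name flag
                "--param", f"model_key={model_key}",  # assumed: key=value syntax
                "--cron", cron,                       # assumed: cron schedule flag
            ]
        )
    return commands


jobs = [
    ("linear_regression", "1 0 * * *"),
    ("logistic_regression", "45 23 * * 6"),
]
commands = build_commands("flow.py:run_fit_predict_flow", jobs)
for argv in commands:
    # shlex.join quotes the cron string so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Each argv list could also be passed to subprocess.run once the flags are confirmed.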

Sep 17 '22 02:09 tekumara