CLI command suggestion after running the pipeline

Open hubertzub-db opened this issue 2 years ago • 4 comments

Signed-off-by: Hubert Zub [email protected]

What changes are proposed in this pull request?

After the pipeline is run with a given profile and step parameters, those parameters are saved in run tags. The run details page then uses these tags to suggest a CLI command that recreates the same execution.
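
The idea can be sketched roughly as follows. This is an illustration in Python, not the UI code from this PR. The tag keys `example.pipeline.profile` and `example.pipeline.step` and the helper `suggest_cli_command` are hypothetical placeholders. The command shape is an assumption that combines the `mlflow pipelines run`, `--profile`, and `-s` forms mentioned in this thread.

```python
import shlex
from typing import Mapping, Optional

# Hypothetical tag keys, used only for illustration; the real keys may differ.
PROFILE_TAG = "example.pipeline.profile"
STEP_TAG = "example.pipeline.step"


def suggest_cli_command(tags: Mapping[str, str]) -> Optional[str]:
    """Build a CLI suggestion from run tags, or return None if no profile was recorded."""
    profile = tags.get(PROFILE_TAG)
    if not profile:
        return None

    parts = ["mlflow", "pipelines", "run", "--profile", profile]

    # A full run has no step tag, so the step flag is only added when present.
    step = tags.get(STEP_TAG)
    if step:
        parts += ["-s", step]

    # Quote each argument so unusual profile or step names stay shell-safe.
    return " ".join(shlex.quote(part) for part in parts)


print(suggest_cli_command({PROFILE_TAG: "local"}))
print(suggest_cli_command({PROFILE_TAG: "local", STEP_TAG: "train"}))
```

Returning `None` when no profile tag is present lets the caller hide the suggestion for runs that were not produced by a pipeline.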

[Screenshot: Screen Shot 2022-08-05 at 11 10 44]

How is this patch tested?

I tested manually on both full runs (no step specified) and runs with a named step (e.g. -s train).

Does this PR change the documentation?

  • [ ] No. You can skip the rest of this section.
  • [ ] Yes. Make sure the changed pages / sections render correctly by following the steps below.
  1. Check the status of the ci/circleci: build_doc check. If it's successful, proceed to the next step, otherwise fix it.
  2. Click Details on the right to open the job page of CircleCI.
  3. Click the Artifacts tab.
  4. Click docs/build/html/index.html.
  5. Find the changed pages / sections and make sure they render correctly.

Release Notes

Is this a user-facing change?

  • [ ] No. You can skip the rest of this section.
  • [x] Yes. Give a description of this change to be included in the release notes for MLflow users.

Display a CLI command in the tracking UI that reproduces a given ML Pipelines run.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [x] area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • [x] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

Language

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

Integrations

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • [ ] rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • [ ] rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • [x] rn/feature - A new user-facing feature worth mentioning in the release notes
  • [ ] rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • [ ] rn/documentation - A user-facing documentation change worth mentioning in the release notes

hubertzub-db, Aug 01 '22 11:08

Can we actually guarantee that the run is reproduced by showing a command with the recorded profile name and step name? What if users changed the content of the profile in between? If we just want to show a tip for running the pipeline, I'd prefer not to record the step and simply show how to run the entire pipeline via mlp run --profile PROFILE.

Makes sense; however, I'm not sure whether there was some other intention here. @sunishsheth2009 what do you think?

hubertzub-db, Aug 02 '22 19:08

Don't you think, @jinzhang21, that using just the profile would have the same problem of users changing the underlying profile, or even the profile name for that matter? Step names are static for now. In the future, it would be good to show the step name as well, once users can create their own steps with unique step names. What do you think?

sunishsheth2009, Aug 02 '22 23:08

For reproducibility, it would suffice to include git clone <pipeline_repo> and git checkout <commit> commands before the mlflow pipelines run invocation. The syntax would be more convenient if mlflow pipelines run accepted a repo URL and commit as command-line flags.
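
A rough sketch of what such a suggestion could look like, written in Python for illustration only. The helper `reproduction_commands`, its parameters, the `pipeline_repo` directory name, and the example URL and commit are all hypothetical. Only `git clone`, `git checkout`, and `mlflow pipelines run` come from the comment above, and the `--profile` and `-s` flags are assumed from earlier in this thread.

```python
import shlex
from typing import List, Optional


def reproduction_commands(
    repo_url: str,
    commit: str,
    profile: str,
    step: Optional[str] = None,
    workdir: str = "pipeline_repo",  # hypothetical local directory name
) -> List[str]:
    """Return shell commands that pin the pipeline code before re-running it."""
    q = shlex.quote

    run = ["mlflow", "pipelines", "run", "--profile", profile]
    if step:
        run += ["-s", step]

    return [
        # Clone into an explicit directory so the following `cd` is unambiguous.
        f"git clone {q(repo_url)} {q(workdir)}",
        f"cd {q(workdir)}",
        # Check out the recorded commit so profile contents match the original run.
        f"git checkout {q(commit)}",
        " ".join(q(part) for part in run),
    ]


for line in reproduction_commands(
    "https://example.com/org/pipeline.git", "abc1234", "local", step="train"
):
    print(line)
```

Pinning the commit addresses the concern that a profile's contents may have changed since the run, because the profile file is restored along with the rest of the repository.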

dbczumar, Aug 03 '22 05:08

All great recommendations. Makes sense to me. One more thing on the step recorded here: instead of recording the step the user asked to run, we might want to record the last step that actually executed successfully. Imagine a partial execution p.run("evaluate") that succeeds in train but fails in evaluate (for whatever reason): do we want to show a command that is guaranteed to fail in evaluate again? Update: never mind, we decided to reproduce failed runs too.

jinzhang21, Aug 04 '22 18:08