CLI command suggestion after running the pipeline
Signed-off-by: Hubert Zub [email protected]
What changes are proposed in this pull request?
After running the pipeline with given `profile` and `step` parameters, those parameters are saved in run tags. On the run details page, this data is used to suggest a CLI command that recreates the same execution.
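For illustration, here is a minimal sketch of how such a suggestion could be assembled from run tags. The tag keys and CLI flags below are assumptions for the sketch; the names actually used by this PR may differ:

```python
from mlflow.tracking import MlflowClient

# Hypothetical tag keys -- the keys written by this PR may be named differently.
PROFILE_TAG = "mlflow.pipeline.profile"
STEP_TAG = "mlflow.pipeline.step"


def suggest_cli_command(run_id: str) -> str:
    """Rebuild a pipeline invocation from the tags recorded on a run."""
    tags = MlflowClient().get_run(run_id).data.tags
    parts = ["mlflow", "pipelines", "run"]
    if PROFILE_TAG in tags:
        parts += ["--profile", tags[PROFILE_TAG]]
    if STEP_TAG in tags:
        parts += ["--step", tags[STEP_TAG]]
    return " ".join(parts)
```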

How is this patch tested?
I've done manual tests on both full runs (without a defined `step`) and on runs with a named step (e.g. `-s train`).
Does this PR change the documentation?
- [ ] No. You can skip the rest of this section.
- [ ] Yes. Make sure the changed pages / sections render correctly by following the steps below.
  - Check the status of the `ci/circleci: build_doc` check. If it's successful, proceed to the next step, otherwise fix it.
  - Click `Details` on the right to open the job page of CircleCI.
  - Click the `Artifacts` tab.
  - Click `docs/build/html/index.html`.
  - Find the changed pages / sections and make sure they render correctly.
Release Notes
Is this a user-facing change?
- [ ] No. You can skip the rest of this section.
- [x] Yes. Give a description of this change to be included in the release notes for MLflow users.
Display a CLI command in the Tracking UI that reproduces a given ML Pipelines run.
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- [ ] `area/artifacts`: Artifact stores and artifact logging
- [ ] `area/build`: Build and test infrastructure for MLflow
- [ ] `area/docs`: MLflow documentation pages
- [ ] `area/examples`: Example code
- [ ] `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] `area/models`: MLmodel format, model serialization/deserialization, flavors
- [x] `area/pipelines`: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- [ ] `area/projects`: MLproject format, project running backends
- [ ] `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- [ ] `area/server-infra`: MLflow Tracking server backend
- [ ] `area/tracking`: Tracking Service, tracking client APIs, autologging
Interface
- [x] `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] `area/docker`: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] `area/sqlalchemy`: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] `area/windows`: Windows support
Language
- [ ] `language/r`: R APIs and clients
- [ ] `language/java`: Java APIs and clients
- [ ] `language/new`: Proposals for new client languages
Integrations
- [ ] `integrations/azure`: Azure and Azure ML integrations
- [ ] `integrations/sagemaker`: SageMaker integrations
- [ ] `integrations/databricks`: Databricks integrations
How should the PR be classified in the release notes? Choose one:
- [ ] `rn/breaking-change` - The PR will be mentioned in the "Breaking Changes" section
- [ ] `rn/none` - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- [x] `rn/feature` - A new user-facing feature worth mentioning in the release notes
- [ ] `rn/bug-fix` - A user-facing bug fix worth mentioning in the release notes
- [ ] `rn/documentation` - A user-facing documentation change worth mentioning in the release notes
Can we actually guarantee reproducing the run by showing a command with the recorded profile name and step name? What if users changed the content of the profile in between? If we just want to show a tip for running the pipeline, I'd prefer not to record the step but simply show how to run the entire pipeline via `mlp run --profile PROFILE`.
Makes sense; however, I'm not sure whether there was another intention here. @sunishsheth2009 what do you think?
Don't you think @jinzhang21 that just using the profile context would have the same problem of users changing the underlying profile? Or even the profile name, for that matter? Step names for now are static. For the future, it would be good to show the step name as well, once users can create their own steps with unique step names. What do you think?
For reproducibility, it would suffice to include `git clone <pipeline_repo>` and `git checkout <commit>` commands before the `mlflow pipelines run` invocation. It would be more convenient syntactically if `mlflow pipelines run` supported passing in a repo URL and commit as command-line flags.
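As a rough sketch of that idea: MLflow already records `mlflow.source.git.commit` on runs launched from a git repository, and assuming a repo-URL tag is also available (the repo-URL key below is an assumption), the UI could prepend the clone/checkout steps to the suggested command:

```python
# Sketch only: builds the full reproduction snippet described above.
# The repo-URL tag key is an assumption; "mlflow.source.git.commit" is a
# standard MLflow tag on runs launched from a git repository.
REPO_TAG = "mlflow.source.git.repoURL"
COMMIT_TAG = "mlflow.source.git.commit"


def reproduction_snippet(tags: dict) -> str:
    """Return a shell snippet that clones, checks out, and reruns the pipeline."""
    lines = []
    if REPO_TAG in tags:
        lines.append(f"git clone {tags[REPO_TAG]} pipeline-repo")
        lines.append("cd pipeline-repo")
    if COMMIT_TAG in tags:
        lines.append(f"git checkout {tags[COMMIT_TAG]}")
    lines.append("mlflow pipelines run")
    return "\n".join(lines)
```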
All great recommendations. Makes sense to me. One more thing on the step recorded here: instead of recording the step the user asked to run, we might want to record the last step that actually executed successfully. Imagine a partial execution `p.run("evaluate")` that succeeds in `train` but fails in `evaluate` (for whatever reason); do we want to show a command that is guaranteed to fail in `evaluate` again? --> nvm. We decided to reproduce a failed run too.
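For reference, a minimal sketch of the alternative considered (and ultimately not adopted) above: tag the last step that completed rather than the step that was requested. The step names and tag key here are illustrative, not the actual implementation:

```python
import mlflow

# Illustrative step order; real MLflow Pipelines templates define their own.
STEPS = ["ingest", "split", "transform", "train", "evaluate", "register"]


def run_until(target_step: str, run_step) -> None:
    """Run steps in order, tagging the last one that succeeded."""
    with mlflow.start_run():
        for step in STEPS[: STEPS.index(target_step) + 1]:
            run_step(step)  # user-supplied callable; raises on failure
            mlflow.set_tag("mlflow.pipeline.lastSuccessfulStep", step)
```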