zenml icon indicating copy to clipboard operation
zenml copied to clipboard

Add Active Stack as part of configuration YAML

Open kabinja opened this issue 1 year ago • 8 comments

Describe changes

Issue #2261: Add Active Stack as part of configuration YAML

I added the field active_stack to PipelineRunConfiguration to be able to set up an active stack during the duration of the run. I also added a stack_context manager to allow resetting the stack at the end of the execution of the pipeline. This was done to reset the initial active stack at the end of the execution of the pipeline if one is defined by active_stack in the configuration YAML.

Pre-requisites

Please ensure you have done the following:

  • [X] I have read the CONTRIBUTING.md document.
  • [see comments] If my change requires a change to docs, I have updated the documentation accordingly.
  • [X] I have added tests to cover my changes.
  • [X] I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
  • [N/A] If my changes require changes to the dashboard, these changes are communicated/requested.

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [X] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Other (add details above)

Comments

I did not find the section where the configuration file usage is referenced in the documentation.

Summary by CodeRabbit

  • New Features
    • Introduced the ability to specify an active stack in the pipeline run configuration, enhancing flexibility in pipeline execution.
  • Improvements
    • Added a context handler for managing stack contexts, allowing for more dynamic stack management during pipeline operations.
  • Tests
    • Implemented a new integration test to verify the functionality of running pipelines with different stacks specified in the configuration file.

kabinja avatar Jan 29 '24 14:01 kabinja

[!IMPORTANT]

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

The recent updates introduce the capability to specify an active stack for individual pipeline runs within ZenML. This is facilitated through a new attribute in the PipelineRunConfiguration class and enhanced pipeline functionality to utilize a specified stack context during build and run operations. A context manager ensures the original stack is restored after execution. Integration tests are expanded to verify this new feature, ensuring pipelines can run with different stacks as defined in their configuration files.

Changes

File Path Change Summary
.../config/pipeline_run_configuration.py Added active_stack attribute of type Optional[str] to PipelineRunConfiguration class.
.../new/pipelines/pipeline.py Imported stack_context, updated build and _run methods to use stack_context, and added _update_stack_from_config method.
.../stack/utils.py Added Type import and stack_context class for handling stack context.
tests/integration/functional/cli/test_pipeline.py Added test_pipeline_run_with_different_stack_in_config_file to verify running pipelines with a specified stack in the configuration.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

coderabbitai[bot] avatar Jan 29 '24 14:01 coderabbitai[bot]

@coderabbitai review

htahir1 avatar Jan 29 '24 14:01 htahir1

This is great so far but needs changes in the docs, specifically somewhere here: https://docs.zenml.io/user-guide/advanced-guide/pipelining-features/configure-steps-pipelines#breaking-configuration-down

I added the missing documentation taking the build field as a reference.

kabinja avatar Feb 04 '24 19:02 kabinja

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Hey @kabinja, I see that you already fixed the issue with the __exit__ method, however, could you please use the context manager that I mentioned in my comment above?

Sorry, I did not have time to go through all the comments this weekend. I changed to use the existing context manager. Would it make sense to move the context manager to the stack.utils module to avoid having issues with cyclic dependencies? I was also wondering why not using enter and exit instead of the yield in a try-catch block?

kabinja avatar Feb 26 '24 20:02 kabinja

Hey @kabinja, thank you for applying the changes.

Yes, it would be great if you could move the context manager from the current file to a destination such as src/zenml/utils/stack_utils.py.

As for the __enter__ and __exit__ methods vs. the decorator, both approaches serve the same purpose. When it is a relatively simple process, IMO the decorator is a bit more readable, whereas for complicated cases I usually prefer to implement a context manager using the __enter__, __exit__ methods.

bcdurak avatar Mar 13 '24 12:03 bcdurak

@bcdurak Should we maybe move on this? @kabinja just wanted send a soft ping to make sure we don't lose this thread

htahir1 avatar May 23 '24 05:05 htahir1

@kabinja im so sorry for the delay here. We talked internally about this PR and it still needs some work:

  • it should not be called active_stack but stack
  • It doesn't consider running from the server where we need to check this isn't set
  • You enter temporary_active_stack with None, which means it does nothing. So the code just changes the stack, not temporarily for the duration of the pipeline run

What this feature should be is probably this: when a stack is specified in the yaml, it uses that stack for the pipeline run, but doesny modify your "active stack" that you've set locally:

  • I have the default stack active locally (-> zenml stack get will output default)
  • I run a pipeline, and in the config yaml I specify stack: kubeflow. This pipeline should run on the kubeflow stack
  • After the pipeline run, zenml stack get should still be default IMO. This is not what his PR does

Can you take a look at this and see if you can work on it? If not feel free to close it and we an pick it up again soon. Sorry for the staleness on our side!

htahir1 avatar Jul 09 '24 10:07 htahir1