zenml [FEATURE]: Allow fetching from Repository by name or type

Describe the feature you'd like

This is tangentially related to #726

Currently, when you want to fetch items from a Repository you need to use strings to identify the keys. I would like to be able to use the pipeline and step objects to fetch views from the repository.

In general, we'd like to limit how much we rely on string naming agreement. Doing so lets refactoring tools do a better job, and it means you never have to copy-paste strings, since your editor can auto-complete pipeline and step definition names.

e.g.

@step
def my_step():
  ...

@pipeline
def my_pipeline(my_step):
  ...

p = my_pipeline(my_step())
p.run()

repo = Repository()

I'd like the following to work

pipeline_view = repo.get_pipeline(p)
pipeline_view = repo.get_pipeline(my_pipeline)
pipeline_view = repo.get_pipeline("my_pipeline")

step_view = pipeline_view.steps[my_step]
s = p.steps[...]
step_view = pipeline_view.steps[s]
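
As a rough illustration of the lookup behaviour requested above, here is a minimal, self-contained sketch that does not use ZenML at all. `StepView`, `StepViews`, and `to_name` are hypothetical stand-ins invented for this example, and the assumption that instances expose a `name` attribute while decorated definitions only carry `__name__` is mine, not something confirmed by the ZenML API.

```python
from typing import Dict


class StepView:
    """Hypothetical stand-in for a post-execution step view."""

    def __init__(self, name: str) -> None:
        self.name = name


def to_name(key: object) -> str:
    """Resolve a string, a named instance, or a definition to a name."""
    if isinstance(key, str):
        return key
    # Assumption: instances expose their name via a `name` attribute.
    name = getattr(key, "name", None)
    if isinstance(name, str):
        return name
    # Assumption: classes and functions are identified by `__name__`.
    name = getattr(key, "__name__", None)
    if isinstance(name, str):
        return name
    raise TypeError(type(key))


class StepViews:
    """Read-only collection of step views keyed by name or by object."""

    def __init__(self, views: Dict[str, StepView]) -> None:
        self._views = dict(views)

    def __getitem__(self, key: object) -> StepView:
        return self._views[to_name(key)]


def my_step():  # stands in for a `@step`-decorated function
    ...


steps = StepViews({"my_step": StepView("my_step")})
```

With this sketch, `steps["my_step"]`, `steps[my_step]`, and `steps[StepView("my_step")]` all resolve to the same view, while an unsupported key such as a list raises `TypeError`.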

Is your feature request related to a problem?

Sort of. We're anticipating maintenance problems as we scale: step and pipeline names might change over time, and we wouldn't be able to catch all usages in notebooks. In our own codebase, we'd like to enforce never using string names via a custom flake8 code style plugin.

Imagine we've refactored myproj.pipelines.my_fancy_pipeline.my_fancy_pipeline -> myproj.pipelines.my_awesome_pipeline.my_awesome_pipeline

The following code would then fail with an import error:

from myproj.pipelines.my_fancy_pipeline import my_fancy_pipeline
# ^^^ module not found

repo = Repository()
repo.get_pipeline(my_fancy_pipeline)
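
The lint rule mentioned above could look roughly like the following. This is only a sketch of the core check using the standard-library `ast` module; the error code `ZNM001`, the `LOOKUP_METHODS` set, and the function name are made up for illustration, and the flake8 plugin registration itself is not shown.

```python
import ast
from typing import Iterator, Tuple

# Hypothetical error code and method set, chosen for illustration only.
LOOKUP_METHODS = {"get_pipeline"}
MESSAGE = "ZNM001 pass the pipeline or step object instead of a string name"


def find_string_lookups(source: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (line, column, message) for lookups keyed by a string literal."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only method-style calls such as `repo.get_pipeline(...)` are checked.
        if not (isinstance(func, ast.Attribute) and func.attr in LOOKUP_METHODS):
            continue
        # Inspect both positional and keyword arguments.
        values = list(node.args) + [kw.value for kw in node.keywords]
        for value in values:
            if isinstance(value, ast.Constant) and isinstance(value.value, str):
                yield value.lineno, value.col_offset, MESSAGE


example = 'repo.get_pipeline("my_pipeline")\nrepo.get_pipeline(my_pipeline)\n'
violations = list(find_string_lookups(example))
```

Here only the first line of `example` is reported, because the second call passes the pipeline object rather than a string literal.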

How do you solve your current problem with the current status-quo of ZenML?

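We've written the following utility code:

from typing import Union

# Assumed import paths for the ZenML release this was written against; adjust as needed.
from zenml.pipelines import BasePipeline
from zenml.pipelines.base_pipeline import BasePipelineMeta
from zenml.post_execution import PipelineRunView
from zenml.repository import Repository

PipelineName = Union[str, BasePipeline, BasePipelineMeta]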


@beta
def get_last_run(pipeline: PipelineName) -> PipelineRunView:
    """Hacky convenience method to fetch the last run of a pipeline.

    Notes
    -----
    This is inherently racy, but it's currently the only way to fetch the pipeline run details.
    This approach is also what's currently suggested in the docs.

    See Also
    --------
    https://github.com/zenml-io/zenml/issues/726

    """
    repo = Repository()

    return repo.get_pipeline(pipeline_name=to_pipeline_name(pipeline)).runs[-1]


@beta
def to_pipeline_name(pipeline: PipelineName) -> str:
    """Get a pipeline's name.

    This also supports custom named pipelines via `@pipeline(name="blah")`
    """
    if isinstance(pipeline, str):
        # treat `pipeline` as a pipeline name string
        return pipeline
    if isinstance(pipeline, BasePipeline):
        # `pipeline` is a connected instance, use its `name` property
        return pipeline.name
    if isinstance(pipeline, BasePipelineMeta):
        # `pipeline` is a decorated function.
        # This code duplicates code in zenml.pipelines.BasePipeline.__init__
        return pipeline.__name__
    else:
        raise TypeError(type(pipeline))

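and the accompanying pytest tests:

import pytest
from assertpy import assert_that
from zenml.pipelines import pipeline


@pipeline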
def my_pipeline():
    ...


@pipeline(name="blah blah")
def my_custom_named_pipeline():
    ...


def test_to_pipeline_name():
    p = my_pipeline()
    name = "my_pipeline"

    assert_that(to_pipeline_name(name)).is_equal_to(name)
    assert_that(to_pipeline_name(my_pipeline)).is_equal_to(name)
    assert_that(to_pipeline_name(p)).is_equal_to(name)


def test_to_pipeline_name__with_custom_name():
    p = my_custom_named_pipeline()
    name = "blah blah"

    assert_that(to_pipeline_name(name)).is_equal_to(name)
    assert_that(to_pipeline_name(my_custom_named_pipeline)).is_equal_to(name)
    assert_that(to_pipeline_name(p)).is_equal_to(name)


def test_to_pipeline_name__with_unknown_type():
    with pytest.raises(TypeError) as e:
        # noinspection PyTypeChecker
        to_pipeline_name([])
    assert_that(e.value.args[0]).is_equal_to(list)

Any other comments?

No response

Jun 26 '22 00:06 strangemonad

Hi @strangemonad, I'll have a look at this one and get back to you if any questions arise.

Jun 27 '22 13:06 AlexejPenner

I have made some experimental changes to the repository; you can find the draft here. This is a work in progress, and I hope to get it into our next release in two weeks. Feel free to take it for a test drive and let me know if it's the behaviour you were looking for.

Jun 28 '22 12:06 AlexejPenner