kedro icon indicating copy to clipboard operation
kedro copied to clipboard

Make rich optional / not a core dependency of `kedro`

Open noklam opened this issue 1 year ago • 15 comments

Description

What do you think about pip uninstall rich to fallback to a plain log? Especially for CI environment there’s not much point to have it. @noklam

Joel Schwarzmann I think thats a good idea you can even do the rich__repr that only works if it’s avaialble

Documenting some conversation in Slack - we should consider making rich non-core of Kedro. Potentially

While we have fixed many issues in this Tweaking Logging to improve UX, particular we introduce RichHandler. However, there are still some unresolved issues with rich. Some of them requires upstream fix instead of kedro.

List of unresolved issues, mostly IDE(notebook, VSCode or PyCharm) specific

  • [ ] #1752
  • [x] #1719
  • [ ] #1733
  • [ ] https://github.com/kedro-org/kedro/issues/3276

rich is mostly useful for interactive development workflow. In most production environment, teminal logs are not colored and rich automatically fallback.

Context

While Rich logging improve the readability, it has been proved hard to make it works 100% the case, there are edge cases in IDE/deployment environment which is out of our control. In some case, if the logging is too long rich automatically split it into multiple lines and cause issue.

In addition, some user prefer plain logging (it's configurable, but user will still have to install rich)

Pro:

  • Provide a simple way to fallback and reduce the burden for Kedro team to solve specific Rich issues (i.e. pip uninstall rich)
  • Make rich optional to reduce Kedro's core dependencies

Con:

  • Not sure - more development work? How many people disable rich today and is this needed?

[!NOTE]
This is not our top priority now, but we would love to have feedback and more discussion before we move to technical design or implementation.

Alternative

  • kedro-rich plugin (enable by pip install kedro-rich, not core dependencies) - https://github.com/datajoely/kedro-rich

noklam avatar Aug 15 '23 13:08 noklam

Would be great to move this to a kedro-rich plugin, yes.

More broadly, configuring our logging is consistently a pain. This is hardly Kedro's fault, stdlib logging is famously obtuse, but our heavy handed approach to logging exacerbates this problem. Just from this week, a user is confused about logs or lack of them in Kedro, Docker, Airflow https://linen-slack.kedro.org/t/15975835/version-1-disable-existing-loggers-true-formatters-simple-fo#602b2164-1a4f-4e5c-8741-3c3dc29f7be7

An approach worth exploring is moving our logging to structlog https://www.structlog.org/en/stable/configuration.html (which has progressive enhancement if rich is present) and even consider making the library functions more logger-agnostic by passing it as a parameter in key places.

astrojuanlu avatar Oct 25 '23 07:10 astrojuanlu

rich created some problems in older versions of Databricks https://linen-slack.kedro.org/t/16022133/situation-kedro-0-18-8-python-3-8-16-databricks-9-1-lts-we-a#e098da1b-ef71-4ead-9044-d2e4dfe097d9

Possible solutions:

  • Put rich.pretty.install behind a try: import rich and make rich optional, hence pip install kedro[rich]
  • Put rich.pretty.install inside a kedro-rich plugin

astrojuanlu avatar Nov 01 '23 20:11 astrojuanlu

@datajoely 's kedro-rich prototype keeps receiving attention https://github.com/datajoely/kedro-rich/issues/11

astrojuanlu avatar Nov 20 '23 09:11 astrojuanlu

Unsure if this is a parent issue, because the linked issues are consequences of having rich as mandatory dependency, not a breakdown of tasks needed to get rid of it. I'm renaming accordingly.

astrojuanlu avatar Nov 20 '23 09:11 astrojuanlu

Yeah aligned on the fact we should look to make the dependency optional

Longer term I'm still keen to advocate for the other tasks still on the roadmap such as progress bars: https://github.com/kedro-org/kedro/milestone/15

datajoely avatar Nov 20 '23 10:11 datajoely

And to do the drop in click replacement (which we could do a try/except import) https://github.com/ewels/rich-click

datajoely avatar Nov 20 '23 10:11 datajoely

Commenting here to clarify my view on this issue since it got mentioned in a meeting.

Initially, my idea was that, for now, we should not change what users get when doing pip install kedro. Doing so could be considered a breaking change from the user perspective IMHO.

However, let's say we want to make the Kedro work even after doing pip uninstall rich. If we don't change the project.dependencies array, this will make pip and other tools complain:

❯ pip show kedro
Name: kedro
Version: 0.19.5
Summary: Kedro helps you build production-ready data and analytics pipelines
Home-page: 
Author: Kedro
Author-email: 
License: Apache Software License (Apache 2.0)
Location: /private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages
Requires: attrs, build, cachetools, click, cookiecutter, dynaconf, fsspec, gitpython, importlib-metadata, importlib-resources, more-itertools, omegaconf, parse, pluggy, pre-commit-hooks, PyYAML, rich, rope, toml
Required-by: 
❯ pip uninstall rich
Found existing installation: rich 13.7.1
Uninstalling rich-13.7.1:
  Would remove:
    /private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages/rich-13.7.1.dist-info/*
    /private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages/rich/*
Proceed (Y/n)? y
  Successfully uninstalled rich-13.7.1
❯ pip check
cookiecutter 2.6.0 requires rich, which is not installed.
kedro 0.19.5 requires rich, which is not installed.

And, perhaps most importantly, even if we make kedro not depend on rich... It still depends transitively on it through cookiecutter.

So, I see 2 main blockers:

  1. As long as kedro -> cookiecutter, then this issue cannot be closed because kedro -> cookiecutter -> rich.
  2. Even if we address (1), it's not clear what the best solution to the rich dependency is.

Thoughts @noklam @SajidAlamQB @lrcouto ?

astrojuanlu avatar May 03 '24 15:05 astrojuanlu

This is a good point, but I think it's a separate discussion from what we were talking.

  1. Make kedro (kedro run specifically ) doesn't depends on rich at runtime (this PR or the discussion above). i.e. "Can I run kedro without importing rich at all"
  2. Make rich not a core dependency of kedro (including pip install etc). i.e. "Can I run kedro without ever installing rich"

Should we focus on 1. first, which is what this PR mainly addressing? So far I think the bigger problem coming from rich isn't because of rich is big or it comes with lots of dependencies, the main complain is it fails in certain platforms and there is no good way to disable it automatically (and user do not have control over it).

The 2. is still a problem, but in my mind a less urgent one until we have a good replacement of cookiecutter. Alternatively we can immediately pin cookiecutter <2.3.0 which doesn't depends on rich at all. (kedro has a pin of cookiecutter >2.1.1 <3.0 atm).

noklam avatar May 03 '24 16:05 noklam

Agreed to focus on (1) first 👍🏼

astrojuanlu avatar May 03 '24 16:05 astrojuanlu

The PR in question is #3838 by the way

astrojuanlu avatar May 03 '24 16:05 astrojuanlu

@lrcouto is this completely addressed now or still follow up to do?

noklam avatar May 21 '24 17:05 noklam

@lrcouto is this completely addressed now or still follow up to do?

There's still follow-up to do. What I implemented here was a first step, but it still requires the user to manually downgrade their cookiecutter version to get Kedro to work without rich. They'll also have to manually uninstall rich.

To make rich completely optional, we'll also have to deal with our dependency on cookiecutter for project creation.

lrcouto avatar May 21 '24 17:05 lrcouto

@lrcouto in that case, would you prefer to reopen this issue or creates new one for the follow up tasks you mentioned?

noklam avatar May 22 '24 00:05 noklam

I'd say let's reopen this issue to signal it's not fully addressed, and we make it a parent of the new one about cookiecutter.

astrojuanlu avatar May 22 '24 06:05 astrojuanlu

I'd say let's reopen this issue to signal it's not fully addressed, and we make it a parent of the new one about cookiecutter.

Yeah, I agree.

lrcouto avatar May 22 '24 12:05 lrcouto

We will be closing this issue for now, based on this discussion on Kedro's dependencies. We have decided on a longer term approach to deal with Cookiecutter, and by extension Rich, that will take some time to design and implement. This issue will be addressed once we're ongoing with this process.

lrcouto avatar Jul 08 '24 06:07 lrcouto