kedro
kedro copied to clipboard
Make rich optional / not a core dependency of `kedro`
Description
What do you think about pip uninstall rich to fallback to a plain log? Especially for CI environment there’s not much point to have it. @noklam
Joel Schwarzmann I think thats a good idea you can even do the rich__repr that only works if it’s avaialble
Documenting some conversation in Slack - we should consider making rich
non-core of Kedro. Potentially
While we have fixed many issues in this Tweaking Logging to improve UX, particular we introduce RichHandler
. However, there are still some unresolved issues with rich
. Some of them requires upstream fix instead of kedro.
List of unresolved issues, mostly IDE(notebook, VSCode or PyCharm) specific
- [ ] #1752
- [x] #1719
- [ ] #1733
- [ ] https://github.com/kedro-org/kedro/issues/3276
rich
is mostly useful for interactive development workflow. In most production environment, teminal logs are not colored and rich
automatically fallback.
Context
While Rich logging improve the readability, it has been proved hard to make it works 100% the case, there are edge cases in IDE/deployment environment which is out of our control. In some case, if the logging is too long rich
automatically split it into multiple lines and cause issue.
In addition, some user prefer plain logging (it's configurable, but user will still have to install rich)
Pro:
- Provide a simple way to fallback and reduce the burden for Kedro team to solve specific Rich issues (i.e.
pip uninstall rich
) - Make
rich
optional to reduce Kedro's core dependencies
Con:
- Not sure - more development work? How many people disable rich today and is this needed?
[!NOTE]
This is not our top priority now, but we would love to have feedback and more discussion before we move to technical design or implementation.
Alternative
- kedro-rich plugin (enable by
pip install kedro-rich
, not core dependencies) - https://github.com/datajoely/kedro-rich
Would be great to move this to a kedro-rich
plugin, yes.
More broadly, configuring our logging is consistently a pain. This is hardly Kedro's fault, stdlib logging is famously obtuse, but our heavy handed approach to logging exacerbates this problem. Just from this week, a user is confused about logs or lack of them in Kedro, Docker, Airflow https://linen-slack.kedro.org/t/15975835/version-1-disable-existing-loggers-true-formatters-simple-fo#602b2164-1a4f-4e5c-8741-3c3dc29f7be7
An approach worth exploring is moving our logging to structlog https://www.structlog.org/en/stable/configuration.html (which has progressive enhancement if rich is present) and even consider making the library functions more logger-agnostic by passing it as a parameter in key places.
rich created some problems in older versions of Databricks https://linen-slack.kedro.org/t/16022133/situation-kedro-0-18-8-python-3-8-16-databricks-9-1-lts-we-a#e098da1b-ef71-4ead-9044-d2e4dfe097d9
Possible solutions:
- Put
rich.pretty.install
behind atry: import rich
and makerich
optional, hencepip install kedro[rich]
- Put
rich.pretty.install
inside akedro-rich
plugin - ❓
@datajoely 's kedro-rich prototype keeps receiving attention https://github.com/datajoely/kedro-rich/issues/11
Unsure if this is a parent issue, because the linked issues are consequences of having rich as mandatory dependency, not a breakdown of tasks needed to get rid of it. I'm renaming accordingly.
Yeah aligned on the fact we should look to make the dependency optional
Longer term I'm still keen to advocate for the other tasks still on the roadmap such as progress bars: https://github.com/kedro-org/kedro/milestone/15
And to do the drop in click replacement (which we could do a try/except import) https://github.com/ewels/rich-click
Commenting here to clarify my view on this issue since it got mentioned in a meeting.
Initially, my idea was that, for now, we should not change what users get when doing pip install kedro
. Doing so could be considered a breaking change from the user perspective IMHO.
However, let's say we want to make the Kedro work even after doing pip uninstall rich
. If we don't change the project.dependencies
array, this will make pip
and other tools complain:
❯ pip show kedro
Name: kedro
Version: 0.19.5
Summary: Kedro helps you build production-ready data and analytics pipelines
Home-page:
Author: Kedro
Author-email:
License: Apache Software License (Apache 2.0)
Location: /private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages
Requires: attrs, build, cachetools, click, cookiecutter, dynaconf, fsspec, gitpython, importlib-metadata, importlib-resources, more-itertools, omegaconf, parse, pluggy, pre-commit-hooks, PyYAML, rich, rope, toml
Required-by:
❯ pip uninstall rich
Found existing installation: rich 13.7.1
Uninstalling rich-13.7.1:
Would remove:
/private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages/rich-13.7.1.dist-info/*
/private/tmp/test-kedro-rich/.venv/lib/python3.11/site-packages/rich/*
Proceed (Y/n)? y
Successfully uninstalled rich-13.7.1
❯ pip check
cookiecutter 2.6.0 requires rich, which is not installed.
kedro 0.19.5 requires rich, which is not installed.
And, perhaps most importantly, even if we make kedro
not depend on rich... It still depends transitively on it through cookiecutter
.
So, I see 2 main blockers:
- As long as
kedro -> cookiecutter
, then this issue cannot be closed becausekedro -> cookiecutter -> rich
. - Even if we address (1), it's not clear what the best solution to the
rich
dependency is.
Thoughts @noklam @SajidAlamQB @lrcouto ?
This is a good point, but I think it's a separate discussion from what we were talking.
- Make
kedro
(kedro run
specifically ) doesn't depends onrich
at runtime (this PR or the discussion above). i.e. "Can I run kedro without importingrich
at all" - Make
rich
not a core dependency ofkedro
(including pip install etc). i.e. "Can I run kedro without ever installingrich
"
Should we focus on 1. first, which is what this PR mainly addressing? So far I think the bigger problem coming from rich
isn't because of rich
is big or it comes with lots of dependencies, the main complain is it fails in certain platforms and there is no good way to disable it automatically (and user do not have control over it).
The 2. is still a problem, but in my mind a less urgent one until we have a good replacement of cookiecutter. Alternatively we can immediately pin cookiecutter
<2.3.0 which doesn't depends on rich
at all. (kedro has a pin of cookiecutter >2.1.1 <3.0 atm).
Agreed to focus on (1) first 👍🏼
The PR in question is #3838 by the way
@lrcouto is this completely addressed now or still follow up to do?
@lrcouto is this completely addressed now or still follow up to do?
There's still follow-up to do. What I implemented here was a first step, but it still requires the user to manually downgrade their cookiecutter
version to get Kedro to work without rich
. They'll also have to manually uninstall rich
.
To make rich
completely optional, we'll also have to deal with our dependency on cookiecutter
for project creation.
@lrcouto in that case, would you prefer to reopen this issue or creates new one for the follow up tasks you mentioned?
I'd say let's reopen this issue to signal it's not fully addressed, and we make it a parent of the new one about cookiecutter.
I'd say let's reopen this issue to signal it's not fully addressed, and we make it a parent of the new one about cookiecutter.
Yeah, I agree.
We will be closing this issue for now, based on this discussion on Kedro's dependencies. We have decided on a longer term approach to deal with Cookiecutter, and by extension Rich, that will take some time to design and implement. This issue will be addressed once we're ongoing with this process.