Support subprojects in a poetry project

Open abn opened this issue 5 years ago • 43 comments

Background & Rationale

This request is inspired by RPM Package Manager’s capability to build subpackages from the same Spec File.

Here, I want to propose and discuss how a version of this capability could be replicated within poetry to simplify the user experience for a Python project maintainer, especially when maintaining namespace packages and/or multi-project source trees. While strict project separation is a good thing in most cases, it might not always be the most pragmatic option for package maintainers.

For our purposes here, we can refer to each of these packages as a subproject. All subprojects are managed under a single poetry project. This means that there is only a single pyproject.toml file and a shared project root directory with either a shared source tree or independent source trees (subdirectories) for each subproject.

Description

Let us consider the scenario of multiple namespace packages being maintained in a single repository with the following structure.

    namespace-project/
    └── src
        └── namespace
            └── package
                ├── one
                │   └── __init__.py
                ├── three
                │   └── __init__.py
                └── two
                    └── __init__.py

Note that this will still apply even if different source directories exist within the root directory for each subproject.

Here the intention could be that we want to distribute 3 packages, namely namespace-package-one, namespace-package-two and namespace-package-three.

For the purpose of this example, let us assume that namespace-package-three depends on namespace-package-one. The pyproject.toml file could look something like this.

New sections are annotated with comments detailing them and expected behaviour.

[build-system]
requires = ["poetry-core>=1.0"]
build-backend = "poetry.core.masonry.api"

[tool.poetry]
name = "namespace-package"
version = "1.0.0-alpha.0"
description = ""
authors = [
    "Bender Rodriguez <[email protected]>"
]
license = "MIT"
readme = "README.md"
repository = "https://git.planetexpress.com/bender/python-namespace-package"
keywords = []
classifiers = [
    "Intended Audience :: Developers",
    "Operating System :: OS Independent",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.8",
]

# this section remains as is, but now specifies shared dependencies
[tool.poetry.dependencies]
python = "^3.8"

[tool.poetry.dev-dependencies]
pre-commit = "^2.1"
flake8 = "^3.7"
black = "^19.10b0"
pytest = "^5.2"

# the following are package-specific sections
[tool.poetry.packages.one]
name = "namespace-package-one"  # this is optional as name would be derived from <project.name>-<package name from section>
description = ""  # this will override the description from the project for this package
readme = "README.one.md"  # this will override the readme from the project for this package
packages = [  # this is mandatory for sub-packages
    # any package not included in a sub-package is added to the base package (ie. "namespace-package")
    # if the "packages" property is not explicitly configured in the base
    { include = "namespace.package.one", from = "src" }
]

[tool.poetry.packages.one.dependencies]
ujson = "^1.35"

[tool.poetry.packages.one.dev-dependencies]
pytest-mock = "^2.0"

[tool.poetry.packages.two]
packages = [ 
    { include = "namespace.package.two", from = "src" }
]

[tool.poetry.packages.two.dependencies]
psycopg2 = "^2.8.4"

[tool.poetry.packages.two.dev-dependencies]
pytest-postgresql = "^2.3.0"

[tool.poetry.packages.three]
requires = [ # this enables us to specify the relationships between sub-packages
    "one" # this could also be namespace-package-one
]
packages = [ 
    { include = "namespace.package.three", from = "src" }
]

[tool.poetry.packages.three.dependencies]
aiohttp = "^3.5"

[tool.poetry.packages.three.dev-dependencies]
beautifulsoup4 = "^4.8"
aioresponses = "^0.6"
pytest-asyncio = "^0.10"
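To make the naming rule from the comments above concrete, here is a minimal sketch of how distribution names could be derived from the proposed sections. It is only an illustration: the `distribution_names` helper and the hand-written `packages` dict (standing in for the parsed `[tool.poetry.packages.*]` tables) are assumptions, not part of poetry.

```python
from typing import Dict


def distribution_names(project_name: str, packages: Dict[str, dict]) -> Dict[str, str]:
    """Map each sub-package section key to its distribution name.

    An explicit "name" wins; otherwise fall back to
    "<project name>-<section key>", as described in the proposal.
    """
    return {
        key: section.get("name", f"{project_name}-{key}")
        for key, section in packages.items()
    }


# Hypothetical stand-in for the parsed [tool.poetry.packages.*] tables.
packages = {
    "one": {"name": "namespace-package-one"},
    "two": {},
    "three": {"requires": ["one"]},
}

names = distribution_names("namespace-package", packages)
print(names["two"])
```

With this rule, "two" and "three" get their names without repeating the project name in every section.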

Under this scenario, the following might be what the cli commands look like. Current behaviour will remain unaltered as these are additive changes.

$ poetry add --package one <dependency>
.. <similar to current add output>

$ poetry packages list
namespace-package-one
namespace-package-two
namespace-package-three

$ poetry build
<builds all three packages>

$ poetry build --package one
<builds only namespace-package-one>

$ poetry publish --dry-run
...
Publishing namespace-package-one (1.0.0-alpha.0) to PyPI
  - Uploading namespace-package-one-1.0.0-alpha.0.tar.gz
  - Uploading namespace-package-one-1.0.0-alpha.0-py3-none-any.whl

Publishing namespace-package-two (1.0.0-alpha.0) to PyPI
  - Uploading namespace-package-two-1.0.0-alpha.0.tar.gz
  - Uploading namespace-package-two-1.0.0-alpha.0-py3-none-any.whl

Publishing namespace-package-three (1.0.0-alpha.0) to PyPI
  - Uploading namespace-package-three-1.0.0-alpha.0.tar.gz
  - Uploading namespace-package-three-1.0.0-alpha.0-py3-none-any.whl

Variations

The above is an initial thought of how it might work. That said, there are variations to this that should be discussed.

  1. Does a per-package dev-dependency section make sense? This only really makes sense if we want to allow for developing a single package at a time. However, this will become tricky in cases like here where "three" depends on "one". This will mean that when developing "three", dev dependencies for "one" should also be installed. If isolation is required, then multiple virtual environments will be required, which might be overkill for the majority of use cases for this feature.

  2. Will all packages be installed under PEP-0517? Is it even possible to install only a specific package when being installed under PEP-0517? One possible solution might be to make use of "extras" here as a way of specifying which package, if any, to install, but default to all.
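For the first variation, a rough sketch of what "installing dev dependencies for three also pulls in those of one" could mean in practice. The `dev_dependencies_for` helper and the dict layout are hypothetical; the dict mirrors the example pyproject.toml by hand and no poetry API is used.

```python
from typing import Dict, Set


def dev_dependencies_for(target: str, packages: Dict[str, dict]) -> Set[str]:
    """Collect dev dependencies for `target` and every sub-package it requires."""
    seen: Set[str] = set()
    collected: Set[str] = set()
    stack = [target]
    while stack:
        key = stack.pop()
        if key in seen:
            continue  # also guards against accidental cycles in "requires"
        seen.add(key)
        section = packages[key]
        collected.update(section.get("dev-dependencies", {}))
        stack.extend(section.get("requires", []))
    return collected


# Hypothetical stand-in for the parsed sub-package sections.
packages = {
    "one": {"dev-dependencies": {"pytest-mock": "^2.0"}},
    "two": {"dev-dependencies": {"pytest-postgresql": "^2.3.0"}},
    "three": {
        "requires": ["one"],
        "dev-dependencies": {
            "beautifulsoup4": "^4.8",
            "aioresponses": "^0.6",
            "pytest-asyncio": "^0.10",
        },
    },
}

print(sorted(dev_dependencies_for("three", packages)))
```

Developing "two" in this sketch only needs pytest-postgresql, while "three" ends up with its own dev dependencies plus pytest-mock from "one".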

Extensions

  1. Optional Project Package: As an extension to this, one might also want to optionally distribute a namespace-only package namespace-package (let's call this the "project package" for now) that installs the core dependencies and also allows for "extras" as we do today, without requiring the distribution of the entire source tree with the binary distribution.

This means that if someone does pip install namespace-package, the maintainer might expect the following to be installed:

  1. The namespace namespace.package.
  2. Packages namespace-package-one and namespace-package-three, which are required for the "default" install.

An end-user can also install the remaining package, like so - pip install namespace-package[two] which simply will install a dependency namespace-package-two.

This behaviour might not be desired in all cases, and can be considered opt-in.

abn avatar Apr 05 '20 19:04 abn

I recently went through converting over a mono repo with several packages over to poetry, and thought it might be useful to share what we did, and pain points and bug work arounds. Although also recognizing this proposal would hopefully make it all obsolete :-) Still this might provide some utility to those who want to do mono repos prior to native support in poetry.

first a few context/caveats, we don't use namespace packages vs a common prefix, and our fs layout is little different. that's non material to the techniques used, but perhaps relevant to the proposal.

main_pkg
tools/
   pkg_1
   pkg_2
   pkg_3
   ...  

at the moment all the packages under tools have dependencies on the main package declared as a path based dev-dependency.

[tool.poetry.dev-dependencies]
# setup in tree as a dev dependency
c7n = {path = "../..", develop = true}

attempting to resolve it as a normal dependency caused a few issues with poetry build (issues #2046, partial fix #2047, also reported/pr by others).

so using as a dev dependency worked but also meant not using poetry directly as a build/publish tool to work around those issues and still needed the injection of the main_pkg as a regular project dep when publishing. we ended up using poetry metadata/api to generate setup/requirements for that purpose, converting dev dependencies to regular dependencies in the process. https://github.com/cloud-custodian/cloud-custodian/blob/master/tools/dev/poetrypkg.py#L121

unrelated to multi-project, but to the generation workaround, we ran into another issue in that the masonry sdist builder didn't really support markdown readmes (pr #1994)

for handling ergonomics simplicity around multiple commands that needed to update versions/ or release, we added in makefile targets to frontend,

pkg-update:
	poetry update
	for pkg in $(PKG_SET); do cd $$pkg && poetry update && cd ../..; done

One interesting consequence of source directory dependencies in poetry is that they break any attempt to distribute/publish a package, even if they are dev deps. i.e. per the pyproject.toml spec and the build-system PEP, poetry will be invoked during install. The invocation/installation of poetry as a build system is transparently handled by pip. Simple resolution/parsing of pyproject.toml dev dependencies will cause a poetry failure for a source distribution install, as installation of an sdist is actually a wheel compilation.

As a result of this publishing limitation we only publish wheels instead of sdists, which avoids the build system entirely, as a wheel is an extractable installation container/format file.

we're also maintaining compatibility with tox/setuptools ecosystem for compatibility with developer workflows, there's a few more details on what we did here https://cloudcustodian.io/docs/developer/packaging.html

kapilt avatar Apr 18 '20 09:04 kapilt

@kapilt thank you for writing that up. It is extremely useful and insightful.

abn avatar Apr 18 '20 09:04 abn

This proposal is valuable. As it is, poetry supports optional dependencies, but not optional packages

  • https://python-poetry.org/docs/pyproject/#packages

The use of optional packages for a namespace project is really useful. :+1: for including the optional-package as part of this proposal.

dazza-codes avatar Jul 10 '20 19:07 dazza-codes

shared dependencies are very useful, but might make sense to inherit some of the logic from Maven regarding the shared block:

  1. Allow definitions of dependencies and versions in a shared block
  2. Only pull them into the package if that dependency name is explicitly used in the package's dependencies. In this way we can define standard versions for certain dependencies across all packages, but not require all packages to install those packages at those versions. (Can be overridden in the package dependency block.)

while it does complicate things the benefits are:

  1. No unneeded dependencies in modules of a multi-module project.
  2. When multiple but not all packages have the same dependency, we can define the version once, but still explicitly pull the dep.
  3. Enabling overrides for versions for certain modules can be very useful and get people out of some hairy situations.
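A small sketch of the Maven-style resolution described above, under stated assumptions: `resolve_dependencies` is a hypothetical helper, `None` is an invented marker for "use the shared version", and the version constraints other than ujson are made up for illustration.

```python
from typing import Dict, Optional


def resolve_dependencies(
    managed: Dict[str, str],
    declared: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Resolve a sub-package's dependencies against a shared version table.

    `declared` maps a dependency name to a constraint, or to None to mean
    "use whatever the shared block says". Shared entries that the package
    does not name are never pulled in.
    """
    resolved = {}
    for name, constraint in declared.items():
        if constraint is not None:
            resolved[name] = constraint  # explicit per-package override
        elif name in managed:
            resolved[name] = managed[name]
        else:
            raise KeyError(f"{name!r} has no version here or in the shared block")
    return resolved


# Illustrative shared block; only ujson's constraint comes from the proposal.
shared = {"numpy": "^1.19", "requests": "^2.24", "ujson": "^1.35"}
package_one = resolve_dependencies(shared, {"ujson": None, "numpy": "^1.18"})
print(package_one)
```

Here requests stays out of the package because it is never named, ujson inherits the shared version, and numpy is overridden locally.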

djerraballi avatar Oct 05 '20 06:10 djerraballi

This proposal is really valuable! I wonder what's the latest status of this? Is this currently being worked on? I would love to devote some time to speed up the process if possible.

xinbinhuang avatar Jan 17 '21 21:01 xinbinhuang

are we there yet ?

patrickelectric avatar Jun 09 '21 13:06 patrickelectric

Any updates? Really see some value for this!!!

johnwalz97 avatar Jun 25 '21 05:06 johnwalz97

Unfortunately there is nothing new about this yet, but I found a monorepo manager called Bazel, which is widely used and supports many languages. If your goal is to work only with Python, Pants Build might be an easier solution.

mrlucasrib avatar Jun 26 '21 02:06 mrlucasrib

Maybe we could have pyproject.toml in each of the subprojects. Then, add a poetry plugin that coordinates updating dependencies between the top-level pyproject.toml and the children. That might mean adding a setting in each of the pyproject.toml to say whether they are a parent or a child.

cognifloyd avatar Jul 23 '21 16:07 cognifloyd

Could the new dependency groups be leveraged in a way to achieve this proposal? I have some intuition it could, but not sure how

woile avatar Aug 01 '21 16:08 woile

Maybe we could have pyproject.toml in each of the subprojects. Then, add a poetry plugin that coordinates updating dependencies between the top-level pyproject.toml and the children. That might mean adding a setting in each of the pyproject.toml to say whether they are a parent or a child.

This is what Maven does. Here's an example on how Maven provides the capability to share modules between projects: https://www.baeldung.com/maven-multi-module

klDen avatar Nov 02 '21 02:11 klDen

Is it currently possible to have something like what yarn workspaces does?

So having pyproject.toml in each package but they can share dependencies from a root virtual environment.

I've been trying with a single pyproject.toml and using packages, optional and extras but it becomes very handheld - would be easier to define dependencies in the subpackages in their own pyproject.toml (I hope it makes sense)

NixBiks avatar Dec 17 '21 14:12 NixBiks

Have you tried to add a path dependency to another project/folder, which contains its own pyproject.toml?

So for example;

[tool.poetry.dependencies]
subproject = {path = "subproject", develop = true}

fredrikaverpil avatar Dec 21 '21 20:12 fredrikaverpil

Have you tried to add a path dependency to another project/folder, which contains its own pyproject.toml?

So for example;

[tool.poetry.dependencies]
subproject = {path = "subproject", develop = true}

Yes that'll install my root environment but the subproject still wants its own virtual environment so I end up with a virtual environment for each subproject plus one for the root. I want a single virtual environment to be used by all projects (and keeping a pyproject.toml for each project) - that is how yarn workspaces works AFAIK.

NixBiks avatar Dec 22 '21 08:12 NixBiks

@mr-bjerre I know it's not what you are asking for, but you could try symlinking the two virtual environments together.

If this is a requirement, I would probably go for another solution altogether. For example use pip-tools to manage deps (can use multiple input files, one from each project) and twine or flit to publish.

fredrikaverpil avatar Dec 22 '21 16:12 fredrikaverpil

As discussed on Discord, this would be of a huge help to our team. Any progress/time-estimate on implementation? Thank you.

AdamJel avatar May 16 '22 13:05 AdamJel

As discussed on Discord, this would be of a huge help to our team. Any progress/time-estimate on implementation? Thank you.

Right now team is focused on getting 1.2 released. This could be something to ship as next "big" feature (like groups and plugins in 1.2). However, right now there is no estimation on when this is gonna be added. This is also something that might be added as 3rd party plugin after 1.2 is released.

Secrus avatar May 16 '22 14:05 Secrus

Have you tried to add a path dependency to another project/folder, which contains its own pyproject.toml? So for example;

[tool.poetry.dependencies]
subproject = {path = "subproject", develop = true}

Yes that'll install my root environment but the subproject still wants its own virtual environment so I end up with a virtual environment for each subproject plus one for the root. I want a single virtual environment to be used by all projects (and keeping a pyproject.toml for each project) - that is how yarn workspaces works AFAIK.

You can do that by creating a local config (poetry.toml) for each of your sub-packages with virtualenvs.create false. You then mark the sub-packages as dependencies as suggested by @fredrikaverpil and include them as packages.
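As a sketch of the first step, the script below writes a local poetry.toml with `virtualenvs.create` set to false into each sub-package directory. The `disable_subproject_venvs` helper and the directory names passed to it are hypothetical, and note that it overwrites any existing poetry.toml in those directories.

```python
from pathlib import Path
from typing import Iterable, List

# Local poetry config that turns off virtualenv creation for a project.
LOCAL_CONFIG = "[virtualenvs]\ncreate = false\n"


def disable_subproject_venvs(root: Path, subprojects: Iterable[str]) -> List[Path]:
    """Write a poetry.toml next to each sub-project's pyproject.toml."""
    written = []
    for name in subprojects:
        project_dir = root / name
        if not (project_dir / "pyproject.toml").is_file():
            continue  # not a poetry project, leave it alone
        config_path = project_dir / "poetry.toml"
        config_path.write_text(LOCAL_CONFIG)  # overwrites an existing file
        written.append(config_path)
    return written
```

Calling `disable_subproject_venvs(Path("."), ["subproject"])` from the repository root would cover the single sub-package from the example above.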

ljnsn avatar Oct 13 '22 08:10 ljnsn

Is there any way for me to also install my dev-dependencies using this central pyproject.toml ?

In the sense I get subproject = {path = "subproject", develop = true} installs my package But I also want the dev-dependencies of subproject to be installed

Note: I am talking about tool.poetry.group.dev.dependencies not tool.poetry.extras

AbdealiLoKo avatar Nov 15 '22 14:11 AbdealiLoKo

This is not a feature that currently exists. We likely will not support leaking dev-dependencies over path relationships; the design discussed in this issue is using a super-pyproject.toml instead of linking individual projects together.

neersighted avatar Nov 15 '22 15:11 neersighted

We also wanted to have such feature because it will help us pin the right version for the right libraries. This is because some version of the dependencies may not work on some versions (cough cough numpy).

So, although we can duplicate all the dependencies in multiple projects and let it be, this could create a subtle portability hell with regards to interoperability on different versions of the same library. Without having a master project that defined all the dependencies, this will be quite difficult to manage to say the least. We have over 100 packages and we can't afford to manually inspect each other either.

It seems like Cargo did it pretty well by pinning the version on subproject members to depend on one master cargo.

wizpresso-steve-cy-fan avatar Dec 21 '22 07:12 wizpresso-steve-cy-fan

I've written a blogpost and demo repo where I demonstrate how poetry can (quite easily) be used in a mono repo with subpackages. Perhaps the utility scripts in it can help you. Blogpost: https://gerben-oostra.medium.com/python-poetry-mono-repo-without-limitations-dd63b47dc6b8 Repo: https://gitlab.com/gerbenoostra/poetry-monorepo/

gerbenoostra avatar Jan 09 '23 12:01 gerbenoostra

Hey folks, I've written up a proposal for monorepo support using path dependencies and dependency groups, all existing features of Poetry: https://github.com/python-poetry/poetry/issues/6850. There's an example repo at https://github.com/adriangb/python-monorepo/tree/main/poetry with more details.

The pattern is quite functional already, I've been using it in production for several months now. The only things I think are missing are:

  • #6845 for caching
  • A plugin to reduce the boilerplate of managing dependency groups, folders, extras, etc. I think this can happen outside of Poetry itself at least until it's stable.
  • A story for publishing wheels for subpackages with path dependencies (I think this is tangential to basic monorepo support and can be figured out separately; there's an open issue, I can try to find it later).

I'd like to understand which use cases that covers or doesn't, and have folks who have tried this or similar things poke holes in the proposal to make sure it's solid.

adriangb avatar Feb 26 '23 17:02 adriangb

If you want https://github.com/python-poetry/poetry/issues/2270#issuecomment-1445417107 to happen (or have objections) please chime in on the linked issue. I see a total of 18 👍 or equivalent but sadly only one of you has chimed in on #6850

adriangb avatar Mar 04 '23 23:03 adriangb

fwiw, just an update to my previous comment, https://github.com/python-poetry/poetry/issues/2270#issuecomment-615809216 to support both mono repo and frozen wheels (version spec switch to ==version), I went ahead and moved to a poetry plugin (freeze) that also handles resolving path dev dependencies. it operates effectively as a post build tool / pre publish tool directly against the wheel. it's pretty early (ie. functional, but no tests, cli options) but I'm hoping to get those fleshed out so we can use it for prod releases against a mono repo this month. https://github.com/cloud-custodian/poetry-plugin-freeze

kapilt avatar Mar 06 '23 16:03 kapilt

Anything new on this? https://github.com/python-poetry/poetry/issues/2270#issuecomment-1445417107 Seems to nearly solve the problem despite the distribution (packaging) issue

MateoSaezMata avatar Oct 25 '23 18:10 MateoSaezMata

Anything new on this? #2270 (comment) Seems to nearly solve the problem despite the distribution (packaging) issue

Not sure. Just coming across all of this for the first time. So I am looking forward to it!

luketych avatar Feb 02 '24 02:02 luketych

I've been testing out some monorepo approaches and started with @adriangb's approach here. DX is the main issue -- challenges with this approach surround poetry run from a subproject - as mentioned here.

poetry.lock and .venv seem to be the main pain points here -- and the workarounds mentioned involve keeping poetry.lock in sync (or .gitignored). Custom scripts have been implemented in a variety of solutions, including @gerbenoostra's here, to accommodate the extra lockfiles.

It would be nice to:

  • Support poetry run (or poetry subproject run if defined as a plugin) within a subproject, without requiring a lockfile or venv in the subproject
  • Support poetry add (or poetry subproject add) within a subproject, without requiring a lockfile or venv within the subproject -- this command would update the parent lockfile
  • Support poetry remove (similar to add)

Ideally, poetry or the plugin could find the root lockfile/pyproject.toml, or there could be some way that the developer specifies it. This would lead to a similar experience to cargo, yarn, and npm.
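A minimal sketch of how such root discovery could work, assuming the root is simply the nearest ancestor directory that contains a poetry.lock. The `find_workspace_root` name and the marker choice are assumptions, not an existing poetry or plugin API.

```python
from pathlib import Path
from typing import Optional


def find_workspace_root(start: Path, marker: str = "poetry.lock") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`."""
    current = start.resolve()
    for directory in (current, *current.parents):
        if (directory / marker).is_file():
            return directory
    return None  # no marker found anywhere up to the filesystem root
```

A subproject command could call this with its own directory and then operate on the lockfile and pyproject.toml it finds there, falling back to an explicit setting when the result is None.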

davidroeca avatar Feb 10 '24 02:02 davidroeca

I think a plugin would solve all of those issues and should be doable. I haven’t written one just because the DX isn’t bad enough for me to justify spending time on it. And I usually don’t end up running poetry … from a subproject, most things happen from the top level Makefile.

adriangb avatar Feb 10 '24 02:02 adriangb

@adriangb what is your solution for replacing path dependencies with regular ones when publishing? So far, I'm using @gerbenoostra sh script. I wonder if there is any poetry plugin support for this. I also checked https://github.com/DavidVujic/poetry-multiproject-plugin but don't think that approach works for the path rewrite use case. Cfr this issue.
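For readers unfamiliar with the path rewrite being asked about, here is an illustrative sketch of the general idea: path dependencies are swapped for exact version pins before publishing. This is not how the script or plugins mentioned above work internally; `pin_path_dependencies` and the `versions` mapping are hypothetical, and other keys on the path entry (such as extras) are simply dropped.

```python
from typing import Dict, Union

Constraint = Union[str, dict]


def pin_path_dependencies(
    dependencies: Dict[str, Constraint],
    versions: Dict[str, str],
) -> Dict[str, Constraint]:
    """Replace {path = ...} entries with an exact version pin."""
    rewritten = {}
    for name, constraint in dependencies.items():
        if isinstance(constraint, dict) and "path" in constraint:
            # Hypothetical: the sibling's version is looked up by name.
            rewritten[name] = f"=={versions[name]}"
        else:
            rewritten[name] = constraint
    return rewritten


deps = {"python": "^3.8", "subproject": {"path": "subproject", "develop": True}}
print(pin_path_dependencies(deps, {"subproject": "1.0.0"}))
```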

tnielens avatar Mar 08 '24 14:03 tnielens