
Added `--scoped-index-url` command line option

Open ffissore opened this issue 2 years ago • 8 comments

It allows pip users to ensure that the specified package names (or prefixes of package names) are downloaded exclusively from the specified indexes. Example syntax: pip install companyname-package-name==1.0.0 --scoped-index-url companyname:http://url/of/private/index

See #8606

ffissore avatar Jun 05 '22 14:06 ffissore

@pfmoore and @uranusjr, since you were involved in #8606 and #10964, you might want to take a look at this. Please consider that my knowledge of the pip code base is very limited: if you like the approach but require more changes (test coverage or anything else), let me know and I'll do as much as I can.

Tests are failing: I'll fix them ASAP

ffissore avatar Jun 05 '22 14:06 ffissore

This would need tests and documentation as well, but don’t rush into those. At the moment there is not any sort of consensus that this feature is a good idea, or that it addresses the underlying issues that have been raised.

Before we accept a PR, you’ll need to get support for it as a feature. And personally I’m far from convinced this is the right solution here. Apart from anything else, it seems like it would encourage companies to try to reserve parts of the package namespace for themselves. It also frames the feature as only relevant for corporate users, and I’m strongly of the opinion that pip should ensure that its feature set is equally welcoming to all users, corporate, open source or even individuals.

And of course it’s easy to use a proxying mirror like https://github.com/uranusjr/simpleindex to implement these rules, so you need to explain why this solution is superior to that. As someone who prefers composable components over making pip ever more complex, I’m going to be hard to convince on that point…

Thanks for the PR - it’s much easier to have a discussion based on a concrete proposal. I suggest you now concentrate on putting it into context, explaining how it addresses the various issues that have been raised, and getting support for this approach as opposed to any of the others that have been suggested.

pfmoore avatar Jun 05 '22 16:06 pfmoore

Thank you @pfmoore for your reply: I'll do my homework.

One question: how/where do I find supporters? Should I ask pip maintainers? On a mailing list? Should I ask the community to upvote this PR?

ffissore avatar Jun 05 '22 17:06 ffissore

Honestly that’s the hard part, and I don’t have good answers. For a start, I would consider that none of the participants in #8606, where you first suggested this, took up the idea. Understanding why that is, and getting a consensus for your proposal on that thread, would be a good place to start.

pfmoore avatar Jun 05 '22 18:06 pfmoore

Personally I think I like the scoping approach; this is similar to namespacing support on PyPI, but instead of doing it in the index (i.e. “no-one can publish packages on other indexes”), we do it entirely in the client (i.e. “please ignore packages published on other indexes”). This still leaves a security issue (it’s much too easy to forget the scoping argument), but can be a good intermediate step before index namespacing can be done (which is much more difficult).

I am uncertain about adding a new option; it is too difficult to make sense of its interaction with the existing --index-url and --extra-index-url. It would probably be better if we could invent some sort of syntax to implement scoping directly on those options instead. But like Paul said, this needs some design work, and then a ton of tests, so let’s not rush it.

I would suggest opening an issue to write up the design, including what you intend the option to do, and how it should interact with other options related to package-selecting logic.

uranusjr avatar Jun 06 '22 04:06 uranusjr

Converting this to draft since it is not close to mergeable.

uranusjr avatar Jun 06 '22 06:06 uranusjr

I posted a comment on #8606: I copy-paste it here to let the conversation happen close to the code

How it works

Example usage:

pip install companyname-my-package==1.0.0 --scoped-index-url companyname:http://url/of/private/index

pip will install companyname-my-package from the index located at http://url/of/private/index because its name starts with companyname.

pip install package-name==1.0.0 --scoped-index-url package-name:http://url/of/private/index

pip will install package-name from the index located at http://url/of/private/index because its name matches the one specified with the new param.
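
The two examples above can be illustrated with a minimal sketch. This is not the code from this PR: the function names `parse_scope` and `index_for` are hypothetical, package name normalization is ignored, and choosing the longest matching scope when several apply is an assumption made only for this illustration.

```python
from typing import Dict, Optional, Tuple


def parse_scope(value: str) -> Tuple[str, str]:
    """Split 'SCOPE:URL' on the first colon only, because the URL contains colons too."""
    scope, sep, url = value.partition(":")
    if not sep or not scope or not url:
        raise ValueError(f"expected SCOPE:URL, got {value!r}")
    return scope, url


def index_for(package: str, scopes: Dict[str, str]) -> Optional[str]:
    """Return the scoped index URL for a package, or None if no scope applies."""
    # A scope matches when the package name starts with it (prefix or exact name).
    matches = [scope for scope in scopes if package.startswith(scope)]
    if not matches:
        return None
    # Assumption for this sketch: the longest matching scope wins.
    return scopes[max(matches, key=len)]


scopes = dict([parse_scope("companyname:http://url/of/private/index")])
print(index_for("companyname-my-package", scopes))
print(index_for("requests", scopes))
```

A package that matches no scope returns `None` here, which in this sketch stands for "fall back to the normal index handling".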

Comparisons with other techniques

These are some alternative approaches available today (collected from the conversation at #8606).

  1. Maintain a local mirror/proxy of pypi, that gives precedence to your libraries (devpi, simpleindex, and probably jfrog and sonatype)
  2. Store into your private index the dependencies of your private libraries, then split the requirements files, and then run something like:
       pip install -r public-requirements.txt
       pip install -r private-requirements.txt --index-url=https://my.private.index/
  3. Name squat your private library names on pypi.org
  4. Switch to conda

I think the scoping option is superior for the following reasons (one for each of the alternatives above):

  1. Setting up a pypi mirror/proxy is non-trivial, and it's basically about making your continuous integration tool talk with an external server. Depending on your infrastructure, this might be more or less complex.
  2. Storing the dependencies of your libraries into your private index requires maintaining them (in particular keeping them up-to-date).
  3. Name squatting means that your private library names become public to the world. It may also lead to disputes, as pypi.org maintainers wish to have a clean index (but this is just my opinion)
  4. Switching to conda requires retraining your team

Advantages of the proposed approach

  1. It's a brand new feature, not a change to an existing one. Thus it's backwards compatible.
  2. The code change is minimal: if you don't count tests and the glue code necessary to carry the value of the param down to the code that uses it, it's ~10 lines of code.
  3. The new param can be added to your requirements files: devs won't have to remember it or be trained before benefitting from it.
  4. For those who know nodejs, this approach will look familiar, as it mimics the behaviour of npm scopes

On a side note (and I admit this is wishful thinking), allowing python devs to write private libraries more easily and safely lays the groundwork for sharing them with the world as open-source code.

What to do now

I was explicitly asked to find supporters for my proposal. Please, share your support or your disagreement in any way you prefer: up and downvotes, comments, counter-proposals...

ffissore avatar Jun 15 '22 20:06 ffissore

I think this idea is in the right direction.

PEP 423, although currently deferred, has top level namespaces, whose ownership can belong to organizations, which can be companies.

So configuring namespace COMPANY to only resolve packages from COMPANY-INDEX-URL would be pretty natural.

I would expect this feature to be mostly set in the pip.conf configuration file as default rather than from the command line.
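
As a rough sketch of what that could look like, the snippet below reads a hypothetical `scoped-index-url` key from pip.conf-style text. The key name is an assumption based on the option proposed in this PR, not a setting that released versions of pip understand, and `read_scopes` is a made-up helper that uses only the standard library rather than pip's own configuration code.

```python
import configparser
from typing import Dict

# Hypothetical pip.conf content; "scoped-index-url" is the proposed option,
# written one SCOPE:URL entry per line like other multi-value pip settings.
PIP_CONF = """
[global]
scoped-index-url =
    companyname:http://url/of/private/index
"""


def read_scopes(text: str) -> Dict[str, str]:
    """Collect SCOPE -> URL pairs from the hypothetical scoped-index-url key."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    raw = parser.get("global", "scoped-index-url", fallback="")
    scopes = {}
    for entry in raw.split():
        # Split on the first colon only; the URL keeps its own colons.
        scope, _, url = entry.partition(":")
        scopes[scope] = url
    return scopes


print(read_scopes(PIP_CONF))
```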

eaaltonen avatar Jun 21 '22 12:06 eaaltonen

Going to go ahead and close this, since it has merge conflicts that haven't been resolved in a while. Please feel welcome to file a new PR or to reopen and update this one!

pradyunsg avatar Sep 30 '22 13:09 pradyunsg