python private registry settings with replaces-base and path prefixes are ignored

Open bewinsnw opened this issue 1 year ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Package ecosystem

pip

Package manager version

24.0

Language version

python 3.9

Manifest location and content before the Dependabot update

No response

dependabot.yml content

version: 2
registries:
  python-artificatory:
    type: python-index
    url: https://artifactory.internal/artifactory/api/pypi/pypi/simple/
    replaces-base: true
updates:
  - package-ecosystem: "pip"
    # dependabot will not run without this
    # https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#insecure-external-code-execution
    insecure-external-code-execution: allow
    directory: "/"
    registries:
      - python-artificatory
    schedule:
      interval: "daily"

Updated dependency

urllib3 1.26.18 to 2.2.1 (one example, but this affects all libraries, see below)

What you expected to see, versus what you actually saw

What I see in the log is this

updater | 2024/05/18 10***57***21 INFO <job_829837253> Updating urllib3 from 1.26.18 to 2.2.1
  proxy | 2024/05/18 10***57***22 [075] GET https***//artifactory.internal***443/pypi/urllib3/json
  proxy | 2024/05/18 10***57***22 [075] 404 https***//artifactory.internal***443/pypi/urllib3/json
...
Dependabot encountered '5' error(s) during execution, please check the logs for more details.
+---------------------------------------------------+
|           Dependencies failed to update           |
+------------------+--------------------------------+
| types-cachetools | dependency_file_not_resolvable |
| boto3            | dependency_file_not_resolvable |
| cachetools       | dependency_file_not_resolvable |
| urllib3          | dependency_file_not_resolvable |
| botocore         | dependency_file_not_resolvable |
+------------------+--------------------------------+

The URL that 404s should never be fetched: the /pypi path does not exist on the configured private registry. The bug is in this code: https://github.com/dependabot/dependabot-core/blob/main/python/lib/dependabot/python/update_checker.rb#L267-L276. It assumes that libraries are always on PyPI, which is not true when private registries are in use, and the check should not run at all when replaces-base is enabled. Because of replaces-base, the host of the request was changed to the private registry, but the path being fetched is still wrong.
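To make the mismatch concrete, here is a minimal Python sketch of the two URL shapes. It is not Dependabot's code: `host_only_rewrite` is a hypothetical reconstruction of what the log suggests happened (registry host kept, path prefix dropped), and `prefix_preserving_url` is the shape I would expect instead. Only the index URL comes from the config above.

```python
from urllib.parse import urlsplit, urlunsplit

# Index URL taken from the dependabot.yml above.
INDEX_URL = "https://artifactory.internal/artifactory/api/pypi/pypi/simple/"


def host_only_rewrite(index_url: str, package: str) -> str:
    """Hypothetical reconstruction of the failing request: keep only the
    scheme and host of the registry, then append a PyPI-style JSON path."""
    parts = urlsplit(index_url)
    return urlunsplit((parts.scheme, parts.netloc, f"/pypi/{package}/json", "", ""))


def prefix_preserving_url(index_url: str, package: str) -> str:
    """Request that keeps the configured path prefix (PEP 503 layout)."""
    return f"{index_url.rstrip('/')}/{package}/"


if __name__ == "__main__":
    print(host_only_rewrite(INDEX_URL, "urllib3"))
    print(prefix_preserving_url(INDEX_URL, "urllib3"))
```

The first function drops `/artifactory/api/pypi/pypi/simple/` entirely, which matches the path seen in the 404 log lines.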

The private registry in this case only supports Simple (PEP 503) fetches, as used by latest_version_finder in Dependabot. The API call made here is for a Simple-JSON (PEP 691) index file, so even if the path were corrected to include the right base prefix, the request would still 404.

The correct fix would be for this code to reuse latest_version_finder, which already has the logic needed, instead of making its own registry lookups; all it needs to do is check whether the index file exists. If it did that, the log would look like this:

updater | 2024/05/18 10***57***21 INFO <job_829837253> Updating urllib3 from 1.26.18 to 2.2.1
  proxy | 2024/05/18 10***57***22 [075] GET https***//artifactory.internal***443/artifactory/api/pypi/pypi/simple/urllib3/
  proxy | 2024/05/18 10***57***22 [075] 200 GET https***//artifactory.internal***443/artifactory/api/pypi/pypi/simple/urllib3/
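The existence check described above could be sketched roughly like this in Python. This is an illustration of the idea, not what latest_version_finder actually does internally: the function names and the injectable `opener` parameter are my own, and only the name normalization rule is taken from PEP 503.

```python
import re
import urllib.error
import urllib.request


def normalize(name: str) -> str:
    """PEP 503 name normalization: runs of '-', '_' and '.' become one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_page_url(index_url: str, name: str) -> str:
    """Project page URL under a Simple index, keeping the index path prefix."""
    return f"{index_url.rstrip('/')}/{normalize(name)}/"


def exists_on_index(index_url: str, name: str, opener=urllib.request.urlopen) -> bool:
    """Return True if the project page exists on the index.

    `opener` is a hypothetical injection point so the check can be tested
    without network access; it defaults to urllib.request.urlopen.
    """
    try:
        with opener(simple_page_url(index_url, name)) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other failures (auth, 5xx) are not "package missing"
```

Treating only 404 as "not on this index" keeps authentication or server errors from being misread as a missing package.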

Native package manager behavior

The native package managers don't make PEP 691 requests without content negotiation (see https://peps.python.org/pep-0691/#content-types), and they also respect the path prefix for index URLs.
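For reference, content negotiation here means asking for JSON through the Accept header while still accepting the HTML form, against the same project URL. A minimal Python sketch, assuming the header shape recommended in PEP 691; `build_index_request` is a name I made up, and the request is only constructed, not sent:

```python
import urllib.request

# Accept header in the shape PEP 691 recommends: prefer the JSON form,
# fall back to the versioned HTML form, then plain text/html.
ACCEPT = ", ".join(
    [
        "application/vnd.pypi.simple.v1+json",
        "application/vnd.pypi.simple.v1+html;q=0.2",
        "text/html;q=0.01",
    ]
)


def build_index_request(index_url: str, project: str) -> urllib.request.Request:
    """Build (but do not send) a Simple index request for one project.

    The URL is the same for PEP 503 and PEP 691; only the Accept header
    changes, so a PEP 503-only server can still answer with HTML.
    """
    url = f"{index_url.rstrip('/')}/{project}/"
    return urllib.request.Request(url, headers={"Accept": ACCEPT})
```

A server that only speaks PEP 503 can ignore the JSON preference and return HTML, which is why this style of request does not 404 against a registry like the one above.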

Images of the diff or a link to the PR, issue, or logs

No response

Smallest manifest that reproduces the issue

No response

bewinsnw, May 18 '24 12:05

Updating my description of this: the call at https://github.com/dependabot/dependabot-core/blob/main/python/lib/dependabot/python/update_checker.rb#L267-L276 isn't a PEP 691 request. It's trying to use the Warehouse JSON API (https://warehouse.pypa.io/api-reference/json.html), which I think is only supported by PyPI and Warehouse?

Anyway, given that dependabot seems to require a very limited subset of this (https://github.com/dependabot/dependabot-core/blob/55febaf5a8a5f3c4025c2aed5634811b3fa61064/python/lib/dependabot/python/metadata_finder.rb#L19-L47) I'm going to hack my repo to serve up those bits...
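A rough Python sketch of what such a stub could look like. Everything here is an assumption for illustration: the `/pypi/<name>/json` route and the top-level `info` object follow the Warehouse JSON API shape, but which fields Dependabot actually reads should be checked against the linked metadata_finder.rb lines. `PROJECTS`, `render_stub`, and the example URL are hypothetical placeholders.

```python
import json
import re
from typing import Optional

# Route shape used by the Warehouse JSON API: /pypi/<project_name>/json
_JSON_PATH = re.compile(r"^/pypi/(?P<name>[^/]+)/json/?$")

# Hypothetical per-project metadata; the URL is a placeholder, not real data.
PROJECTS = {
    "urllib3": {"home_page": "https://example.invalid/urllib3"},
}


def render_stub(path: str, projects: dict) -> Optional[str]:
    """Return a minimal JSON body for a known project, or None for a 404.

    The fields under "info" are an assumed subset; verify them against
    what Dependabot's metadata finder reads before relying on this.
    """
    match = _JSON_PATH.match(path)
    if match is None:
        return None
    name = match.group("name").lower()
    meta = projects.get(name)
    if meta is None:
        return None
    body = {
        "info": {
            "name": name,
            "home_page": meta.get("home_page"),
            "project_urls": meta.get("project_urls", {}),
            "description": meta.get("description", ""),
        },
        "releases": {},
    }
    return json.dumps(body)
```

The function is kept independent of any web framework so it can sit behind whatever the registry host already uses to serve static or rewritten paths.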

bewinsnw, May 30 '24 12:05