Incorrect `repodata_record.json` precedence for explicit installs

Open maresb opened this issue 1 month ago • 1 comments

This follows up on https://github.com/mamba-org/mamba/issues/4052 to describe the problems that remain.

Summary

Since v2.1.1, libmamba has used a "repodata-first" strategy when constructing info/repodata_record.json (introduced in https://github.com/mamba-org/mamba/pull/3901 and intended to let channel repodata patches take precedence over info/index.json). However, in explicit installs, the repodata object is a URL-derived stub, not real channel repodata. Treating that stub as authoritative caused repodata_record.json to be written with incomplete or zeroed fields in v2.1.1 through v2.3.2.

In the v2.3.3 release, https://github.com/mamba-org/mamba/pull/4071 attempted to mitigate this by removing empty depends/constrains from the repodata object so they can be back-filled from index.json. However, this is applied unconditionally, based only on emptiness, with no check that the source actually came from an explicit install. As a result:

  1. Channel repodata patches that intentionally set depends/constrains to the empty list cannot be represented in the local repodata_record.json: the code erases empty lists and refills from index.json, silently undoing the patch.
  2. Some fields in explicit installs remain wrong in v2.3.3 (build_number, license, timestamp, and track_features), keeping the cache divergent from the best-known info.
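The two consequences above can be reproduced with a small model of the merge. This is only a sketch of the behavior as described in this issue, not libmamba's actual code: the function name `merge_v233_style` and the plain-dict representation are assumptions made for illustration.

```python
import copy


def merge_v233_style(repodata: dict, index: dict) -> dict:
    """Hypothetical model of the v2.3.3 merge described above (not libmamba code)."""
    repodata = copy.deepcopy(repodata)

    # Step 1: empty depends/constrains are dropped unconditionally,
    # without checking whether the record came from an explicit install.
    for key in ("depends", "constrains"):
        if repodata.get(key) == []:
            del repodata[key]

    # Step 2: repodata-first merge. index.json only fills keys that are
    # absent from the repodata object; every key still present wins.
    merged = dict(index)
    merged.update(repodata)
    return merged


# Illustrative index.json content (values taken from the PyYAML example).
index_json = {
    "depends": ["python >=3.10.*", "yaml"],
    "license": "MIT",
    "timestamp": 1758891992558,
}

# Case 1: a channel patch that intentionally empties depends is undone.
patched = {"depends": [], "license": "MIT", "timestamp": 1758891992558}
print(merge_v233_style(patched, index_json)["depends"])

# Case 2: a URL-derived stub keeps its placeholder license and timestamp.
stub = {"depends": [], "constrains": [], "license": "", "timestamp": 0}
print(merge_v233_style(stub, index_json))
```

In this model the emptiness check cannot tell a deliberate `[]` from a placeholder `[]`, which is why both symptoms appear together.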

This behavior corrupts the local cache metadata and propagates to tools that rely on repodata_record.json (e.g., conda-lock). In principle, it can lead to incomplete or incorrect lockfiles, as in https://github.com/conda/conda-lock/pull/862 (summary). Fortunately, experimentation shows that https://github.com/mamba-org/mamba/pull/4071 was sufficient to resolve the serious bug involving disappearing dependencies, so this issue is far less critical than https://github.com/mamba-org/mamba/issues/4052.

Expected behavior

  • Normal channel installs: repodata_record.json should mirror the channel record actually used by the solver, including patches, even when a patch intentionally sets depends/constrains to []. No back-fill from index.json should override an explicit channel decision.
  • Explicit installs: writing repodata_record.json should not persist skeletal/placeholder values that zero out fields relative to either channel or artifact metadata.
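One way to express the expected behavior is to branch on where the record came from rather than on whether a list is empty. The sketch below is an assumption-laden illustration, not a proposed patch: `build_record`, the `from_explicit_url` flag, and the `URL_DERIVED` field set are all hypothetical names, and which fields a stub can really be trusted for would have to be checked against libmamba.

```python
# Assumption: fields that a URL-derived stub can legitimately know.
URL_DERIVED = ("url", "fn", "channel", "subdir", "md5", "sha256",
               "name", "version", "build")


def build_record(repodata: dict, index: dict, from_explicit_url: bool) -> dict:
    """Hypothetical sketch of the expected precedence rules."""
    merged = dict(index)

    if not from_explicit_url:
        # Normal channel install: the channel record (patches included) is
        # authoritative. Empty lists are kept as-is, so a patch that sets
        # depends/constrains to [] survives.
        merged.update(repodata)
        return merged

    # Explicit install: index.json is the best-known metadata. Only the
    # fields the stub actually derived from the URL are taken from it.
    for key in URL_DERIVED:
        if key in repodata:
            merged[key] = repodata[key]
    return merged


index_json = {
    "name": "pyyaml",
    "depends": ["python >=3.10.*", "yaml"],
    "license": "MIT",
    "timestamp": 1758891992558,
}

# A channel patch emptying depends is preserved.
patched = dict(index_json, depends=[])
print(build_record(patched, index_json, from_explicit_url=False)["depends"])

# A stub no longer zeroes out license or timestamp.
stub = {"name": "pyyaml", "depends": [], "license": "", "timestamp": 0,
        "url": "https://conda.anaconda.org/conda-forge/noarch/pyyaml-6.0.3-pyh7db6752_0.conda"}
print(build_record(stub, index_json, from_explicit_url=True))
```

The design point is that provenance, not emptiness, decides which source wins.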

Concrete example (PyYAML)

As a good baseline, v2.1.0 wrote a more complete repodata_record.json for this package:

{
    "arch": null,
    "build": "pyh7db6752_0",
    "build_number": 0,
    "build_string": "pyh7db6752_0",
    "channel": "conda-forge",
    "constrains": [],
    "depends": [
        "python >=3.10.*",
        "yaml"
    ],
    "fn": "pyyaml-6.0.3-pyh7db6752_0.conda",
    "license": "MIT",
    "license_family": "MIT",
    "md5": "b12f41c0d7fb5ab81709fcc86579688f",
    "name": "pyyaml",
    "noarch": "python",
    "platform": null,
    "size": 45223,
    "subdir": "noarch",
    "timestamp": 1758891992558,
    "track_features": "pyyaml_no_compile",
    "url": "https://conda.anaconda.org/conda-forge/noarch/pyyaml-6.0.3-pyh7db6752_0.conda",
    "version": "6.0.3"
}

Since v2.3.3, here is the result for the same artifact after an explicit install:

{
    "arch": null,
    "build": "pyh7db6752_0",
    "build_number": 0,
    "build_string": "pyh7db6752_0",
    "channel": "conda-forge",
    "depends": [
        "python >=3.10.*",
        "yaml"
    ],
    "fn": "pyyaml-6.0.3-pyh7db6752_0.conda",
    "license": "",
    "license_family": "MIT",
    "md5": "b12f41c0d7fb5ab81709fcc86579688f",
    "name": "pyyaml",
    "noarch": "python",
    "platform": null,
    "size": 45223,
    "subdir": "noarch",
    "timestamp": 0,
    "track_features": "",
    "url": "https://conda.anaconda.org/conda-forge/noarch/pyyaml-6.0.3-pyh7db6752_0.conda",
    "version": "6.0.3"
}

Diff:

@@ -4,13 +4,12 @@
     "build_number": 0,
     "build_string": "pyh7db6752_0",
     "channel": "conda-forge",
-    "constrains": [],
     "depends": [
         "python >=3.10.*",
         "yaml"
     ],
     "fn": "pyyaml-6.0.3-pyh7db6752_0.conda",
-    "license": "MIT",
+    "license": "",
     "license_family": "MIT",
     "md5": "b12f41c0d7fb5ab81709fcc86579688f",
     "name": "pyyaml",
@@ -18,8 +17,8 @@
     "platform": null,
     "size": 45223,
     "subdir": "noarch",
-    "timestamp": 1758891992558,
-    "track_features": "pyyaml_no_compile",
+    "timestamp": 0,
+    "track_features": "",
     "url": "https://conda.anaconda.org/conda-forge/noarch/pyyaml-6.0.3-pyh7db6752_0.conda",
     "version": "6.0.3"
 }

Even after https://github.com/mamba-org/mamba/pull/4071, license, timestamp, and track_features are still clobbered in explicit installs, and constrains disappears. While not demonstrated by this diff, build_number is also set to zero.
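To check whether a given extracted package in the package cache is affected, one can compare info/repodata_record.json against info/index.json for the fields named above. The following is a minimal sketch: the helper names `clobbered_fields` and `check_extracted_package` are hypothetical, and it assumes both files sit in the package's info/ directory. A difference is only suspicious for explicit installs, since channel patches may legitimately diverge from index.json.

```python
import json
from pathlib import Path

# Fields this issue reports as affected.
FIELDS = ("build_number", "license", "timestamp", "track_features",
          "depends", "constrains")


def clobbered_fields(record: dict, index: dict, fields=FIELDS) -> dict:
    """Return {field: (record_value, index_value)} where the two disagree.

    Only fields present in index.json are compared; a field missing from
    the record is reported with None as its record value.
    """
    diffs = {}
    for field in fields:
        if field in index and record.get(field) != index[field]:
            diffs[field] = (record.get(field), index[field])
    return diffs


def check_extracted_package(pkg_dir) -> dict:
    """Compare the two metadata files inside one extracted package directory."""
    info = Path(pkg_dir) / "info"
    record = json.loads((info / "repodata_record.json").read_text())
    index = json.loads((info / "index.json").read_text())
    return clobbered_fields(record, index)


# Illustrative values taken from the diff above.
record = {"build_number": 0, "license": "", "timestamp": 0, "track_features": ""}
index = {"build_number": 0, "license": "MIT", "timestamp": 1758891992558,
         "track_features": "pyyaml_no_compile"}
for field, (got, expected) in clobbered_fields(record, index).items():
    print(f"{field}: record={got!r} index={expected!r}")
```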

maresb · Nov 07 '25 20:11