fix: correct repodata_record.json metadata for explicit installs (#4095)
⚠️ Important Disclosure
I do not know C++. This PR was extremely heavily LLM-assisted (Claude in Cursor). As a result, please review this PR skeptically and critically. I have spent significantly more time than typical attempting to verify correctness to the extent I'm able.
Description
This PR fixes #4095: incorrect repodata_record.json precedence for explicit installs.
Background
In v2.1.1, libmamba switched to "repodata-first" when constructing info/repodata_record.json (intended to honor channel repodata patches over info/index.json). However, for explicit installs (URL-based), the "repodata" object is a URL-derived stub containing placeholder/zero values, not real channel repodata. Making that stub authoritative caused repodata_record.json to be written with incomplete/zeroed fields.
Affected versions:
- v2.1.1–v2.3.2: Full corruption (all stub fields written)
- v2.3.3–v2.4.0: Partial fix via #4071 (depends/constrains fixed, but license, timestamp, build_number, track_features still corrupted)
The v2.3.3 partial fix (#4071) removed empty depends/constrains arrays so they could be backfilled from index.json. However, this approach has two problems:
- Channel patches undone: Patches that intentionally set depends/constrains to [] are silently overwritten by index.json values
- Other fields still wrong: build_number, license, timestamp, and track_features remained corrupted
Solution
This PR leverages the existing defaulted_keys field (originally introduced in PR #1120 for signature verification, but no longer populated since the 2023 libsolv wrapper refactor) to distinguish URL-derived stub values from authoritative solver/channel values:
- URL-derived packages (from_url()): defaulted_keys lists stub field names (e.g., ["_initialized", "license", "timestamp", ...]). These fields are erased before merging with index.json, allowing correct values to be filled in.
- Solver-derived packages (make_package_info()): defaulted_keys = ["_initialized"] only. The sentinel signals "trust ALL fields": nothing is erased, preserving channel patches including intentionally empty arrays.
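To make the erase-then-merge order concrete, here is a minimal sketch. It is not the code in this PR: Record is a simplified string map standing in for the real JSON object, and merge_record is a hypothetical helper name used only for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumption: a flat string map stands in for the JSON record used in libmamba.
using Record = std::map<std::string, std::string>;

// Hypothetical helper: erase every field named in defaulted_keys, then
// backfill whatever is missing from index.json.
Record merge_record(Record repodata,
                    const std::vector<std::string>& defaulted_keys,
                    const Record& index)
{
    for (const auto& key : defaulted_keys)
    {
        // Erasing a key that is not present (such as the sentinel) is a no-op.
        repodata.erase(key);
    }
    // Range insert never overwrites existing entries, so fields that
    // survived the erase step keep precedence over index.json.
    repodata.insert(index.begin(), index.end());
    return repodata;
}
```

With defaulted_keys = ["_initialized", "license", "timestamp"], stub values for license and timestamp are dropped and replaced by the index.json values. With defaulted_keys = ["_initialized"], nothing meaningful is erased, so a patched empty depends survives the merge.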
The _initialized sentinel enables fail-hard verification: write_repodata_record() throws std::logic_error if it's missing, catching any code path that fails to properly initialize defaulted_keys.
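The sentinel check can be sketched as below. This is an illustration under assumptions, not the PR's implementation: require_initialized is a hypothetical helper name, and the exception message is made up for the example.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: fail hard when defaulted_keys lacks the sentinel,
// which would mean some code path never initialized it.
void require_initialized(const std::vector<std::string>& defaulted_keys)
{
    const bool has_sentinel
        = std::find(defaulted_keys.begin(), defaulted_keys.end(), "_initialized")
          != defaulted_keys.end();
    if (!has_sentinel)
    {
        throw std::logic_error("defaulted_keys is missing the \"_initialized\" sentinel");
    }
}
```

An empty defaulted_keys therefore cannot be mistaken for "trust all fields": only a list that contains the sentinel passes.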
Additionally, this PR adds cache healing to detect and fix corrupted caches from v2.1.1–v2.4.0 using a corruption signature (timestamp == 0 AND license == "").
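The corruption signature amounts to a two-field predicate. The sketch below assumes a simplified CachedRecord struct and a hypothetical looks_corrupted function; the actual check in has_valid_extracted_dir() operates on the cached JSON file and may differ in detail.

```cpp
#include <cstdint>
#include <string>

// Assumption: only the two fields that make up the signature are modeled.
struct CachedRecord
{
    std::uint64_t timestamp;
    std::string license;
};

// Hypothetical predicate: both conditions must hold, matching
// timestamp == 0 AND license == "".
bool looks_corrupted(const CachedRecord& rec)
{
    return rec.timestamp == 0 && rec.license.empty();
}
```

Requiring both conditions is stricter than either one alone, so a record with a zero timestamp but a real license is not flagged.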
Review Recommendation
Please review commit-by-commit. Each commit has an extended commit message with:
- Detailed explanation of the changes
- Test status (assertions passed/failed)
- Rationale for design decisions
The commits follow TDD (Test-Driven Development) with alternating RED (failing tests) and GREEN (fix) commits:
| Commit | Type | Description |
|---|---|---|
| 01 | RED | Tests for defaulted_keys population in from_url() |
| 02 | GREEN | Populate defaulted_keys in from_url() and make_package_info() |
| 03 | RED | Tests for URL-derived metadata and channel patch preservation |
| 04 | GREEN | Use defaulted_keys in write_repodata_record() and constructor.cpp |
| 05 | RED | Test for cache healing |
| 06 | GREEN | Detect corrupted cache in has_valid_extracted_dir() |
| 07 | RED | Tests for consistent field presence and checksums |
| 08 | GREEN | Ensure consistent field presence and compute missing checksums |
Commits 07-08 add consistency improvements: ensuring depends/constrains are always present as arrays, omitting track_features when empty (matching conda behavior), and computing missing checksums from the tarball.
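The field-presence rules for the list-valued fields can be sketched as follows. ListFields and normalize are hypothetical names, and the map of string vectors is a simplified stand-in for the JSON record. Checksum computation is left out because it depends on the tarball.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumption: only list-valued fields are modeled here.
using ListFields = std::map<std::string, std::vector<std::string>>;

// Hypothetical helper applying the consistency rules described above.
ListFields normalize(ListFields fields)
{
    // emplace adds an empty array only when the key is absent;
    // an existing value (including a patched empty array) is left alone.
    fields.emplace("depends", std::vector<std::string>());
    fields.emplace("constrains", std::vector<std::string>());

    // track_features is omitted entirely when it is empty.
    auto it = fields.find("track_features");
    if (it != fields.end() && it->second.empty())
    {
        fields.erase(it);
    }
    return fields;
}
```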
Type of Change
- [x] Bugfix
Checklist
- [x] My code follows the general style and conventions of the codebase, ensuring consistency
- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] My changes generate no new warnings
- [x] I have run pre-commit run --all locally in the source folder and confirmed that there are no linter errors.
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing tests pass locally with my changes