Improve mamba unsatisfiability error messages
Abstract
Using all_problem_structured, we can improve mamba errors when an environment is not satisfiable.
The function gives a number of problem and dependencies between packages involved in the error.
We can build a Directed Acyclic Graph (DAG) of these packages, where top level (root) packages are directly requested by the user, and bottom packages (leaves) are posing problem (conflict, missing dependency...).
Together with @syslaila we successfully build two proof of concepts (POC) for such errors:
- POC-A: One pinpoints the root and leaf packages to produce a succinct error message #1847
- POC-B: The other retrace the whole path between these root and leaves to show a detailed explanation of the conflicts/problems in a tree-like message AntoinePrv/mamba-error-reporting.
The goal is to incorporate both of them, using a/ as a summary and b/ as the verbose details.
The work done is quite substantial, and currently the two POCs do not share the same foundation / data structures. The aim of this issue is to find similarities between the two POC and break down the work into small, well-defined pull-requests (PRs).
Tasks
Please add unit tests where needed. Python bindings are encouraged when appropriate.
Problem data
- [ ] T11a:
PackageInforepresents (almost) all the available information we have on a package, but to use it in data structure we need to make it comparable and hashable. - [ ] T11b: Alternatively,
PackageInfodoes loose thesolvable_idused by libsolv. If we need to store it, we could.- [ ] T11b.1 Build a
std::map<SolvableID, PackageInfo>and use the ids everywhere else in the algorithm (no need for 1a. anymore).
- [ ] T11b.1 Build a
- [x] T12: The information about dependencies only comes as a
std::string(e.g.python>3.6.*), however in POC-B we do need to extract the package name from the range.- [x] T12.1: Create a simple
DependencyInfostruct withnameandrangeattributes - [x] T12.2: Create a parsing constructor, (e.g. regex
\s*(\w[\w-]*)\s*(.*)\s*)
- [x] T12.1: Create a simple
- [ ] T13a: Same as T11a for
DependencyInfo - [ ] T13b: Alternatively, same as T11b for
DependencyInfo - [x] T14 Parse libsolv data from
all_problem_structured- [x] T14.1: Filter invalid
SOLVER_RULE_JOB - [x] T14.2: Build package dependency graph
- [x] T14.3: [POC-B] Store pair of packages that are in conflict TODO: define where? In the graph or in another data structure)
- [x] T14.4: [POC-B] Store packages that are problematic (missing dependency etc) TODO: define where? In the graph or in another data structure
- [x] T14.1: Filter invalid
Graph data structure
A directed graph data structure already exists in graph_utils.hpp, we should extend it to fit our needs.
Having a standalone graph class will improve testing and separation of concerns.
- [x] T21: Extend the graph class to hold data on each edge
- [x] T22: Add structure to store and query predecessors
- [x] T23: Add missing graph exploration algorithms:
find_leavesfind_roots - [ ] T24: [POC-B] Add minimal undirected graph class (?)
- [ ] T25: [POC-B] Add heuristic clique partition algorithm
Graph compression
- [x] T31: Add data structure to represent a set of packages
- [x] T31.1: Likely around a lightweight
std::vector, removing duplicates - [x] T31.2: Add accessor for getting the name of the package
- [x] T31.3: Add accessor for getting the all the versions (?)
- [x] T31.4: Add a versions repr method to handle joining and truncation
- [x] T31.1: Likely around a lightweight
- [x] T32: Same as T31 but for dependencies
- [x] T33: Build the compatilbity graph for each packages with the same name
- [x] T33.1: Define criterion for merging packages
- [x] T33.2: Compress graph using clique partition algorithm
- [x] T33.3: Same as T14.3 but for set of packages
- [x] T33.4: Same as T14.4 but for set of packages
Summary error message [POC-A]
- [ ] T41: Create a function to return the summary message (as a
std::stringorstd::sstream)- [ ] T41.1: TODO: define steps
Tree error message [POC-B]
- [x] T51: Create an enum for the type of node (root, visited, leaf...)
- [x] T52: Create a structure for node meta-information (depth, last in children, status)
- [ ] T53a: Implement an edited Depth First Search (DFS) algorithm (nodes with same dep_id must
be visited one after another)
- [x] T53a.1: Dynamically add split nodes
- [ ] T53a.2: Dynamically merge visited children in split nodes
- [ ] T53b: Extend the DFS algorithm to have the felxibility required
- [ ] T53b.1: Add node selection/ordering callback
- [ ] T53c: Convert graph to hybrid graph for reusing the default DFS
- [ ] T54: Create a function/struct/visitor to convert nodes into English messages
- [ ] T55: Create a function to return the summary message (as a
std::stringorstd::sstream)
Functional test
- [x] T61: Reuse all tests case from AntoinePrv/mamba-error-reporting
- [x] T61.1: Pin dependencies to avoid the tests failing in the future
- [ ] T61.2: Ensure the error messages do not fail
- [ ] T61.3: Check for certain keyword and packages names in the message
I've updated the explanations of the algo and the structure of the classes from my PR if it helps: https://github.com/mamba-org/mamba/pull/1847#issue-1333365845
Closing since message are already running with an experimental flag. Improvements can be tracked in separate issues.