packaging Expose Marker tree structure in the API

I'd like to use a library like packaging in Debian's dh-python to parse environment markers, rather than maintaining our own hacky parser.

The use-case is parsing .dist-info metadata from a built package, during the build, to generate Debian package dependencies based on the Python module's declared dependencies.

Similarly to #448, this means we're evaluating the markers for all supported environments, not just the current environment. So instead of a boolean evaluate(), we need to:

1. Determine whether this dependency is required at all (boolean).
2. Determine what python_version constraints apply to the module and, where possible, emit the appropriate Debian dependencies.

Examples:

Requires-Dist: foo; (os_name == 'posix') should result in a dependency on python3-foo.
Requires-Dist: bar; python_version < '3.5' should result in a dependency on python3-bar | python3 (>= 3.5)
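To make the second example concrete, here is a rough sketch of how a single python_version comparison could be turned into a Debian alternative. The helper names (`next_minor`, `debian_dependency`) are hypothetical and not part of dh-python or packaging. The sketch assumes the marker version is a plain "major.minor" string and handles only the ordering operators.

```python
# Hypothetical helpers for illustration; not part of dh-python or packaging.

def next_minor(version: str) -> str:
    """Return the following minor release, e.g. "3.5" -> "3.6"."""
    major, minor = version.split(".")[:2]
    return f"{major}.{int(minor) + 1}"


def debian_dependency(package: str, op: str, version: str) -> str:
    """Build "<package> | python3 (...)" for the marker python_version <op> <version>.

    The python3 alternative describes the interpreters for which the marker
    is false, so the package is only pulled in where the marker holds.
    """
    if op == "<":
        escape = f"python3 (>= {version})"
    elif op == "<=":
        # python_version <= 3.5 still covers every 3.5.x release,
        # so the marker is first false at the next minor version.
        escape = f"python3 (>= {next_minor(version)})"
    elif op == ">=":
        escape = f"python3 (<< {version})"
    elif op == ">":
        escape = f"python3 (<< {next_minor(version)})"
    else:
        # ==, != and friends cannot be expressed as one alternative here.
        raise ValueError(f"unsupported operator for this sketch: {op!r}")
    return f"{package} | {escape}"


example = debian_dependency("python3-bar", "<", "3.5")
```

Even this simple mapping needs the operator and the version as separate values, which a boolean result alone does not provide.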

This means evaluate() is insufficient for our needs. I really need to be able to walk the tree.

Can we make more of the tree structure public API?

Jan 01 '22 15:01 stefanor

FWIW, if you know the supported set of environments, you can pass your own environment values to evaluate() to evaluate the markers in those contexts.

For example: https://github.com/scipy/oldest-supported-numpy/blob/5df60dc307f869ec286e9070bd7b9782608c4e69/tests/test_dependencies.py#L38

> Can we make more of the tree structure public API?

This might make sense, although I'd want to avoid increasing the API surface area if an alternative solution is possible on your end.

Jan 01 '22 16:01 pradyunsg

(I'm aware that the example I've pointed to doesn't match your use case)

Jan 01 '22 16:01 pradyunsg

Yeah, I tried to imagine how I'd do something similar, generating all the combinations of supported environments and passing them to .evaluate(). It is an option. I imagine fixing all environment values except python_version and just iterating across minor Python versions would probably be sufficient for our use-case. But then we'd have to infer, from a set of boolean results, which ranges of Python versions are suitable.

I don't think that result would be much less hacky than our current regex-based parsing, which simply ignores any marker with more than 1 part.
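For illustration, the iterate-and-infer approach might look roughly like the sketch below. The function `satisfied_ranges` is a hypothetical name, and `fake_evaluate` is a stand-in for something like `Marker("python_version < '3.5'").evaluate`, used here so that the snippet runs without packaging installed.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def satisfied_ranges(
    evaluate: Callable[[Dict[str, str]], bool],
    minors: Iterable[int],
) -> List[Tuple[str, str]]:
    """Collapse per-version boolean results into inclusive (first, last) ranges."""
    ranges: List[Tuple[str, str]] = []
    start: Optional[str] = None
    previous: Optional[str] = None
    for minor in minors:
        version = f"3.{minor}"
        if evaluate({"python_version": version}):
            if start is None:
                start = version  # a new run of matching versions begins
            previous = version
        elif start is not None:
            ranges.append((start, previous))  # the run ended just before this version
            start = None
    if start is not None:
        ranges.append((start, previous))  # the run reaches the last probed version
    return ranges


def fake_evaluate(environment: Dict[str, str]) -> bool:
    """Stand-in that behaves like the marker python_version < '3.5'."""
    major, minor = environment["python_version"].split(".")
    return (int(major), int(minor)) < (3, 5)


print(satisfied_ranges(fake_evaluate, range(3, 8)))  # [('3.3', '3.4')]
```

The result only says which probed versions matched. Whether a range is open-ended, and which operators the marker actually used, still has to be guessed from where the probing happened to stop.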

Jan 01 '22 20:01 stefanor

So I'd like to second the request for exposing the parsed marker AST. Evaluating the marker expression against an environment is not an option in my use case, which is enforcing a priori restrictions on markers. Since the tool I'm working on already makes extensive use of this package, I did the expedient thing: accessing _markers and inspecting the list with its comparison tuples. I'd prefer to use a public API. Based on my experience, the three subclasses of Node would make a splendid foundation.
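As a rough sketch of that kind of inspection: the walker below assumes the private list holds 3-tuples of node objects with a `.value` attribute, the strings "and" / "or", and nested lists for parenthesised groups. That shape is an assumption about an undocumented internal and may change between releases. `iter_comparisons` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the real node instances so that the snippet runs on its own.

```python
from types import SimpleNamespace
from typing import Any, Iterator, List, Tuple


def iter_comparisons(markers: List[Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (lhs, op, rhs) strings for every comparison in a nested marker list."""
    for item in markers:
        if isinstance(item, list):
            # Assumed representation of a parenthesised sub-expression.
            yield from iter_comparisons(item)
        elif isinstance(item, tuple):
            lhs, op, rhs = item
            yield (lhs.value, op.value, rhs.value)
        # Plain "and" / "or" strings carry no comparison and are skipped.


def node(value: str) -> SimpleNamespace:
    """Stand-in for a real node object; only the .value attribute matters here."""
    return SimpleNamespace(value=value)


# Hand-built data mimicking the assumed shape for:
#   os_name == 'posix' and (python_version < '3.5' or python_version >= '3.8')
demo = [
    (node("os_name"), node("=="), node("posix")),
    "and",
    [
        (node("python_version"), node("<"), node("3.5")),
        "or",
        (node("python_version"), node(">="), node("3.8")),
    ],
]

comparisons = list(iter_comparisons(demo))
```

With packaging installed, the same function could presumably be fed the private `_markers` attribute of a parsed `Marker`, which is exactly the dependency on internals that a public API would remove.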

Having said that, I ran into a typing bug in my code that traces back to Node.value being declared as Any. In practice, that attribute is a str for all three subclasses. Would you be amenable to a pull request that updates the type accordingly? Then again, I might want to wait a spell: #484 already declares value as str.

Jan 10 '22 18:01 apparebit