CEP XXXX: `MatchSpec` minilanguage
Checklist for submitter
- [X] I am submitting a new CEP: MatchSpec minilanguage.
- [X] I am using the CEP template by creating a copy `cep-0000.md` named `cep-XXXX.md` in the root level.
- [ ] I am submitting modifications to CEP XX.
- [ ] Something else: (add your description here).
Checklist for CEP approvals
- [ ] The vote period has ended and the vote has passed the necessary quorum and approval thresholds.
- [ ] A new CEP number has been minted. Usually, this is `${greatest-number-in-main} + 1`.
- [ ] The `cep-XXXX.md` file has been renamed accordingly.
- [ ] The `# CEP XXXX -` header has been edited accordingly.
- [ ] The CEP status in the table has been changed to `approved`.
- [ ] The last modification date in the table has been updated accordingly.
- [ ] The `pre-commit` checks are passing.
Closes #80
I find myself referring to the "MatchSpec" interface in other CEPs, yet it is not standardized, so here we go. Let's open that can of worms.
This will probably need another CEP on PackageRecord, which will probably ask for Repodata counterparts and... channel structure. Yay. I like how packaging.python.org does this btw. I'll probably copy some of that structure.
Not sure about the current status of this CEP, but before moving forward with it, we should maybe consider finalizing this one if we think it could be of interest?
@baszalmstra, @beckermr, @AntoinePrv, @ruben-arts, @JeanChristopheMorinPerso, I've tackled some of the pending items if you want to take a look. Perhaps more critically, the version strings and ordering conversation is now part of https://github.com/conda/ceps/pull/132.
I think I'll rewrite part of the Specification so we don't lose time with historical details and go straight for the syntax, since it's all intertwined anyway... This is valid 🤦:
>>> str(MS("channel:namespace:pkg 1 2[subdir=linux-63,channel=XX,name=jaime]"))
'XX/linux-63::pkg==1=2'
Finally got this up on GitHub. If anyone's interested, I started a collection of strings that conda currently does and does not accept as arguments to the MatchSpec constructor.
That repo also contains a Lark EBNF-type grammar for MatchSpec. Currently very incomplete and/or broken, but happy to continue refining it and contributing to the conda org once it's more mature.
@chenghlee shared this gem yesterday 😂 😭
>>> from conda.models.match_spec import MatchSpec
>>> MatchSpec("foo")
MatchSpec("foo")
>>> MatchSpec("foo=")
MatchSpec("foo")
>>> MatchSpec("foo # comment")
MatchSpec("foo")
>>> MatchSpec("foo=# comment")
MatchSpec("foo")
>>> MatchSpec("foo # comment")
MatchSpec("foo")
>>> MatchSpec("foo= # comment")
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/conda/models/version.py", line 44, in __call__
    return cls._cache_[arg]
           ~~~~~~~~~~~^^^^^
...
conda.exceptions.InvalidMatchSpec: Invalid spec 'foo= # comment': Invalid version '=': invalid operator
We also re-discovered that `pkg=version[key=value](optional=True)` is a valid spec according to conda's parser, but we really want to deprecate those parentheses.
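As a rough illustration of how such specs could be flagged ahead of a deprecation, here is a minimal sketch. It assumes the parenthesized group always trails the spec; `has_paren_group` and its regex are hypothetical names made up for this comment, not part of conda.

```python
import re

# Hypothetical pattern: a "(key=value, ...)" group at the very end of a spec.
# This is an assumption for illustration, not conda's actual parsing rule.
_PAREN_GROUP = re.compile(r"\([^()]*\)\s*$")


def has_paren_group(spec: str) -> bool:
    """Return True if the spec string ends with a parenthesized group."""
    return _PAREN_GROUP.search(spec) is not None


if __name__ == "__main__":
    for candidate in ("pkg=version[key=value](optional=True)", "pkg=version[key=value]"):
        print(candidate, has_paren_group(candidate))
```

A check like this only detects the syntax; it makes no claim about how the keys inside the parentheses are interpreted.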
There's also this gem:
>>> MatchSpec('foo >=1,<2')
MatchSpec("foo[version='>=1,<2']")
>>> MatchSpec('foo >=1, <2')
MatchSpec("foo[version='>=1,<2']")
>>> MatchSpec('foo >=1, < 2')
MatchSpec("foo[version='>=1,<2']")
>>> MatchSpec('foo >=1, < 2')
MatchSpec("foo[version='>=1,<2']")
>>> MatchSpec('foo >=1, < 2')
Traceback (most recent call last):
  File "/Users/clee/Applications/miniconda3/envs/matchspec-grammar/lib/python3.13/site-packages/conda/models/version.py", line 44, in __call__
    return cls._cache_[arg]
           ~~~~~~~~~~~^^^^^
KeyError: '>=1,<'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/clee/Applications/miniconda3/envs/matchspec-grammar/lib/python3.13/site-packages/conda/models/version.py", line 44, in __call__
    return cls._cache_[arg]
           ~~~~~~~~~~~^^^^^
KeyError: '<'
...
conda.exceptions.InvalidMatchSpec: Invalid spec 'foo >=1, < 2': Invalid version '<': invalid operator
>>> MatchSpec('foo >=1, < 2,!=3')
MatchSpec("foo[version='>=1,<2,!=3']")
How conda handles whitespace in MatchSpecs is... inconsistent. Having played around with it, I'm now inclined to disallow whitespace within each "subspec" (package name, version, build string).
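To make that idea concrete, here is a minimal sketch of a strict rule where whitespace may only separate the three subspecs. Everything in it is an assumption for discussion: `split_strict` is a hypothetical helper, and the "dangling separator or operator" heuristic is just one possible way to detect whitespace that fell inside a version expression. It is not how conda parses today.

```python
import re

# Hypothetical heuristic: a version or build field that starts with a
# separator, or ends with a separator or operator character, was most
# likely split in the middle of an expression such as ">=1, <2".
_DANGLING = re.compile(r"^[,|]|[,|<>=!~]$")


def split_strict(spec: str) -> tuple[str, ...]:
    """Split a spec into (name[, version[, build]]) or raise ValueError.

    Whitespace is only accepted as a separator between subspecs.
    """
    fields = spec.split()
    if not 1 <= len(fields) <= 3:
        raise ValueError(f"expected 1 to 3 whitespace-separated fields: {spec!r}")
    for field in fields[1:]:
        if _DANGLING.search(field):
            raise ValueError(f"whitespace inside a subspec: {spec!r}")
    return tuple(fields)


if __name__ == "__main__":
    for candidate in ("foo >=1,<2", "foo 1 2", "foo >=1, <2", "foo >=1, < 2"):
        try:
            print(candidate, "->", split_strict(candidate))
        except ValueError as exc:
            print(candidate, "-> rejected:", exc)
```

Under this rule `foo >=1,<2` is accepted, while every variant with whitespace after the comma or after the operator is rejected the same way, instead of some being accepted and others failing.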
Perhaps we can keep the general logic of the language and then add a section collecting "known allowed exceptions to the previous rules", with everything that is a strong candidate for deprecation (in a future CEP).
We should write only what we want in the CEP now. The deprecation of unsupported syntax is a separate issue to manage directly on conda.