rs-versions
rs-versions copied to clipboard
Versions cannot contain spaces
I had a user report of an error with parsing a version:
- https://github.com/russellbanks/Komac/issues/446
- https://github.com/russellbanks/Komac/issues/447
Both of these are due to the same issue of being unable to parse a version with spaces in. The version in question looks like 22.01 ZS v1.5.5 R2 (See winget-pkgs/manifests/m/mcmilk/7zip-zstd). It works correctly if the spaces are replaced with a delimiter, such as -.
Versioning::from_str("22.01 ZS v1.5.5 R2").unwrap(); // IllegalVersioning("22.01 ZS v1.5.5 R2")
Spaces! My lord what has the world come to.
Okay I've looked at the associated issues on your repository, as well as the 7zip releases page. It seems that the title of the releases uses spaces, but the git tag has proper - deliminations. I gather your tool is pulling from their releases via the title, then? Could you fill me in with some of the background information about how these version numbers are gathered and how they're used for further processing?
Accepting spaces in version numbers (even in the Mess type) may add some serious complexity, especially since the parsing behaviour of all three subtypes (SemVer, Version, and Mess) propagate up to the parent, Versioning. If I allow spaces, then the line between what's a version and what isn't (given say a line of words) becomes very blurry.
My tool allows users to add or update applications to winget (Microsoft's package manager for Windows). Each application has some metadata files and in those there is a package version (Example). The package versions at winget-pkgs/manifests/m/mcmilk/7zip-zstd are already pre-existing versions for 7zip-zstd in winget. This is where my tool gets the versions from (and also a new version from the user if they're adding a new version). Because Microsoft has very little limitations on what is a valid version, my tool needs to be able to parse and order any version of any app that is submitted to winget so that I can know which metadata file is the latest to build upon.
In Microsoft's schema for this, they actually just have a regex for validating the version:
"PackageVersion": {
"type": "string",
"pattern": "^[^\\\\/:\\*\\?\"<>\\|\\x01-\\x1f]+$",
"maxLength": 128,
"description": "The package version"
},
In short, the regex just says it can't contain some specific control characters and beyond that, what constitutes a version can be anything, and winget will attempt to order it regardless.
The following test passes:
#[test]
fn mess_7zip() {
cmp_messes("22.01-ZS-v1.5.5-R2", "22.01-ZS-v1.5.6-R2");
}
fn cmp_messes(a: &str, b: &str) {
let x = Mess::new(a).unwrap();
let y = Mess::new(b).unwrap();
assert!(x < y, "{} < {}", x, y);
}
But so does:
cmp_messes("22.01-ZS-v1.5.5-R2", "24.02-ZS-v1.6.0");
Since even with the - injected in, these versions are deep into lawlessness seemingly mixing calver, semver, and other noise, and so must be parsed as a Mess. Note that here, the calver section basically dominates the comparison, but this may be sufficient for your purposes for comparing at least this particular edge case.
I think it's best to remain firm that whitespaces shouldn't be accepted in version numbers, even mostly lawless ones as we see here. If they are present, I think it's reasonable to do some preprocessing in downstream code before parsing.
Preprocessing is unfeasible in my case as I take in a version straight from a clap parameter and serialize it into Yaml. Even if I were to do some changes to the version, I need it to be the exact same as what the user inputted to what value is serialized in the end. If it were more feasible, it would also add a lot of unnecessary complexity just to workaround this issue of mess not working with spaces.
I'd really appreciate it if you could add support for spaces - it would arguably make this library more robust with the intention of being able to parse any messy version and still be able to compare it.
Let's discuss this. Were I to support spaces in Mess, there are two breaking changes that come to mind:
- The addition of a variant in the Sep enum. These kinds of changes, while technically breaking, almost certainly affect no one but me. I doubt anyone is actually pattern matching on
Sepanywhere in real code. - The behaviour of the exposed
Mess::parsenom parser. Since this could technically be plugged into any downstream parser combination. A sentence likeHello 1.2.3 Rustthat expected the structureword <> version <> wordwould now fail, with theRustnow being parsed as part of the version. Similar to the above: is anybody actually doing this anywhere? I don't know, but it might not be impossible to individually check: https://crates.io/crates/versions/reverse_dependencies
- The addition of a variant in the Sep enum.
I do think it's unlikely that anyone is matching over Sep. If anyone were, they could either use _ when matching or add the space seperator to their match. However, I've checked the 18 packages in the reverse dependencies of this crate on crates.io by searching the code on their respective GitHub repositories for versions:: and none of them import or match Sep.
- The behaviour of the exposed
Mess::parsenom parser.
Again, from checking the reverse dependencies, no one appears to be using this. The usages I saw were mostly Versioning, SemVer, and a couple usages of Chunk. Granted these are only ones that are on crates.io, but although this may technically be breaking, I can't see it having an impact or even affecting any code bases.
I've since had another user report an instance of an app having a version with spaces in it: 7.3 2022-02-28 r5338 (sf-7.3-1) - https://github.com/russellbanks/Komac/issues/447#issuecomment-1952944709
Thanks for the follow-up. I do intend to solve this as soon as I'm able.
Hello, I'm running into this as well. I'm using Versioning to compare PostgreSQL version numbers, to determine if an upgrade is needed.
The output looks a bit like this:
$ psql postgres --pset format=csv -c 'SHOW server_version'
server_version
14.12 (Homebrew)
Here, Homebrew is a suffix that can basically be disregarded. (I probably wouldn't want 14.12 (Homebrew) to compare greater or less than 14.12 (NixOS), for instance.)
I can work around this by removing the suffix, but I thought I'd make a comment here.
It might be nice to have a "permissive parser" which ignores or otherwise segments out extra information like this.
Hi there. In this case I'd recommend a preemptive call to https://doc.rust-lang.org/std/primitive.str.html#method.split_once to remove anything past the first whitespace, and then call Versioning::new.