rust-bindgen
rust-bindgen copied to clipboard
Sanitize RegexSet input so alternation is properly handled
Resolves #1755 by wrapping the regexes provided to a RegexSet
in a capture group, so the added anchors work properly with alternation.
This doesn't resolve the issue of implicit/hidden regex modifications performed by this crate, which should still be properly documented.
One thing to note: This will cause breakages to crates that make incorrect assumptions about how the regex inputs are handled and do not account for the implicitly-added anchors (such as emscripten-sys
).
Sorry this got stalled, I've added a test case.
This is breaking a couple existing test-cases unfortunately. That needs looking into. Depending on how much of an edge case they are, it may be ok to change their arguments or such.
Those failures are because of *
being provided where a regex is expected. That used to be converted to ^*$
which is accepted, it is now being converted to ^(*)$
which is not accepted.
I did some research/analysis on dependent crates in #1783 , which might be useful for determining how much this breaks. The list of all uses of |
may be of particular interest.
I think changing those tests to use .*
and landing this should be ok, I'll make sure to document this change prominently for the next release if so. what are your thoughts?
Thanks again for doing that research.
I took the liberty of rebasing upon master
at ~~c27578fac5372575e3caa933a83eac931ae3b53f~~ 74b38670b04067598ce488a455725b324bafdaee and fixing up the failing tests. If CI passes, this might be ready for the next major release, since it is a breaking change.
I think changing those tests to use
.*
and landing this should be ok
@emilio Now each commit in this branch independently passes tests. I assume this should wait until v0.61.0, whenever that happens, since it is a breaking change, but I thought I would make it clear I think the PR is ready.
@kulp. Do you think it would make sense to emit a warning if the user still uses "*"
explaining this change?
@kulp. Do you think it would make sense to emit a warning if the user still uses
"*"
explaining this change?
@pvdrz that seems like a good idea to me!
:umbrella: The latest upstream changes (presumably #2284) made this pull request unmergeable. Please resolve the merge conflicts.
@kulp @emilio do you think we should wait to merge this?
I took the liberty of rebasing against master again, adding a warning when using wildcard patterns and update the changelog. Do you think we can merge this before the next release? @emilio @kulp
I took the liberty of rebasing against master again, adding a warning when using wildcard patterns and update the changelog. Do you think we can merge this before the next release? @emilio @kulp
@pvdrz thanks for rebasing. It looks like I missed v0.61.0.
Maybe I was being paranoid though, and we could merge this any time, since although it is a breaking change, that becomes visible only the next time we do a release. So maybe there is nothing that should hold this up any longer?
I will leave it to your collective judgment since I cannot spend more cycles on it in the near term.
Maybe I was being paranoid though, and we could merge this any time, since although it is a breaking change, that becomes visible only the next time we do a release. So maybe there is nothing that should hold this up any longer?
Oh, now I remember; I was thinking it needed to be ensured not to be released in a point-release like 0.60.5, but instead in a semver-ish breaking release like 0.61.0.
Probably this is putting the cart before the horse, and instead we should just merge this, and then make sure the next release bumps the correct version number.
I'm going to bump versions in this PR and merge it