compile-time-regular-expressions
URL regexp does not search correctly with the word boundary
I was using CTRE to find URLs in text and found that it does not find every match. Examples:
The original regex was:
R"((?<all>(http://www\.|https://www\.|http://|https://)?[a-zA-Z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(/[^\s]*)?/?\b))"
I tried to simplify it a bit:
Does not compile:
static_assert(R"((?:[a-zA-Z0-9]+\.com\/?\b))"_ctre.search("google.com/ some text"));
static_assert(R"((?:[a-zA-Z0-9]+\.com/?\b))"_ctre.search("google.com/ some text"));
Compiles:
static_assert(R"((?:[a-zA-Z0-9]+\/?\b))"_ctre.search("google/ some text"));
static_assert(R"((?:[a-zA-Z0-9]+\.com\/?))"_ctre.search("google.com/ some text"));
static_assert(R"((?:[a-zA-Z0-9]+\.com\/?\b))"_ctre.search("google.com/some text"));
https://gcc.godbolt.org/z/41n1jj
More simplified:
Does not compile:
static_assert(R"((?:c/?\b))"_ctre.search("c/ some text"));
static_assert(R"((?:c\/?\b))"_ctre.search("c/ some text"));
Compiles:
static_assert(R"((?:/?\b))"_ctre.search("c/ some text"));
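For comparison, here is a minimal sketch of what a backtracking engine is expected to do with the reduced patterns. It uses `std::regex` (ECMAScript grammar) rather than CTRE, so it only serves as a reference point and says nothing about CTRE internals. The helper name `first_match` is hypothetical and exists only for this illustration.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not part of CTRE): returns the first substring of
// `text` matched by `pattern` under std::regex's default ECMAScript grammar,
// or std::nullopt when nothing matches.
std::optional<std::string> first_match(const std::string& pattern,
                                       const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        return m.str();  // text of the whole match
    }
    return std::nullopt;
}
```

With this helper, `first_match(R"(c/?\b)", "c/ some text")` yields `"c"`: the optional `/` is tried first, `\b` fails between `/` and the space, and the engine backtracks so that `\b` is checked between `c` and `/`. Likewise `[a-zA-Z0-9]+\.com/?\b` yields `"google.com"` for `"google.com/ some text"` and `"google.com/"` for `"google.com/some text"`. So the failing `static_assert`s above describe searches that should succeed.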
It compiles, so I think there's something wrong with your static assert. https://gcc.godbolt.org/z/166969
The static assert checks whether the regular expression matches any substring. Your link differs from the examples I gave above:
R"(c\/?\b)"_ctre.search("google.com/ some text");
In that search text there is no letter c followed by /, and no c at a word boundary.
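The word-boundary reasoning can be checked by hand. The sketch below assumes the usual definition of `\b` (a position where exactly one neighbour is a letter, digit, or underscore) and hard-codes the pattern `c/?\b`. All function names are hypothetical and unrelated to CTRE.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Word characters for \b: ASCII letters, digits and underscore.
bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True when exactly one side of position `i` (the gap between text[i-1]
// and text[i]) is a word character. Start and end of text count as non-word.
bool is_word_boundary(std::string_view text, std::size_t i) {
    const bool left = i > 0 && is_word_char(text[i - 1]);
    const bool right = i < text.size() && is_word_char(text[i]);
    return left != right;
}

// Hand-written equivalent of searching for `c/?\b`: look for a 'c' that is
// followed by a boundary, or by '/' and then a boundary.
bool c_slash_boundary_matches(std::string_view text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] != 'c') continue;
        if (is_word_boundary(text, i + 1)) return true;  // "/?" matches nothing
        if (i + 1 < text.size() && text[i + 1] == '/' &&
            is_word_boundary(text, i + 2)) return true;  // "/?" matches '/'
    }
    return false;
}
```

For `"google.com/ some text"` the only `c` is followed by `o`, so neither branch applies and the function returns `false`. For `"c/ some text"` the first branch already succeeds, because there is a boundary between `c` and `/`.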