Regex in setvar variables
Context: Currently, (unescaped) backslashes are forbidden in variables content. In msre_parse_generic():
if (*p == '\\') {
if ((*(p + 1) == '\0') || ((*(p + 1) != '\'') && (*(p + 1) != '\\'))) {
[error & return]
}
p++;
The only case a backslash is accepted is when it escapes a single quote or a backslash.
Problem: You cannot store a regex in a variable: setvar:'tx.var=\babc\b' When doing this manually, you can escape the string obviously: setvar:'tx.var=\babc\b' But when you use a macro like the following:
<Macro MyMacro $regex>
SecAction ...,setver:'var=$regex',...
SecRule ARG "$regex" ...
</Macro>
there's no way to call the macro with a parameter compatible with both directives
Solution: We could be lax (and still compatible with the current behaviour):
- If it's followed by a quote or a backslash, accept and "eats" the escaping backslash
- In all other cases, accept and don't "eat" the backslash
- the code change is trivial
Does somebody see any other solution?
Hi @marcstern ,
There are a couple of ways to work around that current limitation:
-
It is often possible to rework a regular expression by using posix character classes instead of backslash specifications. For example, I believe your sample
setvar:'tx.var=\babc\b'could be specfied with(^|[^[:word:]])abc([^[:word:]]|$). -
One technique I have used a couple of times in the past is to use a lua script to populate the variable instead of directly via setvar. That way you aren't restricted by some of the limitations in the rule syntax.
Hi @martinhsv,
- Indeed, some escape sequences could be replaced by posix character classes, but I'm pretty sure we can find some that cannot.
- Using LUA is only possible if LUA support is enabled, which isn't the case on high security environments, like military ones.
So my propositionis still valid. As I said, it's fully compatible with the existing behaviour.
Hi @marcstern ,
Whether there is a fully-functional workaround or not, a change can always be considered.
But I'm not entirely clear on what you are proposing. Could you describe it more fully? What do mean by "eats"? At what stage are you proposing that that happen?
Hi @martinhsv ,
Here is my proposal:
-
replace
if ((*(p + 1) == '\0') || ((*(p + 1) != '\'') && (*(p + 1) != '\\'))) {byif ((*(p + 1) == '\0')) {To allow anything after a backslash -
replace the line
p++;that "eats" (deletes) the backslash byif ((*(p + 1) == '\'') || (*(p + 1) == '\\')) p++;So it only deletes the backslash when preceding a single quote or a backslash (same behaviour as now)