ModSecurity icon indicating copy to clipboard operation
ModSecurity copied to clipboard

Regex in setvar variables

Open marcstern opened this issue 2 years ago • 4 comments

Context: Currently, (unescaped) backslashes are forbidden in variables content. In msre_parse_generic():

if (*p == '\\') {
   if ((*(p + 1) == '\0') || ((*(p + 1) != '\'') && (*(p + 1) != '\\'))) {
      [error & return]
   }
   p++;

The only case a backslash is accepted is when it escapes a single quote or a backslash.

Problem: You cannot store a regex in a variable: setvar:'tx.var=\babc\b' When doing this manually, you can escape the string obviously: setvar:'tx.var=\babc\b' But when you use a macro like the following:

<Macro MyMacro $regex>
 SecAction ...,setver:'var=$regex',...
 SecRule ARG "$regex" ...
</Macro>

there's no way to call the macro with a parameter compatible with both directives

Solution: We could be lax (and still compatible with the current behaviour):

  • If it's followed by a quote or a backslash, accept and "eats" the escaping backslash
  • In all other cases, accept and don't "eat" the backslash
  • the code change is trivial

Does somebody see any other solution?

marcstern avatar Jul 12 '23 10:07 marcstern

Hi @marcstern ,

There are a couple of ways to work around that current limitation:

  1. It is often possible to rework a regular expression by using posix character classes instead of backslash specifications. For example, I believe your sample setvar:'tx.var=\babc\b' could be specfied with (^|[^[:word:]])abc([^[:word:]]|$) .

  2. One technique I have used a couple of times in the past is to use a lua script to populate the variable instead of directly via setvar. That way you aren't restricted by some of the limitations in the rule syntax.

martinhsv avatar Jul 12 '23 17:07 martinhsv

Hi @martinhsv,

  1. Indeed, some escape sequences could be replaced by posix character classes, but I'm pretty sure we can find some that cannot.
  2. Using LUA is only possible if LUA support is enabled, which isn't the case on high security environments, like military ones.

So my propositionis still valid. As I said, it's fully compatible with the existing behaviour.

marcstern avatar Jul 20 '23 07:07 marcstern

Hi @marcstern ,

Whether there is a fully-functional workaround or not, a change can always be considered.

But I'm not entirely clear on what you are proposing. Could you describe it more fully? What do mean by "eats"? At what stage are you proposing that that happen?

martinhsv avatar Jul 20 '23 17:07 martinhsv

Hi @martinhsv ,

Here is my proposal:

  1. replace if ((*(p + 1) == '\0') || ((*(p + 1) != '\'') && (*(p + 1) != '\\'))) { by if ((*(p + 1) == '\0')) { To allow anything after a backslash

  2. replace the line p++; that "eats" (deletes) the backslash by if ((*(p + 1) == '\'') || (*(p + 1) == '\\')) p++; So it only deletes the backslash when preceding a single quote or a backslash (same behaviour as now)

marcstern avatar Jul 24 '23 15:07 marcstern