Discussion about 'hostname' field in log
Describe the bug
Libmodsecurity3 produces log (through a callback function eg. for Nginx) with unusable [hostname] field. [hostname] always contains the IP address of the server, which has no informational value.
Logs and dumps
An example:
ModSecurity: Warning. detected SQLi using libinjection. [file "/usr/share/modsecurity-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "46"] [id "942100"] ... [tag "PCI/6.5.2"] [hostname "18.19.20.21"] [uri "/xmlrpc.php"]... client: 91.92.93.94, server: www.myserver.com, request: "POST /xmlrpc.php HTTP/1.1", host: "www.myserver.com"
As you can see, the log contains the server and host fields at the end of the line, but unfortunately the length of the line is limited (it's hard coded in Nginx to 2048 bytes), and if the request is too long (eg. there is a GET request with some very long query string) these fields will be disappeared.
To Reproduce
Send any invalid request that triggers a rule and produces a log entry.
Expected behavior
As in case of Apache's mod_security2 module, it would be fine to get the hostname field with the correct value.
Server (please complete the following information):
- ModSecurity version (and connector): libmodsecurity3, all versions
Additional context
There is a previous intention to fix this issue:
https://github.com/owasp-modsecurity/ModSecurity/pull/2906
but it was rejected (see the discussion). The first point was this:
The 'Host' header is untrusted user input. - which is a bit interesting, because (meanwhile I reviewed the code again) I found that the variable SERVER_NAME (reference) is also produced from the Host header - see the source - without any filtering.
Never mind, I would like to discuss how can we solve this issue.
My suggestions:
- send that PR again (note: Apache also follows the implemented way)
- forget that PR and the host name, and add a new function to libmodsecurity3's API which provides the possibility to set the host name for the application who embeds it
- for this, I created a separated branch: https://github.com/owasp-modsecurity/ModSecurity/compare/v3/master...airween:ModSecurity:v3/sethostname?expand=1
- a brief summary: this fix provides a new function:
setRequestHostName()(andmsc_set_request_hostname()in C) that sets the inside variable - if the application does not call this, the value of the variable is the same as the server IP (nothing changes)
- if the application calls it, then it can sets the correct hostname
I also created a patch for Nginx connector, here is the result:
curl -v -H "Host: modsecurity.org" http://localhost/?q=/bin/bash
the log:
ModSecurity: Warning. Matched "Operator `ValidateByteRange' ... [tag "capec/1000/210/272"] [hostname "modsecurity.org"] [uri "/"] ... client: ::1, server: _, request: "GET /?q=/bin/bash HTTP/1.1", host: "modsecurity.org"
Note, that the connector uses Nginx's r->headers_in.server variable, see Nginx's source.
Please share your idea about this behavior and possible solutions.
So with your change, the hostname log entry would become usable, but might still be cut off. That will have to be solved later, correct?
Do I understand correctly, that using a directive like server_name would now set the server name for the log?
I think the patch would be fine. As for the alternative approach with the application, would that be the connector that calls it?
So with your change, the
hostnamelog entry would become usable, but might still be cut off. That will have to be solved later, correct?
yes, exactly. Later, and not in the library, but in the application that uses library (namely Nginx connector).
But we have to be careful: if we add this method to Nginx connector, that won't work with older versions, so we have to check that function exists or not. The best was to control this we check the ModSecurity version, like it's used in other case, see this example.
This is why would it be good to add this feature as soon - we can add this feature to the connector after release.
Do I understand correctly, that using a directive like
server_namewould now set the server name for the log?
Yes, meanwhile I realized that too.
Thanks.
I think the patch would be fine. As for the alternative approach with the application, would that be the connector that calls it?
Exactly. With a version control - see my previous comment.
Do I understand correctly, that using a directive like
server_namewould now set the server name for the log?
I was still thinking about this question and made some tests.
First: I wouldn't touch the field names of the log, I would keep the existing fields, namely [hostname]. The reason is simple: the more similar the formats of the logs are to each other (v2/v3), the easier it is to notice the differences based on the logs. And if someone has a log storing/processing system, then more easier to unify the log processing.
Take a look to the Nginx's server log:
[hostname "www.modsecurity.org"] ... [ref ""], client: ::1, server: www.modsecurity.org, request: "GET /?q=/bin/bash HTTP/1.1", host: "modsecurity.org"
I have a vhost in my Nginx with 2 server names:
server_name www.modsecurity.org modsecurity.org;
If I send a request with hostname modsecurity.org, then I get the log entry like above: the server is with the first server_name entry, the host is what I used. (I modified the connector code, now I tried to use module context's server_name variable.
Which one is the better? Apache uses the Host header value, just fyi. I would prefer that one too.
(I know that it is not necessary to decide now and not here - just finished a test and shared my experience.)
I assume, that if you used default_server, nginx will fall back to the Host header as well. To me, knowing which virtual server processed the request is more useful than the contents of the Host header.
I second @theseion, but anything is fine as long it's not an IP address. :)
I mean in most cases, host header and servername should be in sync. So it does not matter too much (obviously, the source needs to be well documented for the remainder of the cases).
I assume, that if you used
default_server, nginx will fall back to theHostheader as well. To me, knowing which virtual server processed the request is more useful than the contents of theHostheader.
No. I already tried that, but in case of default server, the [hostname] field contained the _ character - which is not usable. I suggest we should use r->headers_in.server, which is more usable.
Based on this discussion and on Slack's comments, I'm going to prepare the PR which implements a new API function that can be used to set the correct hostname. Later in application site we can decide what value do we want to set.