feat: fix regression test, add a new CI workflow
what
This PR fixes the last broken regression test and adds the whole regression test workflow to GH CI.
why
The regression tests were almost all passing, but I found another broken test. After fixing that, I prepared a new workflow for GH CI that runs these regression tests.
I switched the operating system to Ubuntu 24.04 (from 22.04). On Ubuntu 24.04 the old PCRE (libpcre3) is not installed by default, so I had to add it explicitly.
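For reference, a sketch of what that workflow step can look like (the job name, runner label, and action version here are illustrative placeholders, not the actual workflow):

```yaml
jobs:
  regression:
    runs-on: ubuntu-24.04           # 22.04 shipped libpcre3 by default; 24.04 does not
    steps:
      - uses: actions/checkout@v4
      - name: Install legacy PCRE explicitly
        run: sudo apt-get update && sudo apt-get install -y libpcre3-dev
```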
I tried to follow the build matrix used in the other workflows, but a complete regression test run takes about 29 minutes, which I think is very slow. On my build machine (which is not a powerful one) it takes about 6-7 minutes.
Unfortunately we can't really fix this; the reason is how each test operates: it creates a config, starts an Apache server with that config, sends a request, and checks the output (files, response, etc.).
I don't know if it's worth it.
references
#3425
I looked at run-regression-tests.pl.in quickly and it seems to me that match_log might be the issue. Log flushing can differ significantly between systems, so it's possible that searching the log in each test takes a long time. I suggest you add some debug print statements to the Perl script so that you can see in the CI output which operations use a lot of time.
I tried to review the script again, and fixed two small things in bbe5239:
- switched to `select()` from `sleep()`
- cleared the buffer which holds the read log
Also, I removed the test regression/misc/10-tfn-cache.t, because it took a long time (on my machine that test file alone took 1m 6s). The reason is that the SecCacheTransformations directive seems to be deprecated anyway.
While I was writing this comment the tests started, but there seems to be no performance improvement. (And besides that, there are two failed tests...)
Okay, I think the symptom is visible now, but the cause is unknown: see the output of the last test. The Apache stop time is terribly slow.
And I don't know why...
@theseion, @dune73 do you have any idea?
If I remember correctly, Apache waits for connections to be closed. I do not think it sends a RST or anything like that; it really waits. So if it's slow to stop, I would look at the connection table while it is stopping.
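To make that concrete, here is a quick way to inspect the connection table during the stop phase (the port number 8080 is an assumption; substitute whatever port the test harness listens on):

```shell
# Dump the TCP table; lingering ESTABLISHED / TIME-WAIT / CLOSE-WAIT
# entries on the test port would explain a slow graceful stop.
ss -tan

# Narrow the view to the suspect port only (8080 is a placeholder):
ss -tan '( sport = :8080 or dport = :8080 )'
```

Running this repeatedly (e.g. `watch -n 0.5 ss -tan`) while httpd shuts down shows whether it is waiting on half-closed client connections.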
Stopping Apache in production is always slow. In a test situation that's not ideal. Maybe you want to kill it instead.
Take a look at my script https://www.netnea.com/files/apachex that I use to control Apache in my lab. It's a very rough tool, more a mallet than a screwdriver, but it has never failed to kill and restore Apache instantly.
Not sure this is the right path, but that's what I have in mind.
There are probably two things you can do:
- Test request should be sent with `Connection: close`
- Kill httpd
The first is just a nice-to-have of course. I'd absolutely simply kill httpd. There's no reason for handling open connections gracefully in this scenario.
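A minimal sketch of the kill-instead-of-graceful-stop idea, using a background `sleep` as a stand-in for a lingering httpd process (no real httpd involved):

```shell
# Stand-in for a lingering httpd worker.
sleep 300 &
pid=$!

# SIGKILL cannot be caught or ignored, so the process dies immediately,
# no matter how many connections it would otherwise wait on.
kill -KILL "$pid"

# Reap the child (the shell analogue of Perl's waitpid) so no zombie
# is left behind before the next start.
wait "$pid" 2>/dev/null || true

kill -0 "$pid" 2>/dev/null || echo "stand-in process is gone"
```

The `wait` step matters: without it, a dead-but-unreaped process can still confuse a subsequent liveness check.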
Thanks for the ideas!
1. Test request should be sent with `Connection: close`
All requests send the `Connection: close` header (I checked with tcpdump), e.g.:
    10:55:19.058098 IP localhost.omniorb > localhost.33008: Flags [P.], seq 1:225, ack 144, win 512, options [nop,nop,TS val 2093795869 ecr 2093795869], length 224
    E...~d@.@..}............J..................
    |...|...HTTP/1.1 200 OK
    Date: Sat, 08 Nov 2025 09:55:19 GMT
    Server: Apache/2.4.58 (Ubuntu)
    Last-Modified: Sat, 08 Nov 2025 09:46:10 GMT
    ETag: "5-643122b0cc4e7"
    Accept-Ranges: bytes
    Content-Length: 5
    Connection: close
    TEST
2. Kill httpd
I tried, but then many other weird issues came up, e.g. when the httpd_start() function is called:
    Loaded 2 tests from /home/airween/src/modsecurity-apache/tests/regression/config/00-load-modsec.t
    1) config - module loaded: passed
    Httpd start failed with error messages:
    httpd (pid 10748) already running
> The first is just a nice-to-have of course. I'd absolutely simply kill httpd. There's no reason for handling open connections gracefully in this scenario.
I'm afraid I have to reorganize the complete start/stop mechanism.
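If the start/stop mechanism does get reorganized, one possible shape for a hard stop is: kill by pid file, wait until the pid is really gone, then remove the stale pid file so the next httpd_start() doesn't report "already running". A self-contained sketch, where both the pid file path and the background `sleep` standing in for httpd are placeholders:

```shell
PIDFILE=/tmp/demo-httpd.pid      # placeholder path, not the real layout

# Simulate a running httpd that has written its pid file.
sleep 300 &
echo $! > "$PIDFILE"

pid=$(cat "$PIDFILE")
kill -KILL "$pid" 2>/dev/null

# Reap it if it is our own child...
wait "$pid" 2>/dev/null || true
# ...and poll for the general case where it is not (kill -0 only
# probes for existence, it sends no signal).
while kill -0 "$pid" 2>/dev/null; do sleep 0.1; done

# Remove the stale pid file so the next start doesn't see "already running".
rm -f "$PIDFILE"
echo "stopped and cleaned up"
```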
Quality Gate passed: 0 new issues, 0 accepted issues, 0 security hotspots, 0.0% coverage and 0.0% duplication on new code.
> I'm afraid I have to reorganize the complete start/stop mechanism.
How are you killing the process(es)? It looks like you did `kill` and then `start_httpd()` immediately, so your requests start being processed before the old process has been killed. Maybe you need to put a `waitpid` in there.
> I'm afraid I have to reorganize the complete start/stop mechanism.

> How are you killing the process(es)?
Here is the current routine that stops httpd.
This section builds and runs the command:
    my @p = (
        $HTTPD,
        -d => $opt{S},
        -f => $opt{C},
        (map { (-c => $_) } ("Listen localhost:$opt{p}", @_)),
        -k => "stop",
    );
    my $httpd_out;
    my $httpd_pid = open3(undef, $httpd_out, undef, @p) or quit(1);
`@p` holds the command and `open3()` runs it; it's approximately `httpd -k stop`.
> It looks like you did `kill` and then `start_httpd()` immediately, so your requests start being processed before the old process has been killed. Maybe you need to put a `waitpid` in there.
There is a `waitpid()` already...
Edit: it seems the `waitpid()` waits for the `httpd -k stop` command to terminate. That takes a lot of time.
> Edit: it seems the `waitpid()` waits for the `httpd -k stop` command to terminate. That takes a lot of time.
Yes, exactly. I was thinking of `kill` + `waitpid` (if necessary).