apachetop Apachetop output headers garbled on busy server

We run a public OSS server that is quite busy. We are using the standard combined log format:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

When I run apachetop -f /var/log/httpd/access_log I get weird garbled output headers:

last hit: 01:51:00         atop runtime:  0 days, 00:14:10             01:51:01
All: 0 reqs 49209.0/sec) 057.9(    0.0B/s   90.7G (  109.2M         1931.9K
2xx:   41828 (85.0%) 3xx:       7 ( 0.0%) 4xx:  7374 (15.0%) 5xx:     0 ( 0.0%)
R ( 30s): 0 r1768(   0.0/s58.90.0B (    0.5220.1M (  174.0M         3023.4K
2xx:    1521 (86.0%) 3xx:       0 ( 0.0%) 4xx:   247 (14.0%) 5xx:     0 ( 0.0%)

 REQS REQ/S    KB KB/S URL
   88  2.93  16.8  0.6*/centos/7.9.2009/updates/x86_64/repodata/repomd.xml
   80  2.67 383.8 12.8 /rocky/8.10/AppStream/x86_64/os/repodata/repomd.xml
   67  2.23 207.2  6.9 /rocky/8.10/extras/x86_64/os/repodata/repomd.xml
   63  2.10 256.2  8.5 /rocky/9.6/BaseOS/x86_64/os/repodata/repomd.xml
   63  2.10 283.3  9.4 /rocky/9.6/AppStream/x86_64/os/repodata/repomd.xml
   61  2.03 264.8  8.8 /rocky/8.10/BaseOS/x86_64/os/repodata/repomd.xml
   52  1.73  10.0  0.3 /centos/7.9.2009/os/x86_64/repodata/repomd.xml
   44  1.47   8.4  0.3 /centos/7.9.2009/extras/x86_64/repodata/repomd.xml

The lists of recent requests appears OK but the headers don't show the correct counts. Any guidance on how to troubleshoot this further?

Jun 12 '25 01:06 scottchiefbaker

Hmm, I haven't looked at the code for quite some time (I'm the current maintainer, not the original dev), but I do remember that there are hardcoded values (e.g. the ring buffer). I have to check it out. Maybe I get to it next week.

I suspect you have so many transactions that they don't fit into the buffer.

Jun 13 '25 16:06 tessus

If there is something I can to test or troubleshoot I'm happy to help.

Jun 13 '25 16:06 scottchiefbaker

I think there is also debug output in /tmp. I can't recall whether it is written by default or not. I really have to go through the code and familiarize myself with it again. I haven't touched it for quite some time.

I might need this debug file from your machine. I'll let you know how to collect it. Maybe my memory is failing me and there's no debug. I am sorry, but if you don't work with a code base all the time, it's hard to not mix things up with other projects.

Btw, interesting that apachetop is still used. After goaccess was created I thought people would switch over without hesitation.

Jun 13 '25 16:06 tessus

I wasn't aware of goaccess so I just checked it out. It's great, has lots of cool reports, but it doesn't have the "last 30 seconds" that Apachetop has.

We're monitoring in real-time the 2xx, 3xx, and 5xx codes. Knowing the total percentage is great, but I really need to know the last 30 seconds so I can catch a problem in real time.

Jun 13 '25 17:06 scottchiefbaker

I really like goaccess, it's in my mind the next evolution of apachetop.

but it doesn't have the "last 30 seconds"

Hmm, I thought they provided real time data access info. But you can always open a feature request with goaccess. Your use case makes perfect sense and if I were the author, I'd add it. (I'd add 5, 10, 30, 60s - or at least a few, similar to the load avgs in uptime/top/...)

I'll try to look into apachetop's code base next week. In the mean time, can you please tell me a bit more about your load. How many log lines are created by Apache per second? What would be the correct counts for the header section, do you know?

We're monitoring in real-time the 2xx, 3xx, and 5xx codes. Knowing the total percentage is great, but I really need to know the last 30 seconds so I can catch a problem in real time.

TBH, having used a lot of o11y the past decade, I'd seriously consider a different approach. apachetop requires you to monitor interactively and manually. Using something like open telemetry, loki, prometheus, dynatrace, vector, elk (or whatever tickles your fancy) would allow you to automatically monitor your response code, plus react to them accordingly (send notifications, run scripts, spawn a new container/node, ...).

Apachetop and goaccess are probably rather used for ad-hoc monitoring, which is still nice and important at times, but you don't want to pay a person for manually monitoring 24/7. ;-)

Jun 13 '25 17:06 tessus

We're doing about 15 hits per second. Here is a sample of the logs:

199.255.45.4 - - [08/Jun/2025:04:17:59 -0700] "GET /rocky/9.6/BaseOS/x86_64/os/repodata/888528d8-5feb-4d11-99dd-2c7dcea36852-PRIMARY.xml.gz HTTP/1.1" 200 1642450 "-" "libdnf (Rocky Linux 9.3; generic; Linux.x86_64)"
199.255.45.4 - - [08/Jun/2025:04:17:59 -0700] "GET /rocky/9.6/BaseOS/x86_64/os/repodata/7a4bbdb1-d938-4ac2-a5d3-20fd9cf6d73f-FILELISTS.xml.gz HTTP/1.1" 200 690421 "-" "libdnf (Rocky Linux 9.3; generic; Linux.x86_64)"
20.171.207.232 - - [08/Jun/2025:04:17:59 -0700] "GET /rocky/8.10/Devel/aarch64/os/Packages/l/libsrtp-1.5.4-8.el8.aarch64.rpm HTTP/1.1" 200 286480 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)"
34.75.193.86 - - [08/Jun/2025:04:17:59 -0700] "GET /centos/7.9.2009/extras/x86_64/repodata/repomd.xml HTTP/1.1" 404 196 "-" "urlgrabber/3.10 yum/3.4.3"
199.255.45.4 - - [08/Jun/2025:04:17:59 -0700] "GET /rocky/9.6/BaseOS/x86_64/os/repodata/7a4bbdb1-d938-4ac2-a5d3-20fd9cf6d73f-GROUPS.xml HTTP/1.1" 200 289652 "-" "libdnf (Rocky Linux 9.3; generic; Linux.x86_64)"

This should be pretty standard "combined" format.

Jun 13 '25 18:06 scottchiefbaker

I've looked into it, but couldn't find the problem. I'm still on it. I have not forgotten. It's just that during these hot days my brain doesn't work properly.

Jul 17 '25 03:07 tessus

I appreciate you investigating

Jul 17 '25 15:07 scottchiefbaker