Performance Bug: RTKBase Web Service Creates D-Bus Message Flood (400+ messages/sec)
Summary
The RTKBase web service (rtkbase_web.service) creates an excessive D-Bus message flood that can overwhelm system resources on Raspberry Pi devices, causing high CPU usage and system instability.
Environment
- Device: Raspberry Pi 2 Model B
- OS: Raspberry Pi OS (Debian 12 bookworm)
- RTKBase Version: 2.6.3
- Python Version: 3.11.2
- D-Bus Message Bus Daemon Version: 1.14.10
- Affected Component: web_app/server.py and web_app/ServiceController.py
Problem Description
The RTKBase web application polls systemd service status every second using pystemd, creating thousands of new D-Bus connections per second. This causes:
- 1,000-3,000+ actual D-Bus messages per second (normal is 10-50/sec)
- 15,000-20,000+ lines of D-Bus output per second (due to multi-line message structure)
- High CPU usage from dbus-daemon (40%+) and polkitd (10%+)
- System performance degradation on resource-constrained devices
- Potential system instability due to D-Bus bus saturation
Evidence
Quantitative Analysis
sudo timeout 10 dbus-monitor --system | grep -E '^(method call|method return|signal|error)' | wc -l
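To turn a capture like the one above into a rate, the message header lines can be counted and divided by the capture duration. The following is a minimal sketch, assuming the text format where each message begins with `method call`, `method return`, `signal`, or `error` at the start of a line (the same assumption the grep pattern makes). The `message_rate` helper and the sample lines are hypothetical, not taken from dbus_analysis.sh.

```python
import re

# Each message starts with one of these keywords at column 0; the indented
# argument lines that follow belong to the same message and are not counted.
HEADER = re.compile(r"^(method call|method return|signal|error)\b")

def message_rate(lines, duration_s):
    """Return messages per second for a capture lasting duration_s seconds."""
    count = sum(1 for line in lines if HEADER.match(line))
    return count / duration_s

# Hypothetical, shortened sample in the shape of dbus-monitor output.
sample = [
    "method call time=1.0 sender=:1.42 -> destination=org.freedesktop.systemd1",
    '   string "org.freedesktop.systemd1.Unit"',
    "method return time=1.0 sender=:1.7 -> destination=:1.42",
    '   variant       string "active"',
    "signal time=1.1 sender=org.freedesktop.DBus -> destination=:1.42",
]

print(message_rate(sample, 10))  # 3 messages over 10 s -> 0.3
```

Counting only header lines is what separates the "actual messages" figure from the much larger "lines of output" figure.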
Qualitative Analysis
The D-Bus flood shows a clear pattern:
- Rapid connection creation: Connection IDs increment from :1.4533994 to :1.4578224 in seconds
- Repetitive service queries: Same systemd units queried continuously
- UnitNew/UnitRemoved cycles: Services created and immediately destroyed
The attached script dbus_analysis.sh runs a comprehensive analysis.
System Impact
# htop showing high CPU usage
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
358 messagebus 20 0 8608 3936 3424 S 41.6 0.4 19:41.25 dbus-daemon --system
366 polkitd 20 0 48120 7624 6512 S 11.7 0.7 4:50.54 polkitd --no-debug
Root Cause Analysis
Primary Issue: Inefficient D-Bus Usage in ServiceController.py
The ServiceController class creates new D-Bus connections for every query:
# Current problematic code
class ServiceController(object):
    def __init__(self, unit):
        self.unit = Unit(bytes(unit, 'utf-8'), _autoload=True)  # New connection each time

    def isActive(self):
        # Creates new D-Bus connection
        if self.unit.Unit.ActiveState == b'active':
            return True
Secondary Issue: Excessive Polling in server.py
The manager() function polls services every second:
# Current problematic code in manager()
while True:
    if connected_clients > 0:
        updated_services_status = getServicesStatus(emit_pingback=False)  # Every 1 second
        # ...
    time.sleep(1)
Why This Creates a D-Bus Flood
- 11 services × 3 D-Bus calls per service (isActive, status, get_result) × every second = 33+ new connections/sec
- pystemd library doesn't reuse connections efficiently
- Each connection requires multiple D-Bus messages (Hello, NameAcquired, property queries, NameLost)
- Broken / misconfigured services cause additional UnitNew/UnitRemoved cycles
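The arithmetic behind the first bullet can be written out as a short sketch. The service count and calls per service come from the list above; the figure of 4 messages per connection is only an assumed lower bound (one for each message type named above), and `queries_per_second` is a hypothetical helper.

```python
def queries_per_second(services, calls_per_service, poll_interval_s):
    """Status queries issued per second by a loop polling every poll_interval_s."""
    return services * calls_per_service / poll_interval_s

current = queries_per_second(11, 3, 1)
print(current)      # 33.0 new connections per second

# Assumed lower bound: Hello, NameAcquired, one property query, NameLost.
MESSAGES_PER_CONNECTION = 4
print(current * MESSAGES_PER_CONNECTION)  # 132.0 messages per second at minimum

# Same workload polled every 10 seconds instead of every second.
print(queries_per_second(11, 3, 10))      # 3.3
```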
Proposed Solution
1. Implement Connection Reuse and Caching in ServiceController.py
Replace the current ServiceController.py with an improved version that:
- Reuses D-Bus connections via shared Manager instance
- Caches service status for 5 seconds to eliminate redundant queries
- Handles errors gracefully with safe defaults
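The caching part of this proposal could look roughly like the sketch below. It is not the code from the planned pull request: `CachedStatus` is a hypothetical name, and the real D-Bus query is replaced by an injected `fetch` callable so the example runs without pystemd.

```python
import time

class CachedStatus:
    """Cache the result of an expensive status query for ttl seconds."""

    def __init__(self, fetch, ttl=5.0, clock=time.monotonic):
        self._fetch = fetch    # callable performing the real (D-Bus) query
        self._ttl = ttl
        self._clock = clock    # injectable so the cache can be tested
        self._value = None
        self._expires = None   # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value

    def invalidate(self):
        """Force the next get() to query again, e.g. after a start or stop."""
        self._expires = None

# Demonstration with a fake clock and a fake query that counts its calls.
calls = []
fake_now = [0.0]

def fake_fetch():
    calls.append(1)
    return "active"

status = CachedStatus(fake_fetch, ttl=5.0, clock=lambda: fake_now[0])
status.get()
status.get()          # served from the cache, no second query
fake_now[0] = 6.0     # more than 5 seconds later
status.get()          # cache expired, queries again
print(len(calls))     # 2
```

With a 5-second TTL, a loop that still runs every second would issue at most one real query per service property every 5 seconds.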
2. Reduce Polling Frequency in server.py
Modify the manager() function to:
- Check services every 10 seconds instead of every second
- Send system info every 2 seconds (separate from service checks)
- Clear cache after service state changes
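One way to run two different intervals from a single one-second loop is to decide per tick which tasks are due. This is only a sketch of the scheduling idea; `due_tasks` and the task names are hypothetical, and the real manager() would call getServicesStatus() and its system-info emitter where the names appear.

```python
def due_tasks(tick, service_every=10, sysinfo_every=2):
    """Return the tasks due at a given 1-second tick of the manager loop."""
    tasks = []
    if tick % service_every == 0:
        tasks.append("services")   # would call getServicesStatus()
    if tick % sysinfo_every == 0:
        tasks.append("sysinfo")    # would emit system info to clients
    return tasks

for tick in range(11):
    print(tick, due_tasks(tick))
```

Over these 11 ticks the service check is due twice (ticks 0 and 10) and the system info six times, while odd ticks do nothing.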
3. Add Better Error Handling
- Graceful fallbacks for problematic services
- Proper exception handling in service queries
- Safe defaults when services are unavailable
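The "safe defaults" idea can be illustrated with a small wrapper. This is a sketch only: `safe_query` is a hypothetical helper, and it catches `Exception` broadly for brevity, whereas real code would more likely catch the specific error types raised by the D-Bus library.

```python
import logging

def safe_query(query, default):
    """Run query(); on any exception, log it and return default instead."""
    try:
        return query()
    except Exception as exc:
        logging.warning("service query failed: %s", exc)
        return default

def broken_query():
    # Stands in for a query against a missing or misconfigured unit.
    raise RuntimeError("unit not loaded")

print(safe_query(lambda: True, default=False))   # True
print(safe_query(broken_query, default=False))   # False
```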
Files to Modify:
- web_app/ServiceController.py - Complete rewrite with connection pooling
- web_app/server.py - Update manager() and getServicesStatus() functions
Will create a corresponding pull request.
Thank you for this in-depth report.
PR is welcome.
2. Reduce Polling Frequency in server.py Modify the manager() function to:
* Check services every 10 seconds instead of every second
However, I will keep the service check at every second, because I don't want to confuse the end user. Also, don't forget that these pystemd calls stop when no user is connected to the web interface.