trafficserver
trafficserver copied to clipboard
Fix hosting.config reload
This is an alternative to https://github.com/apache/trafficserver/pull/8664. That PR is fine in-itself and fixes a bug around the the atomic swap not actually reloading. But fixing that bug so that reloading hosting.config works exposes another deeper bug: the reloaded object has no synchronization around it.
Not only is the pointer itself never read with synchronization, even if it were, the object itself is never synchronized. So even if the pointer were atomic, the object reads could get different variables from different objects, when callers that need multiple variables really need them to come from a single object. Worse, if a request takes longer than the arbitrary delete timer, it could access freed memory.
This fixes the object to be synchronized, and all access behind a mutex. The reload swap and bad drive removal acquire a write lock, and all other accesses (none of which modify the object) acquire a read lock.
I really don't like adding a mutex to the request path. But the only other options I see are an incredibly complex and bug-prone lock-free solution, or not making it possible to reload hosting.config. As much as I dislike it, the mutex should never have contention except on reload or drive failure. So it's probably actually faster and fewer atomic instructions than even a lock-free solution.
This also adds some helper classes, to make it impossible to accidentally forget to acquire or release the mutex, or modify the object with a readlock, via RAII idioms.
Recommend backporting to 9.2.
Fixes #7220
[approve ci docs]
[approve ci autest]
[approve ci autest]
Cherry-picked to v9.2.x