EVS_GenerateEventTelemetry has a race condition
Describe the bug In the function EVS_GenerateEventTelemetry, CFE_EVS_Global.EVS_TlmPkt.Payload.MessageSendCounter is read, and then modified, in a manner which allows a race condition to increment twice and roll-over, when the desired effect is saturation.
To Reproduce Steps to reproduce the behavior:
- Go to modules/evs/fsw/src/cfe_evs_utils.c, line 520 in function EVS_GenerateEventTelemetry
- Observe that the variable is read to compare against CFE_EVS_MAX_EVENT_SEND_COUNT.
- Observe that after the comparison, a write is made, but that a competing thread may have invalidated the prior condition.
- Run cFS in a multithreaded environment under helgrind repeatedly to manifest the behavior.
Expected behavior Some kind of atomic compare and swap, or mutex protection.
Code snips
/* Increment message send counters (prevent rollover) */
if (CFE_EVS_Global.EVS_TlmPkt.Payload.MessageSendCounter < CFE_EVS_MAX_EVENT_SEND_COUNT)
{
CFE_EVS_Global.EVS_TlmPkt.Payload.MessageSendCounter++;
}
System observed on:
- Hardware
- OS: Fedora 39, amd64, kernel version 6.7.4
- Versions: CFE draco-rc5, main (as of filing this issue on 29 February, 2024)
Additional context No additional context.
Reporter Info Dominick Allen, Odyssey Space Research contractor to NASA JSC.
@dallen-osr Good catch and thanks for pointing this out! Was this caught through manual inspection or did you use a tool to catch the error? If a tool was used...where other errors detected?
I caught it with a tool - I don't remember if it was tsan (thread sanitizer) or one of valgrind's tools, but I was combing through with both of them. They are incompatible with each other, so you have to use one or the other at a time.