Data race with `rxbufstat` in several nic drivers

Open smarvonohr opened this issue 5 months ago • 5 comments

I was testing my application with GCC's ThreadSanitizer on Linux and got several warnings from the SOEM code (see below). Looking at the code, there are several places in the NIC drivers where `rxbufstat` is accessed, but no consistent locking seems to be in place for this variable. In the Linux `nicdrv.c` I've identified the following places where this variable is accessed (ignoring initialization routines):

  1. ecx_getindex: Access is protected by getindex_mutex
  2. ecx_setbufstat: No locking
  3. ecx_outframe: No locking
  4. ecx_outframe_red: Access is protected by tx_mutex
  5. ecx_inframe: Partially protected by rx_mutex

As you can see, there is either no locking at all or, where mutexes are used, they are not the same across these functions. As I understand it, it should be possible to use the SOEM library to exchange process data and access SDOs simultaneously. In this scenario these functions may be called in parallel, so they should be thread safe. Making this code thread safe with mutexes would require a lot of lock/unlock blocks around the `rxbufstat` variable, so I'm not sure this is the best way to fix it. Possibly an easier way would be to use atomics for this state.
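
To make the atomics idea concrete, here is a minimal sketch using C11 `<stdatomic.h>`. It is not SOEM code: the type `port_sketch_t`, the `BUF_*` states, `MAXBUF`, and the helper functions are all hypothetical stand-ins for the per-port `rxbufstat` array and its buffer states, and it assumes a C11 toolchain is acceptable for every supported platform.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical buffer states, standing in for the driver's real values. */
enum { BUF_EMPTY = 0, BUF_ALLOC = 1, BUF_TX = 2, BUF_RCVD = 3, BUF_COMPLETE = 4 };

#define MAXBUF 16 /* hypothetical number of frame buffers per port */

/* Hypothetical port structure: only the status array is modelled here. */
typedef struct {
    atomic_int rxbufstat[MAXBUF];
} port_sketch_t;

/* Mark every buffer as empty before any thread uses the port. */
void port_init(port_sketch_t *port)
{
    for (int i = 0; i < MAXBUF; i++)
        atomic_init(&port->rxbufstat[i], BUF_EMPTY);
}

/* Claim the first free buffer with a compare-and-swap (EMPTY -> ALLOC).
 * Returns the claimed index, or -1 if no buffer is free. Two threads can
 * never claim the same index, because only one CAS can succeed. */
int buf_claim(port_sketch_t *port)
{
    for (int i = 0; i < MAXBUF; i++) {
        int expected = BUF_EMPTY;
        if (atomic_compare_exchange_strong(&port->rxbufstat[i], &expected, BUF_ALLOC))
            return i;
    }
    return -1;
}

/* Publish a new state; release ordering makes earlier writes to the
 * associated frame buffer visible to a thread that later reads the state. */
void buf_set(port_sketch_t *port, int idx, int state)
{
    atomic_store_explicit(&port->rxbufstat[idx], state, memory_order_release);
}

/* Read the current state; pairs with the release store in buf_set(). */
int buf_get(port_sketch_t *port, int idx)
{
    return atomic_load_explicit(&port->rxbufstat[idx], memory_order_acquire);
}

/* Move a buffer from one state to another only if it is still in `from`.
 * Returns true if this caller performed the transition. */
bool buf_transition(port_sketch_t *port, int idx, int from, int to)
{
    int expected = from;
    return atomic_compare_exchange_strong(&port->rxbufstat[idx], &expected, to);
}
```

Atomics like this would only remove the race on the status word itself. Whether the frame data in the buffers is also safe depends on every reader and writer following the state transitions, which would need to be checked against the actual driver code.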

Example ThreadSanitizer output:

  Write of size 4 at 0x7bd000005fe8 by thread T15:
    #0 ecx_inframe /workdir/SOEM/oshw/linux/nicdrv.c:400 (libsoem.so+0x2902c)
    #1 ecx_waitinframe_red /workdir/SOEM/oshw/linux/nicdrv.c:481 (libsoem.so+0x294a0)
    #2 ecx_waitinframe /workdir/SOEM/oshw/linux/nicdrv.c:556 (libsoem.so+0x298d4)
    #3 ecx_receive_processdata_group /workdir/SOEM/soem/ethercatmain.c:1935 (libsoem.so+0x2436f)

  Previous write of size 4 at 0x7bd000005fe8 by thread T14 (mutexes: write M0, write M1, write M2):
    #0 ecx_inframe /workdir/SOEM/oshw/linux/nicdrv.c:437 (libsoem.so+0x29388)
    #1 ecx_waitinframe_red /workdir/SOEM/oshw/linux/nicdrv.c:481 (libsoem.so+0x294a0)
    #2 ecx_srconfirm /workdir/SOEM/oshw/linux/nicdrv.c:593 (libsoem.so+0x29998)
    #3 ecx_FPRD /workdir/SOEM/soem/ethercatbase.c:325 (libsoem.so+0x7123)
    #4 ecx_mbxreceive /workdir/SOEM/soem/ethercatmain.c:1029 (libsoem.so+0x2110d)
    #5 ecx_SDOwrite /workdir/SOEM/soem/ethercatcoe.c:371 (libsoem.so+0x955f)

smarvonohr, Sep 05 '24 13:09