SDO read fails on retries

Open open04 opened this issue 1 year ago • 3 comments

I created a separate function for reading and writing SDOs, with retries added. Here's a snippet of my code:

    while(0 < n_retries)
    {
      if(ec_SDOread(slv_num, idx_num, subidx_num, FALSE, &size, &value, EC_TIMEOUTRXM) > 0)
      {
        printf(" Success");
        break;
      }
      n_retries--;
      printf(" Retry");
    }

Most of the time ec_SDOread succeeds without any retries, but sometimes it fails, and once it does, every retry fails as well. I tested with n_retries set to 200 and it still failed. Increasing the timeout doesn't help either.

Here is the Wireshark log (the relevant part starts at frame 7620): usdo_ret200_.zip

For now, my solution is to add a delay after every failed attempt:

    while(0 < n_retries)
    {
      if(ec_SDOread(slv_num, idx_num, subidx_num, FALSE, &size, &value, EC_TIMEOUTRXM) > 0)
      {
        printf(" Success");
        break;
      }
      n_retries--;
      osal_usleep(10000);
      printf("Retry");
    }

This works well, but I don't know why. Does SDO access also need synchronization?
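The retry-with-delay pattern above can be factored into a small helper. What follows is only a minimal sketch: `sdo_retry`, the callback types, and the `fake_*` stand-ins are hypothetical names introduced here so that the code compiles and runs without SOEM or hardware. They are not part of the SOEM API.

```c
#include <stddef.h>

/* Hypothetical callback types (not SOEM API), used so the sketch builds
 * without the library. op returns a work counter; > 0 means success. */
typedef int (*sdo_op_fn)(void *ctx);
typedef void (*sleep_us_fn)(unsigned int usec);

/* Call op up to max_attempts times, sleeping delay_us after each failed
 * attempt except the last. Returns the last work counter and stores the
 * number of calls made in *attempts_used (if not NULL). */
static int sdo_retry(sdo_op_fn op, void *ctx, int max_attempts,
                     unsigned int delay_us, sleep_us_fn sleep_us,
                     int *attempts_used)
{
    int wkc = 0;
    int attempt = 0;

    while (attempt < max_attempts)
    {
        wkc = op(ctx);
        attempt++;
        if (wkc > 0)
        {
            break;
        }
        if (attempt < max_attempts)
        {
            sleep_us(delay_us); /* give the slave time before asking again */
        }
    }
    if (attempts_used != NULL)
    {
        *attempts_used = attempt;
    }
    return wkc;
}

/* Stand-ins for demonstration only: a "slave" that fails a fixed number
 * of times before answering, and a sleep that just records the total. */
struct fake_slave { int failures_left; int calls; };

static int fake_read(void *ctx)
{
    struct fake_slave *s = ctx;
    s->calls++;
    if (s->failures_left > 0)
    {
        s->failures_left--;
        return 0;
    }
    return 1;
}

static unsigned long slept_total_us = 0;

static void fake_sleep(unsigned int usec)
{
    slept_total_us += usec;
}
```

In real code, `op` would presumably wrap the `ec_SDOread(...)` call from the snippet and `sleep_us` would wrap `osal_usleep`. The 10000 us delay is the value that happened to work here and is likely slave dependent.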

open04 avatar Jun 18 '24 08:06 open04

I also instrumented ecx_SDOread to print the work counter returned by ecx_mbxreceive and ecx_mbxsend when the error occurs.

    int ecx_SDOread(ecx_contextt *context, uint16 slave, uint16 index, uint8 subindex,
                   boolean CA, int *psize, void *p, int timeout)
    {
       ec_SDOt *SDOp, *aSDOp;
       uint16 bytesize, Framedatasize;
       int wkc;
       int32 SDOlen;
       uint8 *bp;
       uint8 *hp;
       ec_mbxbuft MbxIn, MbxOut;
       uint8 cnt, toggle;
       boolean NotLast;

       ec_clearmbx(&MbxIn);
       /* Empty slave out mailbox if something is in. Timeout set to 0 */
       wkc = ecx_mbxreceive(context, slave, (ec_mbxbuft *)&MbxIn, 0);
       printf("1 %d \n", wkc);

       ec_clearmbx(&MbxOut);
       (...)
       /* send CoE request to slave */
       wkc = ecx_mbxsend(context, slave, (ec_mbxbuft *)&MbxOut, EC_TIMEOUTTXM);
       printf("2 %d \n", wkc);

       if (wkc > 0) /* succeeded to place mailbox in slave ? */
       {
          /* clean mailboxbuffer */
          ec_clearmbx(&MbxIn);
          /* read slave response */
          wkc = ecx_mbxreceive(context, slave, (ec_mbxbuft *)&MbxIn, timeout);
          printf("3 %d \n", wkc);
          (...)

Here is the result (seq is the number printed at each step, wkc is the returned work counter):

    seq   wkc
    1     -5
    2      1
    3     -2

    1     -2
    2      0

    1     -2
    2      0

    1     -2
    2      0

    1     -2
    2      0

    1     -2
    2      0
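To make such a log easier to read, the negative values can be mapped to names. The sketch below assumes the return codes are defined as I recall them from SOEM's ethercattype.h (EC_NOFRAME = -1, EC_OTHERFRAME = -2, EC_ERROR = -3, EC_SLAVECOUNTEXCEEDED = -4, EC_TIMEOUT = -5). Please verify against the header of the version you are using. The enum and the functions `wkc_name` and `print_wkc` are hypothetical helpers, not SOEM functions.

```c
#include <stdio.h>

/* ASSUMED values, as recalled from SOEM's ethercattype.h. They are
 * redefined locally so the sketch builds without the library. */
enum assumed_wkc_code
{
    ASSUMED_EC_NOFRAME            = -1,
    ASSUMED_EC_OTHERFRAME         = -2,
    ASSUMED_EC_ERROR              = -3,
    ASSUMED_EC_SLAVECOUNTEXCEEDED = -4,
    ASSUMED_EC_TIMEOUT            = -5
};

/* Map a returned work counter to a readable label. */
static const char *wkc_name(int wkc)
{
    if (wkc > 0)
    {
        return "ok";
    }
    switch (wkc)
    {
    case 0:                             return "nothing processed";
    case ASSUMED_EC_NOFRAME:            return "EC_NOFRAME";
    case ASSUMED_EC_OTHERFRAME:         return "EC_OTHERFRAME";
    case ASSUMED_EC_ERROR:              return "EC_ERROR";
    case ASSUMED_EC_SLAVECOUNTEXCEEDED: return "EC_SLAVECOUNTEXCEEDED";
    case ASSUMED_EC_TIMEOUT:            return "EC_TIMEOUT";
    default:                            return "unknown";
    }
}

/* Same format as the debug prints above, with the label appended. */
static void print_wkc(int step, int wkc)
{
    printf("%d %d (%s)\n", step, wkc, wkc_name(wkc));
}
```

If those assumed values hold, the -5 in the first pass would be a timeout and the repeated -2 an unexpected frame, while the 0 from ecx_mbxsend means the request was not placed in the slave's mailbox.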

I am not familiar with memset, but is it possible that ec_clearmbx(&MbxIn); is not working properly at this point?

As with my solution above (adding a delay after every retry), I also tried putting the delay inside ecx_SDOread(), before ec_clearmbx, and that works too.

    int ecx_SDOread(ecx_contextt *context, uint16 slave, uint16 index, uint8 subindex,
                   boolean CA, int *psize, void *p, int timeout)
    {
       ec_SDOt *SDOp, *aSDOp;
       uint16 bytesize, Framedatasize;
       int wkc;
       int32 SDOlen;
       uint8 *bp;
       uint8 *hp;
       ec_mbxbuft MbxIn, MbxOut;
       uint8 cnt, toggle;
       boolean NotLast;

       osal_usleep(10000);
       ec_clearmbx(&MbxIn);
       /* Empty slave out mailbox if something is in. Timeout set to 0 */
       (...)

open04 avatar Jun 19 '24 05:06 open04

Upon further investigation, this happens because my process data cycle time is 1000 us, which is much shorter than the 5000 us used in simple_test. Still, the fix for this problem is to add a delay before the next SDO read or write.
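One way to apply that delay only when it is actually needed is to enforce a minimum gap between consecutive mailbox requests. This is a rough sketch under my own assumptions: `sdo_gap_remaining_us` is a hypothetical helper (not part of SOEM), and the timestamps are expected to come from any monotonic microsecond clock.

```c
#include <stdint.h>

/* Return how many microseconds are still to be waited so that two
 * consecutive mailbox requests are at least min_gap_us apart.
 * last_request_us and now_us are timestamps from a monotonic clock. */
static uint32_t sdo_gap_remaining_us(uint64_t last_request_us,
                                     uint64_t now_us,
                                     uint32_t min_gap_us)
{
    uint64_t elapsed;

    if (now_us < last_request_us)
    {
        /* Timestamps out of order: be conservative and wait the full gap. */
        return min_gap_us;
    }
    elapsed = now_us - last_request_us;
    if (elapsed >= min_gap_us)
    {
        return 0; /* enough time has passed, no delay needed */
    }
    return (uint32_t)(min_gap_us - elapsed);
}
```

The caller would sleep for the returned duration (for example with osal_usleep) before the next ec_SDOread or ec_SDOwrite, then record the new request time. A gap of 10000 us matches the delay that worked in my tests. Other slaves may need a different value.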

open04 avatar Jun 23 '24 02:06 open04

You are simply trying to get a response from the slave faster than it can handle. This is slave dependent.

A proper slave firmware implementation should not have these issues. The slave loses its proper mailbox state and is then stuck.

Slowing down the requests as you have done above does help. But it does not solve the inherent issue in the slave.

ArthurKetels avatar Jun 29 '24 19:06 ArthurKetels

Thanks Arthur, the delay should do the trick for now.

open04 avatar Jul 06 '24 14:07 open04