Error handling in pyxcp: why repeat "forever"?

Open SaxElectronics opened this issue 8 months ago • 6 comments

Hi all,

I am using pyxcp successfully to communicate with a slave, and I have many things up and running: calibration, polling, and static DAQ list handling. I mention this only as background to my question, which is about error handling.

When the XCP slave does not respond for any reason, pyxcp repeats many commands forever. Why is it defined like this? It blocks the port, and each time I have to kill the Python application before I can reconnect.

I debugged a bit and found the repeater. It says:

class Repeater:
    """A required action of some XCP errorhandler is repetition.

    Parameters
    ----------
    initial_value: int
        The actual values are predetermined by XCP:
            - REPEAT (one time)
            - REPEAT_2_TIMES (two times)
            - REPEAT_INF_TIMES ("forever")
    """

    REPEAT = 1
    REPEAT_2_TIMES = 2
    INFINITE = -1

What does "predetermined by XCP" mean? Is it hard-defined in the standard that, for example, master.connect() should be repeated "forever"?

Is there a configuration option to define how many times each command is repeated?

In my case, repeating a command "forever" without receiving any response ends in a deadlock.

Can I configure the error handling (retries) through a configuration, or do I have to patch the code?
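
For illustration, the repeat counts quoted above can be modelled as a small counter that a retry loop consults after every timeout. This is only a minimal sketch, not pyxcp's actual implementation: `BoundedRepeater`, `run_with_retries`, and the `max_infinite` cap are hypothetical names introduced here. The cap shows one way an "infinite" policy could be bounded so that an unresponsive slave cannot block the application.

```python
class BoundedRepeater:
    """Hypothetical counter deciding whether a command may be repeated."""

    REPEAT = 1
    REPEAT_2_TIMES = 2
    INFINITE = -1

    def __init__(self, initial_value, max_infinite=None):
        # max_infinite is an assumption of this sketch: it caps the
        # "forever" policy at a finite number of repetitions.
        if initial_value == self.INFINITE and max_infinite is not None:
            initial_value = max_infinite
        self._remaining = initial_value

    def repeat(self):
        """Return True if one more repetition is allowed."""
        if self._remaining == self.INFINITE:
            return True
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False


def run_with_retries(command, repeater):
    """Call command(); on TimeoutError, repeat while the repeater allows it."""
    while True:
        try:
            return command()
        except TimeoutError:
            if not repeater.repeat():
                raise  # give up and hand the timeout to the caller
```

With `BoundedRepeater(BoundedRepeater.INFINITE, max_infinite=3)`, a command that never gets an answer is attempted once plus three repetitions, and then the `TimeoutError` reaches the caller instead of looping forever.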

SaxElectronics avatar May 04 '25 19:05 SaxElectronics

@SaxElectronics

Yes, repeat forever is indeed part of the standard, e.g.:

Image

At least for CONNECT there is currently a parameter (it would not be too difficult to make it a general parameter): c.General.connect_retries = 3

Regarding static DAQ lists: the DAQ list functionality currently supports dynamic DAQ only, but I think skipping the allocation procedure should be sufficient for static DAQ lists. Let me know if I should add a patch.

christoph2 avatar May 05 '25 08:05 christoph2

@christoph2 Thanks for the insight! My problem was that after unplugging or disconnecting, the driver keeps on requesting forever, and when the slave has malfunctioning XCP driver code and does not answer, the XCP master keeps on reconnecting forever. This does not make much sense in my use case, and it also blocked the Python execution. A reconnection makes sense (at least for me) only when triggered manually by the user. I have completely disabled the retry mechanisms and it is working fine now.

Regarding the static DAQ lists: everything is working for me now because I have some slight patches to pyxcp that enable static DAQ list handling. On top of the pyxcp Master class I have added my own XCP driver class, which interacts with the master (it handles the low-level XCP processing). I had to add the decoding and sorting, and it works fine now. Maybe at some point I will transition fully to the latest updates in pyxcp, but there is a learning effort to overcome first.

Reasons I had to implement the DAQ list handling on my own: 1) the feature was not available back then, and 2) I define the DAQ lists and measurements in a slightly different way; static DAQ lists also have to track the fill percentage, etc.

Example of a test log with static daq list handling and some test variables:

2025-05-05 11:40:02 [ INFO] Successfully popped 200 values from Inv_CtrlLoopTestVarUint8 buffer. (daqlist.py:488) 2025-05-05 11:40:02 [ INFO] Retrieved values for Inv_CtrlLoopTestVarUint8: [191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191, 191] (test_xcp.py:570)

SaxElectronics avatar May 05 '25 09:05 SaxElectronics

@christoph2 I have another observation where you can maybe help me and save me a lot of debugging time. The XCP master hangs when I try:

result = master.connect()
result = master.disconnect()
result = master.connect()

During the second connect I do not get any response (but I think the slave does not receive the command, so the issue is not in the slave).

Here is the error log:

DEBUG:pyxcp.transport.Base:Python-CAN driver: slcan - unknown]
DEBUG:pyxcp.transport.Base:CONNECT
DEBUG:pyxcp.transport.Base:-> [ff 00 00 00 00 00 00 00]
DEBUG:pyxcp.transport.Base:<- L8 C0 [ff 05 c0 08 08 00 01 01]
DEBUG:pyxcp.transport.Base:DISCONNECT
DEBUG:pyxcp.transport.Base:-> [fe 00 00 00 00 00 00 00]
DEBUG:pyxcp.transport.Base:<- L8 C1 [ff 00 00 00 00 00 00 00]
DEBUG:pyxcp.transport.Base:CONNECT
DEBUG:pyxcp.transport.Base:-> [ff 00 00 00 00 00 00 00]
ERROR:pyxcp.errorhandler:XcpTimeoutError [Response timed out (timeout=3.0s)]
WARNING:pyxcp.errorhandler:This part is patched manually to deactivate the repeater.
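
As a reading aid, the first CONNECT response in the log, [ff 05 c0 08 08 00 01 01], can be decoded by hand. The sketch below assumes the usual layout of a positive CONNECT response as I understand it from the XCP protocol layer specification (RESOURCE, COMM_MODE_BASIC, MAX_CTO, MAX_DTO, protocol and transport layer versions). `decode_connect_response` is a hypothetical helper, not part of pyxcp, and the bit assignments should be checked against the standard before relying on them.

```python
import struct

# Assumed RESOURCE bit assignments (to be verified against the XCP standard).
RESOURCE_BITS = {0x01: "CAL/PAG", 0x04: "DAQ", 0x08: "STIM", 0x10: "PGM"}


def decode_connect_response(frame):
    """Decode a positive XCP CONNECT response (8 bytes, first byte 0xFF)."""
    if len(frame) < 8 or frame[0] != 0xFF:
        raise ValueError("not a positive CONNECT response")
    resource, comm_mode = frame[1], frame[2]
    # Bit 0 of COMM_MODE_BASIC selects the byte order: 0 = LSB first (Intel).
    byte_order = ">" if comm_mode & 0x01 else "<"
    (max_dto,) = struct.unpack(byte_order + "H", bytes(frame[4:6]))
    return {
        "resources": [name for bit, name in RESOURCE_BITS.items() if resource & bit],
        "byte_order": "MSB first" if comm_mode & 0x01 else "LSB first",
        "address_granularity": (comm_mode >> 1) & 0x03,
        "slave_block_mode": bool(comm_mode & 0x40),
        "optional": bool(comm_mode & 0x80),
        "max_cto": frame[3],
        "max_dto": max_dto,
        "protocol_layer_version": frame[6],
        "transport_layer_version": frame[7],
    }


# Frame taken from the log above.
info = decode_connect_response(bytes.fromhex("ff 05 c0 08 08 00 01 01"))
for key, value in info.items():
    print(f"{key}: {value}")
```

Under these assumptions the first connect succeeded normally (CAL/PAG and DAQ resources, MAX_CTO and MAX_DTO of 8 bytes), which fits the observation that only the second CONNECT goes unanswered.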

SaxElectronics avatar May 06 '25 07:05 SaxElectronics

@SaxElectronics This seems a bit tricky: currently connect() performs not only the logical XCP connection but also the transport-layer connection procedure, while disconnect() only shuts down the logical connection, which leads to confusion at some point...

This is a small design issue, I'll fix it as soon as possible.

christoph2 avatar May 06 '25 12:05 christoph2

@SaxElectronics OK, done. Transport-layer connection is now part of the context-manager:

with ap.run() as x:  # transport-layer connection happens here.
    x.connect()
    x.disconnect()

    x.connect()
    x.disconnect()
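
To make the design change easier to picture, here is a self-contained toy model of the same idea: the context manager owns the transport-layer connection, while connect() and disconnect() only toggle the logical session and can therefore be repeated. `SketchTransport`, `SketchMaster`, and this `run()` are hypothetical stand-ins written for this example. They are not pyxcp classes and do not reflect how pyxcp implements it internally.

```python
from contextlib import contextmanager


class SketchTransport:
    """Hypothetical stand-in for a transport layer (e.g. a CAN interface)."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def close(self):
        self.is_open = False


class SketchMaster:
    """Hypothetical master whose connect()/disconnect() are purely logical."""

    def __init__(self, transport):
        self.transport = transport
        self.connected = False

    def connect(self):
        if not self.transport.is_open:
            raise RuntimeError("transport layer is not open")
        self.connected = True

    def disconnect(self):
        self.connected = False


@contextmanager
def run():
    transport = SketchTransport()
    transport.open()  # transport-layer connection happens here
    master = SketchMaster(transport)
    try:
        yield master
    finally:
        if master.connected:
            master.disconnect()
        transport.close()  # transport is released once, on exit


with run() as x:
    x.connect()
    x.disconnect()

    x.connect()  # a second logical connect reuses the open transport
    x.disconnect()
```

Separating the two lifetimes this way means a second connect() never has to reopen the interface.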

christoph2 avatar May 06 '25 13:05 christoph2

thanks, will try it out and give feedback!

SaxElectronics avatar May 06 '25 15:05 SaxElectronics