[X10Express] Serial connection to the MPM is sporadically lost
Is there an existing issue for this problem?
- [X] I have searched the existing issues
What part of EdgeTX is the focus of this bug?
Transmitter firmware
Current Behavior
Two weeks ago I had my Horus X10S Express (ETX 2.9.2) with an IRange IRX+ (MPM) connected to a Blade 130S and suddenly lost all control. This has happened twice in a short space of time. Luckily nothing happened to the Blade 130S (DSMX); it just fell out of the air. The red control LED on the IRange IRX+ module then flashed; normally it is lit permanently. What has changed so that the serial connection to the MPM can no longer be kept stable? Last weekend I had the same problem again. There is definitely still a problem in the serial control of the MPM. Unfortunately, the problem occurs very sporadically.
I have now carried out further tests on the workbench. I connected the MPM to the Horus X10S Express with a Y-cable and attached a logic analyzer to it. Initially, the CPPM signal looked normal. When the error occurred almost two days later, I looked at the CPPM signal and it was constantly LOW.
Perhaps this problem only occurs on transmitters that have the fast serial connection for external ACCESS in the ext. module bay?
Expected Behavior
MPM CPPM Signal should be stable
Steps To Reproduce
- Power on the X10 Express and select a model that uses the external MPM with the DSMX protocol
- Wait (minutes, hours, or days)
- Suddenly the CPPM signal is lost (LOW) and the red LED on the iRangeX+ module starts flashing
Version
2.9.2
Transmitter
FrSky X10 Express / X10S Express (ACCESS)
Operating System (OS)
No response
OS Version
No response
Anything else?
There may be a connection to issue #4357.
I have just reproduced the problem with a Radiomaster TX16S MKII and the external IRangeX+. It occurs in exactly the same way as with the Horus X10 Express, so it has nothing to do with the high-speed serial connection for ACCESS.
The only difference is that after a few seconds the CPPM connection was automatically re-established and at that moment the Blade130S (DSMX) started up the rotors. So the whole thing is very dangerous. Addendum: This happened because I had not checked "deactivate Ch. mapping" in the settings.
With the Horus X10 Express, the connection was never re-established automatically.
@raphaelcoeffic what do you think about this?
Any chance the issue lies with your MPM module? Since 2.9.2 has been live for some time already and you seem to be the only one with this issue, I wonder.
I had also considered this at first, hence the check with the logic analyzer. I think that if it were the MPM, the correct signal should be present at the CPPM and the MPM would have the error. But the fact is that CPPM is completely LOW. In addition, I never had any problems with the module before with ETX2.9.1 and older ETX versions.
Some MPM hardware failures can short the signal pin, and that could look like this. Have you tried putting 2.9.1 back and leaving it running long enough?
@ParkerEde shot a video of his IRange IRX+ MPM showing a red LED flashing at a 0.5 s on/off interval. https://www.multi-module.org/using-the-module/troubleshooting lists this as "slow blink".
Of course it might still be a faulty MPM, but I wouldn't bet on it. @ParkerEde: do you have access to another external MPM?
I will continue to observe and report back. At the moment, however, I would say that the problem should not be coming from the MPM. Also, if the error occurs again, I can pull the CPPM signal line out of the MPM and then measure it on the module bay side. If the line is still LOW, it should be clear that the fault is not in the MPM. Do you see it the same way?
No, this is the only one.
Now I have had the problem again. I disconnected the CPPM line from the MPM and examined it with the logic analyzer. It is completely LOW. The MPM can therefore be ruled out as the cause. After measuring the CPPM line, I plugged it back into the MPM and the LED continued to flash. I left the system in this state. After about 15-20 minutes, the LED was permanently on again and the module worked again. So the CPPM line not only drops to LOW sporadically, but also comes back by itself at some point.
I have now been running the HorusX10S Express with the MPM for hours without powering on the Blade130S (DSMX). Then the error does not occur. I am now running the system with the Blade130S powered on but have not connected the S.PORT line to the MPM. I assume that the error will then not occur.
In the case that the S.PORT line is not connected to the MPM, the connection is completely maintained and everything is OK. I can now say with certainty that the telemetry transmitted to ETX via S.PORT causes the CPPM line to drop to LOW at some point (in my case always between 1 minute and 2 hours).
Something seems to be going wrong with the telemetry in ETX so that the entire CPPM connection subsequently breaks down.
These tests are all based on a Spektrum DSMX connection to a Blade130S.
And finally, here is some additional information. If the S.Port line is connected, but I deactivate telemetry in the MPM settings, the error does not occur. This means that the activated telemetry must be causing something in ETX to go wrong, which results in CPPM changing to LOW level.
I have now tested an MPM from a friend. It behaves exactly like my own.
Could you please check what is the latest version of EdgeTX that is good, and which version is bad. This can help to identify the problem.
Yes, I will try, but this will be a hard job.
Here you will find a recording of CPPM (D0) and S.PORT (D1). At the end you can see where CPPM goes LOW. Since the error definitely depends on telemetry being activated, I think it can help to see what data was sent on the S.PORT beforehand. I used sigrok PulseView for the recording. Spektrum-CPPM-SPORT-CPPMlow.zip
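For anyone who wants to locate the dropout in such a capture automatically, here is a minimal Python sketch. It assumes the D0 channel has been exported as a plain list of 0/1 samples at a known sample rate. The function name `find_stuck_low` and its parameters are hypothetical, not part of sigrok or EdgeTX. `min_low_s` should be chosen longer than any normal gap between frames, otherwise idle time on the line would be reported as a dropout.

```python
def find_stuck_low(samples, sample_rate_hz, min_low_s):
    """Return the time in seconds at which the line goes LOW and then stays
    LOW for at least min_low_s, or None if that never happens."""
    min_run = max(1, round(min_low_s * sample_rate_hz))
    run_start = None  # index where the current run of LOW samples began
    for i, level in enumerate(samples):
        if level == 0:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return run_start / sample_rate_hz
        else:
            run_start = None  # any HIGH sample ends the run
    return None


# Synthetic example: a toggling line that finally stays LOW.
healthy = ([1] * 5 + [0] * 5) * 10
capture = healthy + [0] * 200
print(find_stuck_low(capture, sample_rate_hz=1000, min_low_s=0.1))
```

The same call on the `healthy` part alone returns `None`, because no LOW run is long enough.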
With v2.8.5 the error has not yet occurred for me. I have rebuilt v2.9.0 and v2.9.1 via Gitpod for the X10 Express. With those, the error occurs within a few seconds.
So it happens in 2.9.0 as well, right?
Interesting, I used to fly my TX16S + MPM + SPM4649T with battery voltage telemetry on 2.9.0 firmware for some time and did not observe any problems. What model of receiver are you using?
I tried on 2.9.2 and then the latest nightly with a TX16S MK2 + external iRangeX MPM, about 30 minutes both times; both telemetry sensors and module telemetry were fine. DSMX 22. I will try again to see if it was a fluke. I did have to DFU flash the module first as it was non-responsive, so I must have screwed up its firmware at some earlier stage.
Another question, did the problem only affect external modules? Any internal modules confirmed to have this problem?
Only the external MPM, and only DSMX with telemetry. In my case a Blade 130S and a Blade 230 Smart. The external MPM with Frsky X2 LBT and telemetry, for example, works well.
Do you think you could capture some samples of the telemetry packets? We have an issue in the parsers that is poorly handled.
Here you'll find a trace
@raphaelcoeffic Have a close look at the LA trace. Decode it with the MPM S.PORT UART settings 100000,8,E,1 and you'll find bit errors, indicated by parity and framing errors. The reason is a HIGH level of only 1.8 V. Disconnecting S.PORT and measuring the MPM output directly shows the expected 3.3 V. It looks like the MPM can't drive a proper HIGH on the S.PORT pin. Can you think of a misconfiguration of the radio's UART (pull-up/pull-down)?
Cross-checking this with my TX16S on 2.9.2 and an RM external MPM shows proper MPM S.PORT levels with or without a connection to the radio's S.PORT pin.
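To make the parity and framing checks concrete, here is a minimal Python sketch of an 8E1 decoder working on raw 0/1 samples. It is not the sigrok decoder, only an illustration. It assumes an idle-HIGH line (set `inverted=True` if the capture is inverted) and a fixed whole number of samples per bit, for example 10 when sampling 100000 baud at 1 MHz. All names (`UartByte`, `decode_8e1`, `encode_8e1`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UartByte:
    value: int
    parity_error: bool
    framing_error: bool


def decode_8e1(samples, samples_per_bit, inverted=False):
    """Decode 0/1 samples as start bit, 8 data bits (LSB first),
    even parity bit and 1 stop bit."""
    if inverted:
        samples = [1 - s for s in samples]
    frame_bits = 11  # start + 8 data + parity + stop
    decoded = []
    i, n = 1, len(samples)
    while i < n:
        if samples[i - 1] == 1 and samples[i] == 0:  # falling edge = start bit
            # Sample every bit in the middle of its bit period.
            centres = [i + int((k + 0.5) * samples_per_bit)
                       for k in range(frame_bits)]
            if centres[-1] >= n:
                break  # incomplete frame at the end of the capture
            bits = [samples[c] for c in centres]
            data = bits[1:9]
            value = sum(bit << k for k, bit in enumerate(data))
            parity_error = (sum(data) + bits[9]) % 2 != 0  # even parity
            framing_error = bits[10] != 1                  # stop bit must be HIGH
            decoded.append(UartByte(value, parity_error, framing_error))
            i = centres[-1] + 1
        else:
            i += 1
    return decoded


def encode_8e1(value, samples_per_bit, corrupt_parity=False):
    """Build a synthetic 8E1 frame, optionally with a wrong parity bit."""
    data = [(value >> k) & 1 for k in range(8)]
    parity = sum(data) % 2
    if corrupt_parity:
        parity ^= 1
    bits = [0] + data + [parity, 1]
    return [b for b in bits for _ in range(samples_per_bit)]


idle = [1] * 20
signal = (idle + encode_8e1(0x55, 10)
          + idle + encode_8e1(0xA3, 10, corrupt_parity=True) + idle)
for byte in decode_8e1(signal, samples_per_bit=10):
    print(hex(byte.value), byte.parity_error, byte.framing_error)
```

In the synthetic signal the first byte decodes cleanly and the second is flagged with a parity error, which is the kind of warning the trace shows.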
I now have an X10S Express from my friend here, and the S.PORT level is exactly the same as mine, only 1.8V.
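One possible explanation for a 3.3 V output reading 1.8 V is a resistive divider: some series resistance on the driving side working against a pull-down on the receiving pin. This is purely an assumption for illustration. Nothing in this thread confirms that such a divider exists, and the function names and resistor values below are hypothetical.

```python
def divider_voltage(v_drive, r_series, r_pulldown):
    """Voltage at the pin when v_drive feeds it through r_series and the pin
    is pulled to ground through r_pulldown (simple resistive divider)."""
    return v_drive * r_pulldown / (r_series + r_pulldown)


def pulldown_ratio_for(v_drive, v_observed):
    """Ratio r_pulldown / r_series that would produce v_observed."""
    return v_observed / (v_drive - v_observed)


# A 3.3 V driver that reads as 1.8 V implies a ratio of about 1.2,
# for example 1.2 kOhm pull-down against 1 kOhm series resistance.
print(pulldown_ratio_for(3.3, 1.8))
print(divider_voltage(3.3, r_series=1000, r_pulldown=1200))
```

If the pull-up/pull-down configuration of the pin is the cause, changing it should move the measured HIGH level accordingly, which would be a quick way to test the idea.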
@raphaelcoeffic @3djc Here is a hypothesis about the error this issue describes, the loss of serial communication to the MPM. As the trace shows, there are a lot of UART parity and framing errors, i.e. corrupt bytes. Assume those corrupt bytes make it into an MPM telemetry frame which is then accepted as a good frame. A protocol decoder with no or insufficient means to filter out the corrupt frame might then cause follow-up issues, e.g. buffer overruns, which in turn might cause other unpredictable issues.
I have no experience with Spektrum telemetry, so I can't say whether this is a likely scenario. Can you?
Yeah, maybe some pull-up / pull-down settings, that could be. But not sure why the thing would then fully break after a while...
So two issues:
- Check pull-up/pull-down settings
- Fix parser to not crash when receiving bad packets
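As a rough illustration of a parser that tolerates bad packets, here is a minimal Python sketch. It is not EdgeTX code and does not use the real MPM or Spektrum telemetry layout. The frame format (a start byte, a length byte, then the payload), the constants `START` and `MAX_PAYLOAD`, and the class `FrameParser` are all hypothetical. The point is only the two guards: drop a partial frame as soon as the UART reports a parity or framing error, and never trust a length field beyond the buffer size.

```python
START = 0xAA       # hypothetical start-of-frame marker
MAX_PAYLOAD = 16   # hypothetical upper bound on payload length


class FrameParser:
    def __init__(self):
        self.buf = bytearray()
        self.rejected = 0  # number of frames discarded

    def _reject(self):
        self.buf.clear()
        self.rejected += 1

    def feed(self, value, line_error=False):
        """Feed one received byte. Return the payload as bytes when a frame
        completes, otherwise None."""
        if line_error:
            # Parity or framing error: the frame in progress cannot be trusted.
            if self.buf:
                self._reject()
            return None
        if not self.buf:
            if value == START:
                self.buf.append(value)
            return None  # ignore anything before a start byte
        self.buf.append(value)
        if len(self.buf) == 2:
            # Bounds-check the length field before using it.
            if not 1 <= value <= MAX_PAYLOAD:
                self._reject()
            return None
        if len(self.buf) == 2 + self.buf[1]:
            payload = bytes(self.buf[2:])
            self.buf.clear()
            return payload
        return None


parser = FrameParser()
for byte in (START, 3, 1, 2, 3):
    payload = parser.feed(byte)
print(payload)
```

A frame with an oversized length field, or one interrupted by a line error, is counted in `parser.rejected` and the parser resynchronizes on the next start byte instead of writing past its buffer.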
@raphaelcoeffic
Did you read my hypothesis? If a protocol decoder doesn't throw out corrupt frames, it's just a matter of probability if and when some data combination is hit that causes unpredictable errors, like writing to memory it shouldn't. Again, this is just a hypothesis, but it is supported by the fact that there is no loss of serial if nothing is connected to the S.Port pin (no telemetry processing) or if telemetry is disabled in the settings. It really looks like something wrong fed to the protocol decoder causes this issue.
I believe we are looking at two problem areas. One is the electrical issue; the other is the question of whether the Spektrum protocol handling is resilient against corrupt data.
Talking about probability, check out the vast number of UART warnings. They are all parity and framing errors, most likely due to the insufficient HIGH level.
Just to make sure we're all on the same page. My friend gave me an X10S Express with his IRangeX+ MPM and his Blade130S to test. So the complete test setup. I have now tested with his components and the result is exactly the same as with my own components. I am very happy that we have all parts in duplicate and can therefore completely rule out a hardware error.
@ParkerEde Did you try with all sensors deleted, meaning, you connect everything as usual, telemetry is ON, but no sensors are listed?