edgetx icon indicating copy to clipboard operation
edgetx copied to clipboard

FrSky X10S Express: dangerous spike on channel associated with RV stick using ELRS 900 MHz modules on EdgeTX V2.11 (main branch)

Open robustini opened this issue 1 year ago • 26 comments

Is there an existing issue for this problem?

  • [X] I have searched the existing issues

What part of EdgeTX is the focus of this bug?

Transmitter firmware

Current Behavior

Unfortunately this is not at all nice, having channel spikes going to full scale on RV stick means for the pilot in MODE1 to have the throttle going to full randomly. Fortunately we have ascertained that this only happens using firmware compiled from the main branch, while version 2.10.x is not affected for now. Tests were performed with two FrSky Horus X10S Express and EMAX Aeris, FrSky R9M and Radiomaster Nomad modules at 900 MHz with ELRS V3.5.1. No problems were encountered with any 2.4 GHz ELRS module tested here (BetaFPV and EMAX). Definitely related to an RF problem that apparently exist with 2.11, so some commit emphasized this anomaly. In my case even at 10 mW power of the R9M module the problem is evident, with the Nomad it appears with much higher power. Some videos showing the problem:

https://github.com/user-attachments/assets/c5fed020-f9bf-4ba3-8dff-94e7c14aab09

https://github.com/user-attachments/assets/312a2fec-c903-4713-8a9c-318a4a6ec84e

Expected Behavior

Obviously the analog hall sensor on RV maybe has something at the connectivity level on the main board or something else that makes it more sensitive to RF noise but the fact remains that with EdgeTX V2.10.x this thing does not happen. I have tried swapping the two FrSky M10 gimbals and the problem still remains on RV.

Steps To Reproduce

The posted videos show the procedure, in my case CH2 starts to show spikes only when the rx and module start to colloquy, so when RF comes into play. The problem is highlighted by using in ELRS 16 channels with telemetry ratio at 1:2.

Version

Nightly (Please give date/commit below)

SHA 34856136c9eb54ca7fcb0b9d547c485bb9972841

Transmitter

FrSky X10 Express / X10S Express (ACCESS)

robustini avatar Oct 14 '24 05:10 robustini

I don't think you have proven in any way that a problem exist in 2.11. You have proven through your research that you have hardware based RF issue, that's the bottom line. They do manifest in 2.11 (more ? Given there are plenty of elrs settings where this doesn't happen, it cannot be excluded that some exist in earlier version that you haven't seen, it all a question of timing), maybe so, likely due to timing between radio processing and RF perturbation, it still doesn't make it a software issue warranting software fixing.

The only true fixing need to be a to find how to shield your radio against 900MHz RF spikes, imho

3djc avatar Oct 14 '24 06:10 3djc

@3djc certainly on the shielding problem I can agree with you, this TX apparently in that section has been poorly maintained and there is a hardware problem. But if the software version emphasizes it a minimum of investigation should be done in my opinion, i mean definitely on the hardware but also on the software. If then it is not possible we will change TX or stay at 2.10, problem not solved but curbed. My issue is to prevent someone from one day tomorrow going the throttle of a plane to full throttle unintentionally on the ground, and perhaps with hands on the propeller.

robustini avatar Oct 14 '24 06:10 robustini

I repeated the same tests on Radiomaster TX16S and FrSky X12S and no problems found. So absolutely agree that the X10S bickers with the RF of the ELRS modules at 900 MHz. But curiosity remains: what in 2.11 makes this thing pop up? Why is it absolutely stable with 2.10? Is it possible to fix it via software? If we can't via software how to solve it at the hardware level? I am not an ostrich who sticks my head in the sand when faced with a problem, if possible I look for a solution (if there is a possible solution).

robustini avatar Oct 14 '24 06:10 robustini

If this is a RF issue, which it entirely appears to be, the only way to resolve it - other than using a magic version of firmware that just happens to with that exact hardware layout, environmental factors (temperature, humidity, etc) and probably even compiler version to generally not exhibit any symptoms - is to shield the RF section from the rest of the radio electronics... ie. insulated copper/aluminium shield, or wrap some shielding around the gimbal wiring. IIRC, I have seen other users do exactly this with the X12... going as far as to basically foil a large portion of the inner shell.

pfeerick avatar Oct 14 '24 07:10 pfeerick

Yes Peter, the intention of this report is obviously to warn and find a solution if possible, not to say "2.11 is no good" but "2.11 with this hardware combo is potentially dangerous", this is the gist of the issue. If we know the problem we can avoid it. And if anyone is able to find a solution, please come forward!

robustini avatar Oct 14 '24 07:10 robustini

If it is possible to isolate at which commit in 2.11 the change happens it might be possible to identify why it behaves differently.

But this will take e lot of compiling different versions and thorough testing.

philmoz avatar Oct 14 '24 08:10 philmoz

If it is possible to isolate at which commit in 2.11 the change happens it might be possible to identify why it behaves differently.

But this will take e lot of compiling different versions and thorough testing.

That's what I was trying to do, at the moment I went backwards to August 1 and the problem is still there. Since 2.10 was released at the end of August it's likely that that commit maybe with 2.11 is still being ignored. I continue the hunt, I'm pretty sure I can find it, the problem occurs immediately.

robustini avatar Oct 14 '24 08:10 robustini

Er... 2.10.0 was released May 12, but would have forked off of main some time before that. If you are building the code yourself, you should be to narrow down to the commit it changes relatively quickly using git bisect (while also making sure to run git submodule update --init --recursive each step to keep the submodules properly in sync).

While it's not the sort of thing you normally want, having so obvious an ... issue ... is sometimes helpful!

pfeerick avatar Oct 14 '24 08:10 pfeerick

2.10.x has been forked the 28. March 2024 image

neuhausj avatar Oct 14 '24 08:10 neuhausj

2.10.x has been forked the 28. March 2024

Then the hunt gets much more complicated; 2.10 has 474 commits behind main.

robustini avatar Oct 14 '24 09:10 robustini

2.10.x has been forked the 28. March 2024

Then the hunt gets much more complicated; 2.10 has 474 commits behind main.

Using bisect it should take about 9 tests.

ajjjjjjjj avatar Oct 14 '24 09:10 ajjjjjjjj

2.10.x has been forked the 28. March 2024

Make that 28 Feb ;) ... 315104c2ba9ee61edf183bed283d45f9a6c856ae should be the first unique commit for 2.10, and 57c7aca2f686a340b6ea6af7f5fcedfbae4425b3 the first common commit for 2.10 and 2.11

i.e. using the latest commit to main as the bad reference, and assuming (but would obviously need to be tested also that the common commit to both 2.10 and 2.11 is still good)

git bisect start
git bisect bad dd482f4246f09f140e54325f81d66f6edf49038f
git bisect good 57c7aca2f686a340b6ea6af7f5fcedfbae4425b3
Bisecting: 236 revisions left to test after this (roughly 8 steps)

git bisect in use => https://dev.to/alvesjessica/using-git-bisect-to-find-the-faulty-commit-25gf

pfeerick avatar Oct 14 '24 09:10 pfeerick

Understood the logic of operation, convenient though....

git bisect good 57c7aca2f686a340b6ea6af7f5fcedfbae4425b3 Bisecting: 236 revisions left to test after this (roughly 8 steps) warning: unable to rmdir 'radio/src/thirdparty/lvgl': Directory not empty warning: unable to rmdir 'radio/src/thirdparty/stb': Directory not empty [b644aadbaa8067640b6fddc0f37e132b0e566c68] fix(cpn): binary import of multipos switches (#5169)

So i think git submodule update --init --recursive is obviously needed at each step.

robustini avatar Oct 14 '24 11:10 robustini

I will try to tell you about it, I hope you believe me but I have repeated the test countless times on each firmware, setting the module to 500 mW. This is the list of firmware tested with the related SHA in the name:

1_b644aadbaa8067640b6fddc0f37e132b0e566c68.bin 2_4a0308a46fe02edb12df3272e64160dbc27fb7c5.bin 3_982660e8c8b15b48d5cd8fbbca88497e88bf7600.bin 4_6e706c4b1cc9d7568181bda22e08bdd52ba0f773.bin 5_471f87fc2a4eec9b127d9ff125645d420c33e785.bin 6_a4d93c693160076583fd80cf1126ba7124a54259.bin 7_fcc75bd6b74293f5cdb1802dbed94e335e2b0344.bin 8_49734e433693a9d7853fa875bb602c2bdae20851.bin 9_34856136c9eb54ca7fcb0b9d547c485bb9972841.bin

Do you want to have a laugh? The problem does that with everyone. Then I try again the firmware compiled from branch 2.10 and 100% the problem is not there. So now? Is this something that is so old that therefore it will not be considered in future releases? Possible. However it is boring because I will not be able to constantly test the main with this tx.

robustini avatar Oct 14 '24 12:10 robustini

I could continue the hunt by marking the first one on the list as bad and try to go way back, to 2.9, let's go.

robustini avatar Oct 14 '24 12:10 robustini

Did you try compiling 2.10 before or did you use the builds provided by us? Maybe it is an issue with your compiler version

gagarinlg avatar Oct 14 '24 14:10 gagarinlg

Did you try compiling 2.10 before or did you use the builds provided by us? Maybe it is an issue with your compiler version

Compilation problems? I don't think so, I always compile 2.10.5 myself and it works perfectly. I compile these radio firmwares I believe from early versions of OpenTX for Taranis X9D. Then of course we can always be wrong, but I rule it out.

robustini avatar Oct 14 '24 16:10 robustini

I don't think 'problem' is the right word. As it is all about timing, different compilers can produce various levels of optimisation and therefore affect timings. This is also why it so difficult to try to fight this types issues at code level

3djc avatar Oct 14 '24 16:10 3djc

I don't think 'problem' is the right word. As it is all about timing, different compilers can produce various levels of optimisation and therefore affect timings. This is also why it so difficult to try to fight this types issues at code level

The test on the other X10S EXpress was done by another EdgeTX developer, I don't think we are all compiling wrong. I agree about “problem,” not perhaps the correct word, more simply anomaly caused by possible multiple factors. I have stopped the hunt, awaiting more feedback if there is any when 2.11 is released.

robustini avatar Oct 14 '24 17:10 robustini

I never said you were compiling wrong, but we alread had issues with different compiler versions, like vastly different sizes of the binary and other stuff, so it is good to be sure that you did all tests using the same compiler version.

gagarinlg avatar Oct 14 '24 18:10 gagarinlg

I never said you were compiling wrong, but we alread had issues with different compiler versions, like vastly different sizes of the binary and other stuff, so it is good to be sure that you did all tests using the same compiler version.

I usually compile 2.10.x firmware myself and there is not this problem, it is not related to the compiler.

robustini avatar Mar 19 '25 11:03 robustini

I never said you were compiling wrong, but we alread had issues with different compiler versions, like vastly different sizes of the binary and other stuff, so it is good to be sure that you did all tests using the same compiler version.

I usually compile 2.10.x firmware myself and there is not this problem, it is not related to the compiler.

I had the same problem with ch3 in mode1.

Is there a Solution?

Sven

Fiatx19 avatar May 25 '25 09:05 Fiatx19

Possibly the same as #6083. Fix has been cherry-picked on 2.11 branch (see 4ddafc1fdfccd74529399ebc1072cc575c335389 and #6283) and will be part of v2.11.2.

raphaelcoeffic avatar May 25 '25 09:05 raphaelcoeffic

Great, glad to see I didn't dream it up! :P

robustini avatar May 25 '25 10:05 robustini

Great, glad to see I didn't dream it up! :P

Did you test the fix? Does it work for you?

raphaelcoeffic avatar May 25 '25 10:05 raphaelcoeffic

Great, glad to see I didn't dream it up! :P

Did you test the fix? Does it work for you?

I will try tomorrow, I had stayed with 2.10.x precisely because of the non-resolution this problem.

robustini avatar May 25 '25 11:05 robustini

Also relevant: https://github.com/EdgeTX/edgetx/issues/6404

@robustini did you re-test? In case the detection is failing for you as well, you might wanna try to firmware I posted in https://github.com/EdgeTX/edgetx/issues/6404#issuecomment-3219672970.

raphaelcoeffic avatar Aug 25 '25 13:08 raphaelcoeffic

Also relevant: #6404

@robustini did you re-test? In case the detection is failing for you as well, you might wanna try to firmware I posted in #6404 (comment).

No, sorry mate, I'm still on 2.10.6 due to problems with Yaapu and a lack of sensors. 2.11 has broken a lot of things in this regard. But I already have a MicroSD with 2.11 ready, so I can try again.

robustini avatar Aug 25 '25 19:08 robustini

Fixed in #6550.

robustini avatar Aug 29 '25 19:08 robustini