BlueBus
Onboard Monitor cuts out momentarily on firmware 1.1.18 and 1.3.0
I have a 2003 E39 with MKIV navigation, TV module, BM54 radio, and no DSP. The problem I'm having is that my onboard monitor display cuts out momentarily soon after I start the car. After it cuts back in, the problem doesn't recur until the next drive cycle. An INPA scan of the car's communications modules shows the following odd errors: NAV_JAP has 1 error: error 11 service mode on. SES has 116 errors. Each error shows the following: 0 Unknown error location, Error frequency 0, Sporadic error.
The screen cutout and the errors on the two communications modules occur with firmware v1.3.0 and v1.1.18, but all issues go away when BlueBus power is disconnected or if BlueBus is downgraded to v1.1.17.
Ted, I've emailed you about this issue and we've had some correspondence, but I thought this forum would be a better place to investigate it. So let me introduce myself: I'm a long time developer with several of my own open source projects out there. Very familiar with C based development but know nothing about BMW K/I-Bus (at the moment). Fascinated by this project too!
Anyway, I've cloned the repository, installed the IDE and have successfully compiled the BlueBus firmware so everything looks pretty good at this end (you are building with XC16 v1.6.0 and I have XC16 v2.00 installed - let me know if this is not ok). Currently working on understanding the code as it stands, then I'll start digging into this issue. Realize that my E39 configuration may be a bit of a unicorn which could be why the error isn't cropping up everywhere. Current plan is to add empirical data to develop a better understanding. I have navcoder which can log bus traffic so I'll also try to capture activity around the issue.
Can you please confirm, you have the Japanese Navi? I have been compiling using xc16 1.70 and latest PACK 1.8.217. The version 2.0 of XC16 did not result in runnable code for me and also the release notes mention a lot of things changed for that compiler version.
However, I will let Ted confirm which version we should all be running to get reproducible results.
I wonder if you would be willing to check out the commits one by one from 1.1.17 through 1.1.18, then compile and run each, to help us find out which of the changes introduced the error.
Definitely have NAVJ. I can query it specifically with INPA. Thanks for the compiler update; I'll fall back to xc16 v1.70. I haven't actually tried to run anything yet; only validated that it built without errors.
Commit checkout from v1.1.17 to v1.1.18 is probably something I can do. There are a bunch, but I can use a binary search testing process to narrow things down pretty quickly. In the meantime I've been looking through the code, and at first glance there doesn't seem to be anything that would cause the INPA errors I'm seeing. BlueBus doesn't even define the SES and NAVJ devices. Searching the ibus header file for their hex codes (SES is 0xB0 and NAVJ is 0xBB) doesn't turn up any matches either, which would seem to eliminate a misconfiguration. I also noticed that BMB screen blanking while driving became an option that eerily describes my onboard monitor symptom, but it appears the code wasn't released in v1.1.18. I need to check that further though. It's only ever used once, so I may comment that out and re-build v1.3.0 to see if there's any impact. It seems that the parking light code uses LCM diagnostics to handle halo light DRLs just as it does for the turn signals (nice hack btw). No reason why that would have an impact on my Navigation ECU either, since the I-Bus messages only target the LCM.
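For the commit-by-commit search between v1.1.17 and v1.1.18, a binary search keeps the number of flash-and-test cycles small. A minimal Python sketch, assuming the fault appears at one commit and persists in every later one; the commit names below are placeholders rather than real BlueBus hashes, and `is_bad` stands in for the manual build, flash, and INPA check:

```python
from typing import Callable, Sequence

def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumes commits are ordered oldest to newest, the first is known good,
    the last is known bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid   # fault already present: look earlier
        else:
            lo = mid   # still good: look later
    return commits[hi]

# Hypothetical history (placeholder names, not real commit hashes).
history = ["v1.1.17", "c1", "c2", "c3", "c4", "c5", "v1.1.18"]
tested = []

def is_bad(commit: str) -> bool:
    # Stand-in for "build, flash, run the INPA scan"; pretends c4 broke it.
    tested.append(commit)
    return history.index(commit) >= history.index("c4")

culprit = first_bad_commit(history, is_bad)
print(culprit, "found after testing", tested)
```

With seven entries this takes two or three test cycles. `git bisect` automates the same bookkeeping against the real history.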
Going through commits now... Not running anything yet but checking for IBUS_GT_MONITOR_OFF used in the project. Confirmed that the screen blank code wasn't in the 1.1.18 release commit on 11/10/2021 (d4f869d). I believe that I'll load v1.1.18 again and double check that all 3 symptoms (odd errors reported by NAVJ and SES during INPA session, and the BMB cuts out shortly after beginning a drive cycle) are happening again.
@sleuth255,
That is crazy weird. I do not in any capacity emulate the JNAV nor the SES, and I don't respond to diagnostic requests either.
Perhaps they don't like GT probing (Asking the ECU for its identification)? But that code has been around forever at this point.
Please let me know what you find. I will also compile with the newest XC16 and see how badly shit gets screwed up :)
Thanks! -Ted
@tedsalmon is this the issue we spoke about a little while back regarding VM/GT switching?
@piersholt
Nope, that was a different gentleman in New Zealand.
-Ted
Agree that BlueBus does not emulate NAVJ or SES in any way. Those devices are not even defined in ibus.h. Based on this, my current hypothesis is that the board monitor cutout and the INPA errors are not related, and that I incorrectly attributed the onboard monitor cutout to v1.3.0 and v1.1.18. Unfortunately, we just got our first snowfall here in WI and the roads have been salted, so I may not be driving the BMW for a while. So I'm going to focus on the NAVJ and SES error report. My current hypothesis here is that it might have something to do with the fact that BlueBus uses LCM diagnostics to enable parking light DRLs. I'm using INPA special tests to scan the modules, and I did have parking light DRLs enabled in both v1.1.18 and v1.3.0.
Empirically speaking, when I ran the INPA module tests, I connected my K+DCAN cable, fired up INPA and put the ignition switch into position 2. This caused BlueBus to issue the diagnostic command to the LCM that enabled the parking lights. Then I ran the special test cycle which IDed the SES and NAVJ issues. Related? Easy enough to check: load v1.1.18 and run the special tests with the DRL comfort feature off, then re-run with the DRL comfort feature on. If this toggles the INPA error report then we have root cause at least. It may actually be a bug in the INPA special test scripts, so I'll dig into the individual communications module tests next. Heck, it could be that the NAVJ ECU listens for any message on the I-Bus where the sender is the diagnostic device and enables its own service mode if it sees one. This would be very easy to test/validate with navcoder, which can send user-defined I-Bus messages. Not sure about SES, since there's no ECU in my vehicle that should be responding to the INPA probe.
I have another dev environment question: Are you using large or small memory model for the build? I'm using small data model which is a build suggestion. About to try and load a compiled version...
Uh oh... looks like it's bricked. Flash succeeded but it won't boot. Bootloader not responding. Had to use the recovery jumper to get a bootloader, then restore to v1.1.18. So I guess I'll wait for build env confirmation before I try to build another. Code model: Large; Data model: Small; PIC24F pack: v1.8.217; XC16 compiler version: v1.70; Heap not specified, so using heap size of 0.
Results of v1.1.18 INPA testing (fasten your seatbelts!). First off, it's related to the parking light DRL comfort feature. With that feature on, the NAVJ and SES errors are recorded by the special test/quick error scan function, which goes through all ECUs and records errors. With that feature off, no errors are found. ...and now it gets weird:
With the parking light feature on, INPA can connect to SES (Speech input system SES) and NAVJ (Navigation Computer Japan) ECUs through the E39/Communications Systems menu. In both cases, the ECU menu comes up with a BMW part number that looks the same (and seems invalid?): 4000000.
In both cases, performing a read error memory function from the ECU menu returns the same errors that the special tests script uncovered. This eliminates my hypothesis that a bug in the special test script is causing this.
With the parking light feature off, things get a little bit more strange:
When I try to connect to NAVJ from the Communications Systems/Navigation Computer Japan menu option, INPA tries to connect but, after working for a few seconds, returns a device configuration error (could be my INPA version causing this). I have to abort script processing at this point. And when I try to connect to the SES via the Communications Systems/Speech input system SES menu option, INPA immediately returns a "NO RESPONSE FROM CONTROLUNIT" error (as it should). Again, I have to abort script processing at this point.
So it appears that the two INPA ECU error findings have something to do with the parking light comfort feature. It also appears that something having a BMW Hardware code of 4000000 is responding to INPA's ECU probe for those two devices when the parking light feature is on which isn't responding when the parking light feature is off. Once I can build an executable environment I'll start investigating this in more depth.
More SES investigation: With the parking light option on, I began using INPA to investigate the SES ECU it had found. Ident returned mostly zeros for manufacturer, date etc. but this is interesting: while working with the SES in INPA, I turned off the parking lights feature. The parking lights went out and INPA immediately threw a "NO RESPONSE FROM CONTROLUNIT" error.
> More SES investigation: With the parking light option on, I began using INPA to investigate the SES ECU it had found. Ident returned mostly zeros for manufacturer, date etc. but this is interesting: while working with the SES in INPA, I turned off the parking lights feature. The parking lights went out and INPA immediately threw a "NO RESPONSE FROM CONTROLUNIT" error.
I'll have to follow up later, but I understand now. INPA is dumb and doesn't expect multiple diagnostic messages to fly back and forth while it's interrogating an ECU. I'll bet they're not parsing the source device address whatsoever, and since the response from an ECU is a static `0xA0`, they assume the diagnostic packet they see is something they requested.
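This hypothesis can be illustrated with a small Python sketch. It is not INPA's or EDIABAS's actual logic, which isn't visible here; it only assumes the I-Bus frame layout (source, length, destination, command, data, checksum), with `0x3F` as the diagnostic tool address and `0xD0` as the LCM. A matcher that looks only at the destination and the `0xA0` status accepts the LCM's reply to BlueBus as if it came from the SES (`0xB0`):

```python
DIA = 0x3F      # diagnostic tool address
LCM = 0xD0      # light control module, target of the parking light commands
SES = 0xB0      # speech input system, per the address quoted earlier in the thread
DIAG_OK = 0xA0  # positive diagnostic reply

def parse(hex_str):
    """Pick the routing fields out of a space-separated hex frame."""
    raw = bytes.fromhex(hex_str)
    return {"src": raw[0], "dst": raw[2], "cmd": raw[3]}

def naive_is_reply(frame):
    # Hypothetical matcher: any positive reply addressed to the tool counts.
    return frame["dst"] == DIA and frame["cmd"] == DIAG_OK

def strict_is_reply(frame, expected_src):
    # Same check, but the reply must come from the ECU that was asked.
    return naive_is_reply(frame) and frame["src"] == expected_src

# LCM acknowledging a BlueBus diagnostic command.
lcm_reply = parse("D0 03 3F A0 4C")

print("naive matcher accepts it:", naive_is_reply(lcm_reply))
print("accepted as an SES reply:", strict_is_reply(lcm_reply, SES))
print("accepted as an LCM reply:", strict_is_reply(lcm_reply, LCM))
```

Under these assumptions, a tool waiting on the absent SES would treat the LCM acknowledgement as its answer, which would fit the phantom ECU appearing only while the parking light feature is on.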
I assume you were able to recover the unit using the "RECOVERY" jumper? :)
-Ted
For reference, here's an IDENT response from the Nav:
3B 22 3F A0 36 39 30 38 35 32 39 31 30 30 34 30 34 30 39 34 39 30 30 30 31 30 33 37 30 38 2E 32 31 36 33 AF
Here's a coding block readback response from the Nav:
3F 06 3B 06 00 00 10 14
3B 0C 3F A0 00 00 82 84 86 00 82 84 86 A8
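The two replies above can be checked and decoded with a short Python sketch. It assumes the usual I-Bus framing (source, length, destination, command, data, XOR checksum) and that the IDENT payload is ASCII whose first seven characters are the part number; the remaining fields are left undecoded because their layout isn't given here.

```python
from functools import reduce

def parse_frame(hex_str):
    """Validate a space-separated hex frame and split it into fields."""
    raw = bytes.fromhex(hex_str)
    src, length, dst = raw[0], raw[1], raw[2]
    # The length byte counts everything after itself: dst + data + checksum.
    if length != len(raw) - 2:
        raise ValueError("length byte does not match frame size")
    # The checksum is the XOR of every preceding byte.
    if reduce(lambda a, b: a ^ b, raw[:-1]) != raw[-1]:
        raise ValueError("bad checksum")
    return {"src": src, "dst": dst, "cmd": raw[3], "data": raw[4:-1]}

ident = parse_frame(
    "3B 22 3F A0 36 39 30 38 35 32 39 31 30 30 34 30 34 30 39 34 39 "
    "30 30 30 31 30 33 37 30 38 2E 32 31 36 33 AF"
)
text = ident["data"].decode("ascii")
part_number = text[:7]  # assumption: the P/N is the first 7 ASCII digits

coding = parse_frame("3B 0C 3F A0 00 00 82 84 86 00 82 84 86 A8")

print("IDENT payload:", text)
print("part number:", part_number)
print("coding block:", coding["data"].hex(" "))
```

Both frames pass the length and checksum checks, so they are complete captures rather than truncated ones.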
Yup, recovered just fine to v1.1.18. Just need to figure out why flashing my test build bricked it. The HEX file for my test build was the same size as the production v1.1.18 hex file, so I figured I was ok. Do my buildENV settings from above look reasonable? Maybe heap size needs to be set? I did scan the project files for malloc statements and didn't find any, so I assumed heap wasn't being used. Is there a base load address or entry point setting that I need to configure?
Ok... interesting data coming up. I successfully compiled version 1.3.1 with a quick change just to make sure I didn't get installed versions mixed up (settings/about says "FW: 1.3.1 (Sleuth)").
Here's the interesting part: while I was checking that settings/about did indeed display "FW: 1.3.1 (Sleuth)" (it did) the display cut out and after a few seconds came back on again. In short, the precise Onboard Monitor problem I've been troubleshooting. So, this issue appears to be unique to the v1.3.x build. The INPA error conditions are red herrings related to the parking light option that was introduced in v1.1.18. Next up: comment out the one place where this code stream turns off the display. Can't imagine how it would be doing it there though because speed needs to be > 5mph
BTW, my buildENV is:
Code model: Large; Data model: [default]; PIC24F pack: v1.8.217; XC16 compiler version: v1.60; Heap not specified, so using heap size of 0.
As I figured, commenting out lines 768-774 in handler_ibus.c had no effect. Cutout still happens. Just to dot all my "i"'s and cross all my "t"s, I down revved to v1.1.18, performed the exact same sequence of events to reach settings/about and validated that the screen cutout does not happen on this build.
I'll keep looking at the code trying to answer the question: "Does BlueBus v1.3.x now put the BMBT into diagnostic mode for this feature"?
> More SES investigation: With the parking light option on, I began using INPA to investigate the SES ECU it had found. Ident returned mostly zeros for manufacturer, date etc. but this is interesting: while working with the SES in INPA, I turned off the parking lights feature. The parking lights went out and INPA immediately threw a "NO RESPONSE FROM CONTROLUNIT" error.
You're going to get all kinds of buggery if you run diagnostics in parallel with BlueBus features that utilise diagnostic commands.
INPA (or more specifically, EDIABAS) is a blunt force instrument. It's domineering, impatient, and fragile. A nominal volume of bus traffic is enough to trigger a tantrum, let alone a tardy module that's preoccupied with emulated diagnostic commands. Also, as the bus system neared EOL, the modules started becoming more interdependent, meaning the diagnostics for one module can inadvertently fail owing to a different device being busy, which of course only serves to exacerbate this fragility.
> while working with the SES in INPA, I turned off the parking lights feature. The parking lights went out and INPA immediately threw a "NO RESPONSE FROM CONTROLUNIT" error.
I suspect this is an example of the above.
> the display cut out and after a few seconds came back on again. In short, the precise Onboard Monitor problem I've been troubleshooting.

Can you clarify what you mean by 'display cut out'? Are you referring to the video signal going AWOL (GT), or the monitor itself (BMBT) powering off?
I'd recommend not digging too deep into the code off the bat. Even after a number of years we're still finding seemingly unrelated, innocuous commands that can affect functionality, or do all kinds of strange nonsense! 😄
The display appears to power off completely then come back on a few seconds later showing the same screen as before. Not seeing anything in the code that leaps out at me either. A feature to blank the BMBT display while driving first appeared in v1.3.0. The command to do this is on line 772 of handler_ibus.c. I can't see anything about that code that would cause the command to be executed other than if the comfort option was set and the vehicle speed was > 5mph though. I'm going to monitor the I-Bus data stream around the cutoff event next. You can almost use a stopwatch to predict when the cutout will occur after turning the ignition to pos 1.
I think that a good example of your module interdependency description might also be the TV and NAV devices which hand off control of the GT device. I use my TV module to handle a backup camera.
@sleuth255
Your dev options sound good; there’s no heap memory in use, and the large data model is in use because… I never changed it :)
> I'll keep looking at the code trying to answer the question: "Does BlueBus v1.3.x now put the BMBT into diagnostic mode for this feature"?
The BlueBus simply emulates the command sent by the GT when you select “Monitor off” in the main menu.
Definitely log I-Bus traffic when the issue occurs, and we'll be able to find the trigger.
Thanks! -Ted
Well this is a bother: navcoder can't snoop the I-Bus when it's using a K+DCAN cable. It appears to only be seeing traffic that it generates. Looks like I'm going to need to build an adaptor cable with a direct I-Bus tap next.
> Well this is a bother: navcoder can't snoop the I-Bus when it's using a K+DCAN cable. It appears to only be seeing traffic that it generates. Looks like I'm going to need to build an adaptor cable with a direct I-Bus tap next.
Why not snoop the traffic using the BlueBus? `set log ibus on` ;)
Ha! Never thought to even look at the BlueBus CLI... Thought the UART connection was for flashing only lol. Setting things up now...
Still working to get a trace; couldn't get it to fail initially... Finding out how added another piece to the puzzle: the problem only happens after the car wakes up from sleep mode. Takes almost exactly 30 seconds to cut out too. To capture the error will be a bit tricky because I need to first disconnect the USB cable from my laptop so that BlueBus also goes to sleep. So: wait 15 minutes, open door and start timer. Put key in ign and turn to pos 1. Connect USB to laptop, then restart putty session at about the 10 seconds to go mark. Fortunately it looks like the ibus log setting persists through a reboot.
Got it! The screen cut out right after this happened:
[44430] DEBUG: IBus: RX[7]: E8 05 D0 59 11 02 77
[44521] DEBUG: IBus: RX[5]: F0 03 68 01 9A
[44533] DEBUG: IBus: RX[6]: 68 04 BF 02 00 D1
[44753] DEBUG: IBus: RX[5]: 68 03 6A 01 00
[45007] DEBUG: IBus: RX[5]: 3F 03 D0 0B E7 [SELF]
[45060] DEBUG: IBus: RX[37]: D0 23 3F A0 C1 C0 04 00 00 01 20 00 00 AA 00 00 00 00 00 E4 00 00 4C 0C 00 DD 00 00 00 00 00 00 00 FF FF 00 9B
[45105] DEBUG: IBus: RX[8]: 7F 06 C8 A9 03 30 30 1B
[45195] DEBUG: IBus: RX[8]: 7F 06 C8 A9 0A 30 30 12
[46761] DEBUG: IBus: RX[7]: 68 05 18 38 00 00 4D
[46791] DEBUG: IBus: RX[5]: 3F 03 3B 00 07 [SELF]
[46798] DEBUG: IBus: RX[16]: 18 0E 68 39 02 89 00 01 00 01 01 00 01 01 01 CC [SELF]
[46870] DEBUG: IBus: RX[36]: 3B 22 3F A0 39 31 31 35 30 33 33 31 30 30 38 30 36 31 34 32 31 30 33 30 31 34 35 30 36 37 2E 31 30 30 30 AE
The screen cut back in right after this happened:
IBus: GT P/N: 9115033 DI: 6 HW: 10 SW: 0 Build: 21/03
[46896] DEBUG: IBus: RX[5]: 3F 03 3B 11 16 [SELF]
[46915] DEBUG: IBus: RX[7]: ED 05 F0 4F 02 11 44
[46935] DEBUG: IBus: RX[14]: 3B 0C 3F A0 42 4D 57 43 30 31 53 00 00 E1
[49398] DEBUG: IBus: RX[13]: 7F 0B 80 1F 40 21 42 02 00 04 20 03 ED
[49768] DEBUG: IBus: RX[5]: 3B 03 CA 01 F3
[49867] DEBUG: IBus: RX[17]: 3F 0F D0 0C 00 00 00 00 00 01 20 00 00 E4 00 00 29 [SELF]
[49878] DEBUG: IBus: RX[5]: D0 03 3F A0 4C
[54479] DEBUG: IBus: RX[7]: E8 05 D0 59 11 02 77
[54572] DEBUG: IBus: RX[5]: F0 03 68 01 9A
[54584] DEBUG: IBus: RX[6]: 68 04 BF 02 00 D1
[54592] DEBUG: IBus: RX[5]: ED 03 80 10 7E
[54625] DEBUG: IBus: RX[6]: 80 04 BF 11 01 2B
[54635] DEBUG: IBus: RX[7]: ED 05 F0 4F 12 11 54
[54643] DEBUG: IBus: RX[5]: ED 03 80 14 7A
[54673] DEBUG: IBus: RX[16]: 18 0E 68 39 02 89 00 01 00 01 01 00 01 01 01 CC [SELF]
[54683] DEBUG: IBus: RX[6]: C8 04 E7 2B 10 10 [SELF]
[54698] DEBUG: IBus: RX[9]: 80 07 BF 15 02 F7 7A 42 E0
[55218] DEBUG: IBus: RX[8]: 7F 06 C8 A9 03 30 30 1B
[55308] DEBUG: IBus: RX[8]: 7F 06 C8 A9 0A 30 30 12
[57608] DEBUG: IBus: RX[6]: ED 04 F0 4A 90 C3
[57622] DEBUG: IBus: RX[6]: F0 04 68 4B 05 D2
In fact, these two sets were the only messages that were captured. I literally got the putty session restarted just before the screen blanked out. So the entire capture looks like this:
[44430] DEBUG: IBus: RX[7]: E8 05 D0 59 11 02 77
[44521] DEBUG: IBus: RX[5]: F0 03 68 01 9A
[44533] DEBUG: IBus: RX[6]: 68 04 BF 02 00 D1
[44753] DEBUG: IBus: RX[5]: 68 03 6A 01 00
[45007] DEBUG: IBus: RX[5]: 3F 03 D0 0B E7 [SELF]
[45060] DEBUG: IBus: RX[37]: D0 23 3F A0 C1 C0 04 00 00 01 20 00 00 AA 00 00 00 00 00 E4 00 00 4C 0C 00 DD 00 00 00 00 00 00 00 FF FF 00 9B
[45105] DEBUG: IBus: RX[8]: 7F 06 C8 A9 03 30 30 1B
[45195] DEBUG: IBus: RX[8]: 7F 06 C8 A9 0A 30 30 12
[46761] DEBUG: IBus: RX[7]: 68 05 18 38 00 00 4D
[46791] DEBUG: IBus: RX[5]: 3F 03 3B 00 07 [SELF]
[46798] DEBUG: IBus: RX[16]: 18 0E 68 39 02 89 00 01 00 01 01 00 01 01 01 CC [SELF]
[46870] DEBUG: IBus: RX[36]: 3B 22 3F A0 39 31 31 35 30 33 33 31 30 30 38 30 36 31 34 32 31 30 33 30 31 34 35 30 36 37 2E 31 30 30 30 AE
~10-15 second delay
IBus: GT P/N: 9115033 DI: 6 HW: 10 SW: 0 Build: 21/03
[46896] DEBUG: IBus: RX[5]: 3F 03 3B 11 16 [SELF]
[46915] DEBUG: IBus: RX[7]: ED 05 F0 4F 02 11 44
[46935] DEBUG: IBus: RX[14]: 3B 0C 3F A0 42 4D 57 43 30 31 53 00 00 E1
[49398] DEBUG: IBus: RX[13]: 7F 0B 80 1F 40 21 42 02 00 04 20 03 ED
[49768] DEBUG: IBus: RX[5]: 3B 03 CA 01 F3
[49867] DEBUG: IBus: RX[17]: 3F 0F D0 0C 00 00 00 00 00 01 20 00 00 E4 00 00 29 [SELF]
[49878] DEBUG: IBus: RX[5]: D0 03 3F A0 4C
[54479] DEBUG: IBus: RX[7]: E8 05 D0 59 11 02 77
[54572] DEBUG: IBus: RX[5]: F0 03 68 01 9A
[54584] DEBUG: IBus: RX[6]: 68 04 BF 02 00 D1
[54592] DEBUG: IBus: RX[5]: ED 03 80 10 7E
[54625] DEBUG: IBus: RX[6]: 80 04 BF 11 01 2B
[54635] DEBUG: IBus: RX[7]: ED 05 F0 4F 12 11 54
[54643] DEBUG: IBus: RX[5]: ED 03 80 14 7A
[54673] DEBUG: IBus: RX[16]: 18 0E 68 39 02 89 00 01 00 01 01 00 01 01 01 CC [SELF]
[54683] DEBUG: IBus: RX[6]: C8 04 E7 2B 10 10 [SELF]
[54698] DEBUG: IBus: RX[9]: 80 07 BF 15 02 F7 7A 42 E0
[55218] DEBUG: IBus: RX[8]: 7F 06 C8 A9 03 30 30 1B
[55308] DEBUG: IBus: RX[8]: 7F 06 C8 A9 0A 30 30 12
[57608] DEBUG: IBus: RX[6]: ED 04 F0 4A 90 C3
[57622] DEBUG: IBus: RX[6]: F0 04 68 4B 05 D2
Hmmm... looks like BlueBus is sending something to the GT at 46791 and it's responding at 46870, just before the cutout. Ident? Then the ident response is logged right around where the screen cuts back in.
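The request/response pairing spotted above can also be pulled out of a capture mechanically. A rough Python sketch, assuming the log line format shown in this thread and that the bracketed timestamps are milliseconds (the unit is not stated); `parse_log` and `pair_diag` are made-up helper names, not part of BlueBus:

```python
import re

DIA = 0x3F      # diagnostic tool address, also the source of BlueBus's own requests
DIAG_OK = 0xA0  # positive diagnostic reply

LINE_RE = re.compile(r"\[(\d+)\] DEBUG: IBus: RX\[(\d+)\]: ((?:[0-9A-F]{2} ?)+)")

def parse_log(text):
    """Turn 'set log ibus on' output into a list of frame dicts."""
    frames = []
    for m in LINE_RE.finditer(text):
        raw = bytes.fromhex(m.group(3).strip())
        if len(raw) != int(m.group(2)):
            continue  # byte count disagrees with RX[n]; skip the entry
        frames.append({"ts": int(m.group(1)), "src": raw[0],
                       "dst": raw[2], "cmd": raw[3]})
    return frames

def pair_diag(frames):
    """Match each request from DIA with the next 0xA0 reply from its target."""
    pairs = []
    for i, req in enumerate(frames):
        if req["src"] != DIA:
            continue
        for rep in frames[i + 1:]:
            if (rep["src"] == req["dst"] and rep["dst"] == DIA
                    and rep["cmd"] == DIAG_OK):
                pairs.append((req["dst"], req["ts"], rep["ts"], rep["ts"] - req["ts"]))
                break
    return pairs

LOG = """\
[46761] DEBUG: IBus: RX[7]: 68 05 18 38 00 00 4D
[46791] DEBUG: IBus: RX[5]: 3F 03 3B 00 07 [SELF]
[46798] DEBUG: IBus: RX[16]: 18 0E 68 39 02 89 00 01 00 01 01 00 01 01 01 CC [SELF]
[46870] DEBUG: IBus: RX[36]: 3B 22 3F A0 39 31 31 35 30 33 33 31 30 30 38 30 36 31 34 32 31 30 33 30 31 34 35 30 36 37 2E 31 30 30 30 AE
[49867] DEBUG: IBus: RX[17]: 3F 0F D0 0C 00 00 00 00 00 01 20 00 00 E4 00 00 29 [SELF]
[49878] DEBUG: IBus: RX[5]: D0 03 3F A0 4C
"""

pairs = pair_diag(parse_log(LOG))
for target, sent, answered, delta in pairs:
    print(f"request to 0x{target:02X} at {sent}, reply at {answered} (+{delta})")
```

On these six lines it reports the GT request at 46791 answered at 46870 and the LCM request at 49867 answered at 49878.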
> Well this is a bother: navcoder can't snoop the I-Bus when it's using a K+DCAN cable. It appears to only be seeing traffic that it generates. Looks like I'm going to need to build an adaptor cable with a direct I-Bus tap next.
If you're connecting to the diagnostics port (20-pin or OBD), you'll only get D-Bus traffic. D-Bus uses a different protocol, which is a discussion for another time.
You can, however, still use the interface on K/I-Bus since (like most buses of this era of BMWs) it's a derivative of ISO 9141 K-Line. You just need three pins from the interface: 12V, Ground, and K-Line. The latter, K-Line, is simply substituted for whatever bus you connect to.
> Hmmm... looks like BlueBus is sending something to the GT at 46791 and it's responding at 46870, just before the cutout. Ident? Then the ident response is logged right around where the screen cuts back in.
You're spot on.
[46791] DEBUG: IBus: RX[5]: 3F 03 3B 00 07 [SELF]
`00` is a diagnostic request for IDENT.
[46870] DEBUG: IBus: RX[36]: 3B 22 3F A0 39 31 31 35 30 33 33 31 30 30 38 30 36 31 34 32 31 30 33 30 31 34 35 30 36 37 2E 31 30 30 30 AE
`A0` is a diagnostic reply; essentially it's 'success'.
You're obviously au fait on the development side, but as a quick note on the protocol: the commands, i.e. `0x00` through `0xff`, are in a global scope; each should do the same thing every time, on every bus, be it K, I, or D. Now, it wouldn't be BMW without some exceptions, but with respect to these commands, you'll only ever see them used in diagnostics.
[46915] DEBUG: IBus: RX[7]: ED 05 F0 4F 02 11 44
This is the TV module (`0xED`) turning the BMBT (`0xF0`) display off.
[54635] DEBUG: IBus: RX[7]: ED 05 F0 4F 12 11 54
This is the TV module turning the BMBT display on.
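Comparing the two TV module frames byte by byte shows how little separates "off" from "on". A small Python sketch; reading the changed bit as the on/off flag is an inference from the description above, not something taken from BMW documentation:

```python
off = bytes.fromhex("ED 05 F0 4F 02 11 44")  # logged at 46915
on = bytes.fromhex("ED 05 F0 4F 12 11 54")   # logged at 54635

def checksum_ok(frame):
    """I-Bus checksum: XOR of every byte before the last one."""
    acc = 0
    for b in frame[:-1]:
        acc ^= b
    return acc == frame[-1]

# Positions where the two frames differ.
changed = [i for i, (a, b) in enumerate(zip(off, on)) if a != b]

# Bits that differ in the first data byte after the 0x4F command.
flag = off[4] ^ on[4]

print("differing byte positions:", changed)
print(f"changed bit mask in data byte: 0x{flag:02X}")
print("checksums valid:", checksum_ok(off), checksum_ok(on))
```

Only the first data byte and the checksum change, and the data byte differs in a single bit (`0x10`), so the checksum shifts by the same amount.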
~~Errr....~~
~~[46896] DEBUG: IBus: RX[5]: 3F 03 3B 11 16 [SELF]~~
~~We got a buffer underrun!~~
Edit: Ahah, literally just proved my own point on the global scope exceptions. Ignore this.