Android ProcessLifecycleOwner to improve activity to background

Open leohumnew opened this issue 4 months ago • 31 comments

Issue: If the app, for any reason, goes to the background long enough for Android to kill the process, then it will lose connection to the DCC controller (DCC-EX for example). It will then fail to reconnect when the app is re-focused, requiring a full app restart (without this being clearly obvious). This isn't great - and also any captured engines will still be running and not easily stoppable.

Potential Solutions: While not familiar with the current codebase, Android's ProcessLifecycleOwner broadcasts events when the app is going to be sent to the background and when brought back to the foreground. These events could be usable to improve most of these issues (without needing the effort of making the app background runnable with services etc):

  • When the app is sent to the background, currently captured trains could be auto-stopped (this could be a setting maybe?)
  • When the app is returned to the foreground, it could properly handle it - e.g. cleaning up the old terminated connection and throwing you back to the connection selection screen - so that a full app restart is not required.
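A minimal plain-Java sketch of the two bullets above. `Throttle`, `Navigator`, and `AppVisibilityHandler` are hypothetical names for illustration, not EngineDriver classes, and the sketch assumes something else (for example a lifecycle observer registered with ProcessLifecycleOwner) calls `onBackground()` and `onForeground()` at the right moments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these names come from the EngineDriver codebase.
interface Throttle {
    void stop();
}

interface Navigator {
    void showConnectionScreen();
}

class AppVisibilityHandler {
    private final List<Throttle> acquired = new ArrayList<>();
    private final Navigator navigator;
    private final boolean stopOnBackground; // the proposed user setting
    private boolean connectionAlive = true;

    AppVisibilityHandler(Navigator navigator, boolean stopOnBackground) {
        this.navigator = navigator;
        this.stopOnBackground = stopOnBackground;
    }

    void acquire(Throttle t) {
        acquired.add(t);
    }

    void markConnectionLost() {
        connectionAlive = false;
    }

    // Call when the app leaves the foreground.
    void onBackground() {
        if (stopOnBackground) {
            for (Throttle t : acquired) {
                t.stop();
            }
        }
    }

    // Call when the app returns to the foreground.
    void onForeground() {
        if (!connectionAlive) {
            acquired.clear(); // drop state tied to the dead connection
            navigator.showConnectionScreen();
        }
    }
}
```

This only helps while the process is still alive when it is backgrounded; if Android kills the process without notice, `onForeground()` never sees the old state.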

Reasoning: Due to its great array of features, Engine Driver is a popular choice for DIY DCC layouts. These layouts are not always used solely by people experienced with the software and setup. Inevitably, even if instructed otherwise, people will (and do) switch to look at a message, search something online, etc. And even for those more experienced with it - phone calls, urgent messages, etc happen. A lot of people in this hobby space may also be less familiar/comfortable with technology in general, so this significant added friction is not great. I believe this would be a quality of life feature that would greatly reduce annoyances, issues, and runaway trains, while simultaneously not requiring e.g. a huge rewrite of all the networking logic to use background services etc.

leohumnew avatar Aug 13 '25 20:08 leohumnew

Hi Leo, Please provide recreation steps that consistently show the issue you are describing.

Note that train-stopping should be a server-side function, if the server hasn't heard from the client within the specified seconds, regardless of reason, the server should stop the related trains. Is this not occurring with DCC-EX? It should not depend on the client, since the client is (by definition) not communicating. App backgrounding is only one of many possibilities. Could be wifi outage/interference, device turned off, moved away, etc.
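As a rough illustration of that server-side idea, here is a plain-Java sketch of a heartbeat watchdog. `HeartbeatWatchdog` and its methods are hypothetical and are not taken from JMRI, WiThrottle, or DCC-EX; the timeout value is whatever the server is configured with.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical server-side watchdog; not taken from JMRI or DCC-EX.
class HeartbeatWatchdog {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeard = new HashMap<>();

    HeartbeatWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record any message (or explicit heartbeat) from a client.
    void heard(String clientId, long nowMillis) {
        lastHeard.put(clientId, nowMillis);
    }

    // True when the client's trains should be stopped.
    // A client that was never heard from is treated as expired.
    boolean isExpired(String clientId, long nowMillis) {
        Long last = lastHeard.get(clientId);
        return last == null || nowMillis - last > timeoutMillis;
    }
}
```

The server would poll `isExpired` periodically and stop the locos owned by any expired client, regardless of why the client went quiet.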

Keep in mind that the EngineDriver codebase is quite old (16+ years!), and Android OS has changed a LOT! So changes that might seem straightforward just looking at the documentation are often anything but. Particularly something that sounds like it replaces the foundation of the app with something different.

If you have familiarity with using ProcessLifecycleOwner and how we might migrate to it, please create a PR or advise specifics.

@flash62au may have some more information, IIRC he discussed this option with me, but I don't recall the outcome.

--SteveT

mstevetodd avatar Aug 13 '25 21:08 mstevetodd

Hi Steve, thanks for the detailed reply!

I'll see if I can screen record a video of the issues at some point, but here's what happens for me:

  1. Start the app, connect to my DCC EX command station
  2. Select an engine and start running it
  3. Move app to the background long enough for Android to decide to kill the process - the train continues running
  4. Bring the app back into the foreground - the app indefinitely shows the connecting screen

The command station doesn't seem to stop trains on throttle disconnect - I wonder if it's a limitation of the DCC EX command protocol?

> Keep in mind that the EngineDriver codebase is quite old (16+ years!)

For sure! Not suggesting it would be a 10 minute job. More so that it could potentially be a QoL feature that would be worth investing in - of course I can speak only for myself here, but these disconnection issues happen quite frequently!

> Particularly something that sounds like it replaces the foundation of the app with something different.

Making use of ProcessLifecycleOwner would (I imagine) likely not require any major changes to anything - I unfortunately only have surface-level knowledge of it but I imagine ED could be set up to listen for the backgrounding/foregrounding events and then perform some actions from that (e.g. throw the user back to the network/controller selection screen on foregrounding if we know the connection has been lost).

> please create a PR or advise specifics

Sadly my Android dev experience is lacking - if I have time over Christmas to experiment a bit I may return to this and see if there's anything I can contribute to ED though!

And just to clarify - I'm not trying to suggest this is a huge critical issue that requires immediate attention - more so just hoping to surface this concern/idea so that it could potentially be looked at in the future as capacity allows 🙂

leohumnew avatar Aug 13 '25 22:08 leohumnew

We currently use the ApplicationLifeCycleHandler to know when ED is going into or out of background. That was done before my time, so I have not done any work with it. I did some related changes to track/check if it 'currently' is in background, but not the actual events.

I am not sure what the ProcessLifecycleOwner provides over ApplicationLifeCycleHandler but I will look into it when I can.

Optionally stopping your locos when you go into background would be an easy change.

Restarting the connection when coming out of background is not an easy change. It comes back to a fundamental design of the connection screen. This would need to be re-thought/re-worked.

flash62au avatar Aug 13 '25 22:08 flash62au

Seems like stopping on connection loss or heartbeat loss is already implemented in WiT? I am confused why this doesn't take care of that aspect for Leo, unless it isn't enabled on the JMRI side? Maybe I don't recall correctly... Robin

Robin Becker San Diego CA

n3ix avatar Aug 13 '25 23:08 n3ix

Stopping on connection loss is different to stopping when going into background.

flash62au avatar Aug 13 '25 23:08 flash62au

Agreed, but the comments were also about getting killed in background and then being unable to come back up and stop the train. My point was there is already an option to take care of that.

ED already has an option to stop on a phone call, correct? So that case is already covered too. It's been a while since I wrote the code, but I think you are correct that it probably would be easy to optionally stop on background too.

n3ix avatar Aug 14 '25 00:08 n3ix

A phone call is the only reason why ED goes into background.

I'll start looking at being able to relaunch the connection screen at any time. That has been on the TODO list since before my time. That needs to be possible before anything can be done about a more graceful restart after a disconnect.

flash62au avatar Aug 14 '25 03:08 flash62au

sorry, that was supposed to be... A phone call is NOT the only reason why ED goes into background.

flash62au avatar Aug 14 '25 03:08 flash62au

Yes, but between configuring WiT to stop on disconnect and configuring ED to stop on a phone call, many of the cases can already be covered by users if desired. Probably why we never pushed further. Also probably because I don't use either of those :) If you can figure out the reconnection that would be great.

Just as an aside: the reconnection problem exists in JMRI itself and has no solution to date. Unplug or otherwise lose a connection to a command station via USB / serial and a restart is required to recover. And of course locos do not stop when this happens!

n3ix avatar Aug 14 '25 11:08 n3ix

I don't expect that I will solve the disconnect problems.
I can only hope to make the reconnection process cleaner.

flash62au avatar Aug 14 '25 11:08 flash62au

Yes, that was what I meant. At the time I did the code, the goal was to fix things so that ED could actually be restarted after getting killed in the background by Android. Before that, pieces of the prior instance were left around in a state that caused issues. That got fixed and reconnection worked via the usual steps for some years. Not sure what is going on now? Robin

Robin Becker San Diego CA

n3ix avatar Aug 14 '25 12:08 n3ix

I think the major issue is that the recent versions of Android are far more aggressive about killing, or at least throttling, apps in background.

flash62au avatar Aug 14 '25 13:08 flash62au

Interesting.  Hopefully the basic lifecycle handling is still ok.  I recall spending quite a bit of time getting that sorted so that TA et al would exit properly before getting killed.  That was what allowed ED to be relaunched properly after getting killed.  It will be interesting to see what you find.

n3ix avatar Aug 14 '25 19:08 n3ix

Thanks everyone for all the insights on this!

> configuring WiT to stop on disconnect

Hmm. For my layout we use ED DCC-EX mode with a DCC-EX controller (no JMRI in between), maybe this doesn't work in that case? I've certainly had disconnections before - and to my memory the trains have never stopped automatically. I will explicitly test this when I'm next able, to confirm.

> configuring ED to stop on phonecall

As @flash62au said, phone calls aren't the only thing that push the app process to the background. To give a few examples, I've directly seen the following happen to family members and visitors to my layout, each resulting in the app "soft-bricking" itself and trains being stuck running:

  • Accidentally turn the phone screen off for a minute while re-railing a second train.
  • Quickly switch over to a different app to answer a message while their train is on a "safe" bit of track.
  • Receive a Google Meet call - this possibly doesn't/didn't count as a phone call? As the trains did not stop.

> I don't expect that I will solve the disconnect problems. I can only hope to make the reconnection process cleaner.

For sure! The most impactful issue I've seen happen is just the runaway trains:

  1. Train running smoothly, person enjoying.
  2. App temporarily loses focus accidentally or on purpose, Android kills the process
  3. Person sees crash is going to happen, "oh, I better stop it quickly"
  4. App is soft-bricked on the "reconnecting" screen
  5. Person panics, most of the time doesn't even think of restarting the app, trains crash. If they do think of restarting the app, there isn't time for them to click into the menu, click exit, confirm, then find the app, open it again, etc etc.

Overall, even just having the option to stop on app to background would be an improvement (could even be fancy and also make the phone give haptic feedback if backgrounded with an engine running, to catch the person's attention 😆). And then if there were any improvements possible on the re-focus of the app that would be amazing - I almost wonder if there would be a way to get it to throw you back to the connection screen by forcing the Activity to restart or something? I've seen it done with flags before for example:

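// Relaunch the wrapper activity (WrapperActivity is a placeholder name), clearing any activities stacked above it.
Intent i = new Intent(this, WrapperActivity.class);
i.setFlags(Intent.FLAG_ACTIVITY_CLEAR_TOP | Intent.FLAG_ACTIVITY_NEW_TASK);
startActivity(i);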

Failing that, even showing a screen saying the app needs restarting along with a button to one-click close the app would speed up the process a lot, and make it a lot clearer for those who don't have as much experience with the app.

It's great to hear all the thoughts around this! And it continues to impress me how long ED has been running and maintained.

leohumnew avatar Aug 14 '25 23:08 leohumnew

I'll add the option and the haptic feedback. Those should be easy enough.

At the moment, ED kills itself when it hits ComponentCallbacks2.TRIM_MEMORY_MODERATE. I can reasonably assume that was done as a safeguard. I am reluctant to change that to TRIM_MEMORY_COMPLETE as it may already be too late by then to gracefully close out. https://developer.android.com/reference/android/content/ComponentCallbacks2

I can add a warning at TRIM_MEMORY_BACKGROUND, that it is at risk of being killed, though that may give false positives.

Throwing back to the Connection Screen is not trivial as it was not designed to do that. However I have started on the changes to support that, and I already have it 'basically' working.
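To make the trim levels discussed above concrete, here is a small plain-Java sketch of a level-to-action policy. The constant values are copied from android.content.ComponentCallbacks2 so it runs off-device; `TrimPolicy`, `Action`, and the thresholds chosen are hypothetical and only illustrate "warn at BACKGROUND, shut down at MODERATE", not what ED actually does internally.

```java
// Values copied from android.content.ComponentCallbacks2 so the sketch runs off-device.
class TrimPolicy {
    static final int TRIM_MEMORY_UI_HIDDEN = 20;
    static final int TRIM_MEMORY_BACKGROUND = 40;
    static final int TRIM_MEMORY_MODERATE = 60;
    static final int TRIM_MEMORY_COMPLETE = 80;

    enum Action { NONE, WARN_USER, SHUT_DOWN }

    // Hypothetical policy: warn at BACKGROUND, shut down cleanly from MODERATE upward.
    static Action actionFor(int level) {
        if (level >= TRIM_MEMORY_MODERATE) {
            return Action.SHUT_DOWN;
        }
        if (level >= TRIM_MEMORY_BACKGROUND) {
            return Action.WARN_USER;
        }
        return Action.NONE;
    }
}
```

In an app this would be called from `onTrimMemory(int level)`; lowering the shutdown threshold to TRIM_MEMORY_BACKGROUND is a one-line change in `actionFor`.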

flash62au avatar Aug 14 '25 23:08 flash62au

Looks like the problem is more complex than I thought. The system is not informing ED that it is low on memory, but is clearly doing something bad to ED that is causing problems.

More experimenting to do...

flash62au avatar Aug 15 '25 00:08 flash62au

It looks pretty bad. Android 16 kills ED shortly after it gets to TRIM_MEMORY_BACKGROUND, which is the lowest warning level and is supposed to mean that it is not likely to be killed, but clearly it is. There is no warning, so ED does not get a chance to clean up, so when it restarts it is in an odd state.

I can't get it to do this on Android 9.

Best I can hope for is to figure out if the system killed it, then on restart show the connection screen.
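One common way to detect this is a persisted "session running" flag that only the normal exit path clears. The sketch below is plain Java with a `Map` standing in for persistent storage such as SharedPreferences; `KillDetector` and the key name are hypothetical, not ED code.

```java
import java.util.Map;

// Hypothetical sketch; the Map stands in for persistent storage such as SharedPreferences.
class KillDetector {
    private static final String KEY_RUNNING = "session_running";
    private final Map<String, Boolean> store;

    KillDetector(Map<String, Boolean> store) {
        this.store = store;
    }

    // Call at launch. Returns true if the previous session never shut down cleanly.
    boolean startSession() {
        boolean wasKilled = Boolean.TRUE.equals(store.get(KEY_RUNNING));
        store.put(KEY_RUNNING, true);
        return wasKilled;
    }

    // Call from the normal exit path only.
    void endSessionCleanly() {
        store.put(KEY_RUNNING, false);
    }
}
```

If `startSession()` returns true at launch, the app could go straight to the connection screen. Note that a crash looks the same as a system kill under this scheme.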

flash62au avatar Aug 15 '25 02:08 flash62au

Ugh.  I guess we can't just trigger the existing shutdown process at TRIM_MEMORY_BACKGROUND?

Cleaning up at launch is an interesting alternate approach. Sounds tricky! Save the messaging handles somewhere, use the old handles to tell everything to shutdown at launch (instead of doing it via the lifecycle) and if the old processes didn't get shut down when ED was killed maybe that would get it done? But you probably have something more clever in mind?!

n3ix avatar Aug 15 '25 13:08 n3ix

Well, we could trigger at TRIM_MEMORY_BACKGROUND. It may be appropriate for Android 16. No idea about earlier versions.

I wish I had a phone or two with some versions of Android between 10 and 16. I know I can set up VDs, and I may have to, but they are a bit of a pain.

Cleaning up at launch... I have no good ideas as of yet.

flash62au avatar Aug 16 '25 02:08 flash62au

Well, there is a trap for young players... I very recently had to purchase a new laptop in a bit of a hurry. The Snapdragon-based laptop I got does not support a hypervisor, so I'll have to dig up another PC somewhere to do the VDs.

I realised I have access to an Android 13 tablet. It behaves exactly the same as my Android 16 Phone.

I think the change to trigger at TRIM_MEMORY_BACKGROUND is starting to make sense.

flash62au avatar Aug 16 '25 05:08 flash62au

> I wish I had a phone or two with some versions of Android between 10 and 16. I know I can set up VDs, and I may have to, but they are a bit of a pain.

I can test a version on Android 15 if needed at any point!

leohumnew avatar Aug 16 '25 07:08 leohumnew

if you email me I can send you a link to my alpha builds, and the instructions for installing them. akersp62 at gmail.com

flash62au avatar Aug 16 '25 08:08 flash62au

Away from the layout Peter otherwise I'd give you a hand with testing

n3ix avatar Aug 16 '25 11:08 n3ix

I have no idea why, but it is now behaving much better.
It is surviving till at least TRIM_MEMORY_MODERATE. No idea yet of which of my changes did it.

Not out of the woods in any case. It returns to the reconnecting screen which doesn't close by itself.

flash62au avatar Aug 16 '25 12:08 flash62au

Understand that the reconnect is important, but if it is also possible to figure out what change got things back to working until MODERATE that could be very helpful in preventing this from coming back in the future.

n3ix avatar Aug 16 '25 12:08 n3ix

I spoke too soon. It is working well on Android 16 now (other than being sort-of stuck on the reconnection page some of the time). However on Android 13 it is as it was: ED is killed without warning sometime after receiving TRIM_MEMORY_BACKGROUND.

flash62au avatar Aug 16 '25 21:08 flash62au

I have had a go at implementing ProcessLifecycleOwner to see if it provided anything more.

The first problem is that you must add an androidx dependency to use it, and the moment you add an androidx dependency the minSdkVersion must be 21 or higher.

Anyway it didn't help. On Android 13, ED receives no warning that it is being killed. And on Android 16 it is also being killed without warning, but much later.

flash62au avatar Aug 16 '25 22:08 flash62au

> Anyway it didn't help. On Android 13, ED receives no warning that it is being killed. And on Android 16 it is also being killed without warning, but much later.

Interesting 🤔 I've seen ProcessLifecycleOwner advertised as the new "proper" way to do it, so it's for sure weird if it's not working...

> It is working well on Android 16 now. However on Android 13 it is as it was.

I'll be at my layout tomorrow and will test what you sent me on Android 15 as well 👍

leohumnew avatar Aug 16 '25 22:08 leohumnew

It is not likely to be helpful anyway. It will only tell you that the app is being killed, which may be too late to do something about it.

onTrimMemory() remains the logical way to do it, as it should be warning you that the app is likely to be killed soon. However the system is not sending the expected events to ED.

flash62au avatar Aug 16 '25 22:08 flash62au

I have put it back and A16 is now working again. I have no idea why A16 is working but A13 is not.

I have made some ugly changes so that it relaunches the connection screen if it thinks the app was killed in background. So at least the UI will be a bit more sensible. I have no idea how to clean up any mess left behind.

flash62au avatar Aug 17 '25 02:08 flash62au