ModemManager: broken in 22.03/23.05 and netifd reconnect issue
Firstly, ModemManager is broken in 22.03 because modem disconnect does not reach through to netifd absent this patch from @aleksander0m: https://github.com/openwrt/packages/commit/bc754f31cfdb004eefa43038f8f0827922107fc6
In short, ModemManager sees the disconnect, but netifd does not and the output of e.g. 'ifstatus' or 'ip addr show dev' is wrong (reflecting the old connected state and not updated to reflect disconnection). See e.g. here: https://github.com/openwrt/openwrt/issues/8368#issuecomment-1299719052.
So ModemManager in 22.03 needs to be updated to fix this.
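For anyone wanting to confirm they are hitting this mismatch, here is a minimal sketch that compares the modem state as ModemManager sees it with the "up" flag netifd reports. The helper names kv_get and stale_netifd_state are made up for illustration, and the parser assumes the "key : value" line layout of 'mmcli --output-keyvalue'.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ModemManager or netifd.

# Extract one field from `mmcli --output-keyvalue` text.
# Assumption: each line looks like "some.key   : value".
# $1 = full key/value text, $2 = field name
kv_get() {
    printf '%s\n' "$1" | sed -n "s/^$2[[:space:]]*:[[:space:]]*//p" | head -n 1
}

# Succeeds when netifd still reports the interface as up although
# ModemManager no longer reports the modem as connected.
# $1 = modem state (e.g. "connected", "registered"), $2 = netifd up flag
stale_netifd_state() {
    [ "$2" = "true" ] && [ "$1" != "connected" ]
}
```

On a router this could be called roughly as follows, where modem index 0 and interface name wan are assumptions about the local setup:

stale_netifd_state "$(kv_get "$(mmcli -m 0 --output-keyvalue)" modem.generic.state)" "$(ifstatus wan | jsonfilter -e '@.up')" && echo "netifd state is stale"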
Secondly, even with the above patch in the ModemManager version on OpenWrt master, upon disconnection we see:
Wed Nov 2 20:38:54 2022 daemon.info [2716]: <info> [modem0] state changed (connected -> disconnecting)
Wed Nov 2 20:38:54 2022 daemon.info [2716]: <info> [modem0] state changed (disconnecting -> registered)
Wed Nov 2 20:38:54 2022 daemon.info [2716]: <info> [modem0/bearer3] connection #1 finished: duration 121s, tx: 258333 bytes, rx: 371928 bytes
Wed Nov 2 20:38:54 2022 user.notice modemmanager: interface wan (network device wwan0) disconnected
Wed Nov 2 20:38:54 2022 daemon.notice netifd: Interface 'wan' has lost the connection
Wed Nov 2 20:38:54 2022 daemon.notice netifd: Network device 'wwan0' link is down
Wed Nov 2 20:38:54 2022 daemon.warn dnsmasq[1]: no servers found in /tmp/resolv.conf.d/resolv.conf.auto, will retry
And so although now with the patched version of ModemManager netifd sees the disconnect, netifd does not actually reconnect.
This is using /etc/config/network:
config interface 'wan'
option proto 'modemmanager'
option device '/sys/devices/platform/1e1c0000.xhci/usb2/2-1'
option auth 'pap'
option iptype 'ipv4v6'
option apn 'xx'
option username 'yy'
option password 'zz'
I am also facing this issue. Currently using ModemManager 1.18.12-5 in OpenWRT 21.02.
ModemManager notifies netifd that the interface is down, but netifd does not reconnect by itself.
@aleksander0m I am able to reproduce this problem consistently, as I have a private LTE network in my lab to which my OpenWRT device is connected.
Indeed at the moment in OpenWrt both 'qmi' and 'modemmanager' netifd implementations are broken in that they don't handle disconnections properly - the user has to manually reconnect.
I just had a big chat with @jow- about this on #openwrt-devel and it turns out that contrary to @aleksander0m's assumption about netifd automatically reconnecting, reporting disconnect to netifd is not enough and netifd will not automatically reconnect. It seems that rather ModemManager should reconnect and then a corresponding 'report-up' script ought to be written to inform netifd of the new connection details - see this post here for details:
https://forum.openwrt.org/t/rfc-wwan-uqmi-revamp/141506/16?u=lynx
We need fix to ModemManager to address this, namely ModemManager should:
- inform netifd as it presently does in patched version using the report-down script
- reconnect
- inform netifd with new report-up script of the new connection details
And also it looks from the OpenWrt thread linked above like the qmi protocol is simultaneously going to get some attention in this respect.
So hopefully in the end OpenWrt users will have at least one protocol to choose from, be it 'modemmanager' or 'qmi' once they are fixed to handle reconnections.
BTW @desolated40 what's your present fix to address this? Can we set up a hotplug event on ifdown that just calls ifup?
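To make the hotplug idea concrete, here is a rough sketch of what such a handler could look like. It assumes the script lives under /etc/hotplug.d/iface/ and receives ACTION and INTERFACE from the environment, and that 'ifstatus' exposes an "autostart" flag that is false after a manual ifdown. The function names and the TARGET_IFACE / IFUP_CMD variables are hypothetical, and I have not verified that netifd fires this event for every kind of modem disconnect.

```shell
#!/bin/sh
# Sketch for a hypothetical /etc/hotplug.d/iface/ handler.
# TARGET_IFACE and IFUP_CMD are illustrative names, overridable for testing.
TARGET_IFACE="${TARGET_IFACE:-wan}"
: "${IFUP_CMD:=ifup}"

# Decide whether a hotplug event should trigger a reconnect.
# $1 = ACTION, $2 = INTERFACE, $3 = autostart flag ("true" or "false")
should_reconnect() {
    [ "$1" = "ifdown" ] && [ "$2" = "$TARGET_IFACE" ] && [ "$3" = "true" ]
}

# Bring the interface back up only for an ifdown event on the target
# interface, and only if the user has not deliberately taken it down.
handle_iface_event() {
    if should_reconnect "$1" "$2" "$3"; then
        "$IFUP_CMD" "$2"
    fi
}
```

Inside the hotplug script the call would then be something like:

handle_iface_event "$ACTION" "$INTERFACE" "$(ifstatus "$INTERFACE" | jsonfilter -e '@.autostart')"

The autostart check is there so that a manual 'ifdown wan' is not immediately undone.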
@lynxthecat Thanks for the info! I'm currently using basic serial (PPP) but looking to implement ModemManager as this is the only protocol right now supported by OpenWisp.
We need fix to ModemManager to address this, namely ModemManager should:
- inform netifd as it presently does in patched version using the report-down script
- reconnect
- inform netifd with new report-up script of the new connection details
MM is not in charge of any reconnection logic, MM does not need to be fixed here. MM only receives requests to connect/disconnect from upper layers (netifd) and monitors the ongoing connection and notifies to upper layers (netifd) of network-initiated disconnections. It should be netifd the one managing the connection, so the autoreconnection logic should be handled by netifd.
The way to solve this, after briefly talking with @jow- about this in IRC would be to have the netifd protocol handler launch a "watcher" process which brings up the connection, and is kept alive and running for as long as the network interface is assumed connected. If MM detects a network-initiated disconnection, the dispatcher script called by MM should kill that watcher process. At that point, netifd (if configured to autoconnect) will then kill the netdev, restart that process and await proto updates.
Anyone up to the task of writing this logic in the netifd modemmanager protocol handler?
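To illustrate the shape of the watcher idea only, here is a toy sketch using plain processes and a PID file. It is not the netifd protocol handler and does not use netifd's or ModemManager's APIs. The function names, WATCHER_DIR and the PID file naming are all invented for the example.

```shell
#!/bin/sh
# Toy model of the "watcher" pattern; every name and path is hypothetical.
WATCHER_DIR="${WATCHER_DIR:-/var/run}"

# Path of the PID file for a given logical interface name ($1).
watcher_pidfile() {
    echo "${WATCHER_DIR}/mm-watcher-$1.pid"
}

# Protocol-handler side: start a process that stays alive for as long as
# the connection is assumed to be up, and record its PID.
watcher_start() {
    ( while :; do sleep 1; done ) >/dev/null 2>&1 &
    echo "$!" > "$(watcher_pidfile "$1")"
}

# Dispatcher side: on a network-initiated disconnect, kill the watcher so
# that whatever supervises it sees the exit and can restart the connection.
watcher_stop() {
    pidfile="$(watcher_pidfile "$1")"
    [ -f "$pidfile" ] || return 1
    kill "$(cat "$pidfile")" 2>/dev/null
    rm -f "$pidfile"
}
```

In the real design netifd would be the supervisor: it would notice the watcher exiting, tear down the netdev and, if configured to autoconnect, start the process again. Here that role is left to whoever calls watcher_start.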
I thought the whole point of ModemManager was that it's a daemon to manage connections and keep them alive. If not what's the point?
I thought the whole point of ModemManager was that it's a daemon to manage connections and keep them alive. If not what's the point
MM provides a unified API to control modems, so that the user does not need to care of which is the underlying control protocol the modem uses. MM doesn't even configure the network interfaces, for what it's worth. MM only talks to the modem control port.
Ah, sorry, and understood. I am speaking out of ignorance, and I feel like an elephant in a china shop now throwing my thoughts around without properly understanding things. Please forgive my impertinence.
I would try to write this protocol handler if I could, but I lack the skills to do so.
@desolated40 it seems to me like a reasonable temporary fix is to replace the existing '10-report-down' script with something like the following:
root@OpenWrt:~# cat /usr/lib/ModemManager/connection.d/10-report-down-and-reconnect
#!/bin/sh
# Automatically report to netifd that the underlying modem
# is really disconnected and reconnect if interface was up
# require program name and at least 4 arguments
[ $# -lt 4 ] && exit 1
MODEM_PATH="$1"
BEARER_PATH="$2"
INTERFACE="$3"
STATE="$4"
[ "${STATE}" = "disconnected" ] || exit 0
. /usr/share/ModemManager/modemmanager.common
. /lib/netifd/netifd-proto.sh
INCLUDE_ONLY=1 . /lib/netifd/proto/modemmanager.sh
MODEM_STATUS=$(mmcli --modem="${MODEM_PATH}" --output-keyvalue)
[ -n "${MODEM_STATUS}" ] || exit 1
MODEM_DEVICE=$(modemmanager_get_field "${MODEM_STATUS}" "modem.generic.device")
[ -n "${MODEM_DEVICE}" ] || exit 2
CFG=$(mm_get_modem_config "${MODEM_DEVICE}")
[ -n "${CFG}" ] || exit 3
IFUP=$(ifstatus ${CFG} | jsonfilter -e '@.up')
logger -t "modemmanager" "interface ${CFG} (network device ${INTERFACE}) ${STATE}"
proto_init_update $INTERFACE 0
proto_send_update $CFG
[ "${IFUP}" = "true" ] && ifup ${CFG}
exit 0
Works for me in terms of immediately reconnecting post disconnect.
Thanks @lynxthecat, this indeed works as you describe in the happy flow. However, there are cases where the LTE network is not ready to accept a connection directly after a disconnection.
I have tried to modify the script, but I don't know how to change it so that it keeps trying to reconnect and stops once the connection has been restored.
I would first try just adding 'sleep X' right before this line
[ "${IFUP}" = "true" ] && ifup ${CFG}
where X is 2, 5 or 10 or whatever works.
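If a single fixed sleep turns out not to be enough, a bounded retry loop is one possible variation on the same idea. This is only a sketch: reconnect_with_retry and iface_is_up are hypothetical helper names, and IFSTATUS_CMD / IFUP_CMD / JSONFILTER_CMD are just variables that default to the usual OpenWrt tools so the logic can be tried out elsewhere.

```shell
#!/bin/sh
# Hypothetical retry helper; the *_CMD variables default to the OpenWrt tools.
: "${IFSTATUS_CMD:=ifstatus}"
: "${IFUP_CMD:=ifup}"
: "${JSONFILTER_CMD:=jsonfilter}"

# Succeeds when netifd reports the interface ($1) as up.
iface_is_up() {
    [ "$("$IFSTATUS_CMD" "$1" 2>/dev/null | "$JSONFILTER_CMD" -e '@.up')" = "true" ]
}

# Call ifup up to max_tries times, waiting delay seconds after each
# attempt, and stop as soon as the interface reports up.
# $1 = interface, $2 = max tries (default 10), $3 = delay in seconds (default 5)
reconnect_with_retry() {
    iface="$1"
    max_tries="${2:-10}"
    delay="${3:-5}"
    try=0
    while [ "$try" -lt "$max_tries" ]; do
        iface_is_up "$iface" && return 0
        try=$((try + 1))
        "$IFUP_CMD" "$iface"
        sleep "$delay"
    done
    # Final check after the last attempt; non-zero exit means we gave up.
    iface_is_up "$iface"
}
```

With these functions defined in the dispatcher script, the final 'ifup' line could be replaced by something like: [ "${IFUP}" = "true" ] && reconnect_with_retry "${CFG}" 10 5. Bear in mind that this blocks the script for as long as the retries take, so the numbers should stay small.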
If you want repeated retries it may be better to revert to the original 10-report-down script and then try setting up an external init.d script that tries to keep wan up with something like:
#!/bin/sh
while true
do
    IFUP=$(ifstatus wan | jsonfilter -e '@.up')
    if [ "${IFUP}" = "false" ]; then
        logger -t "wan-watchdog" "interface wan down, trying to bring it back up now"
        ifup wan
    fi
    # sleep on every pass so the loop does not spin while wan is up
    sleep 10
done
Or better, if you can setup 'ip monitor' (I think this requires the ip-full package in OpenWrt) then you could take inspiration from this:
https://github.com/lynxthecat/maintain-wan-lease/blob/main/maintain-wan-lease
And react to the disconnection as it happens rather than polling every X seconds. So something like:
#!/bin/sh
ip monitor link dev wwan0 | while read -r event; do
    case "$event" in
        *'NO-CARRIER'*)
            logger -t "wan-watchdog" "interface wan down, trying to bring it back up now"
            # quote the substitution so an empty result does not break the test
            while [ "$(ifstatus wan | jsonfilter -e '@.up')" = 'false' ]
            do
                ifup wan
                sleep 5
            done
            ;;
    esac
done
Please let me know how you get on.
Keep in mind that these are just temporary hacks whilst we wait for a proper fix - either to the OpenWrt ModemManager netifd implementation as described by @aleksander0m above, or to the OpenWrt qmi netifd implementation (see here).
Unfortunately it is clear that the OpenWrt ModemManager implementation is broken at the moment, but at least the main developers now have a common understanding since until very recently:
- netifd expected ModemManager to initiate reconnection; and
- ModemManager expected netifd to initiate reconnection,
and hence the stalemate and a lot of frustrated users writing their own DIY reconnection scripts with things like rebooting the device or restarting the connection upon failed ICMPs.
@jow- has since been convinced that the netifd ModemManager protocol needs modifying as per:
The way to solve this, after briefly talking with @jow- about this in IRC would be to have the netifd protocol handler launch a "watcher" process which brings up the connection, and is kept alive and running for as long as the network interface is assumed connected. If MM detects a network-initiated disconnection, the dispatcher script called by MM should kill that watcher process. At that point, netifd (if configured to autoconnect) will then kill the netdev, restart that process and await proto updates.
See above: https://github.com/openwrt/packages/issues/19794#issuecomment-1303601631
I very much hope that the necessary modifications to OpenWrt will get made in the not too distant future to improve wireless wan handling in OpenWrt, especially now that the issue is better understood.
I am also facing this issue. Currently using ModemManager 1.18.12-5 in OpenWRT 21.02.
ModemManager notifies netifd that the interface is down, but netifd does not reconnect by itself.
@aleksander0m I am able to reproduce this problem consistently, as I have a private LTE network in my lab to which my OpenWRT device is connected.
@desolated40 this PR backported a number of fixes to OpenWrt 21.02: https://github.com/openwrt/packages/pull/19648.
Which PR? As far as I'm aware, LTE reconnection with respect to ModemManager is still broken in 22.03. In fact last time I checked not even the new modem manager version with the dispatcher script functionality is in 22.03. I'm hopeful that the ModemManager version will get updated and the necessary steps outlined above will ultimately get implemented to fix this issue in 22.03.
Please, please can ModemManager be bumped to at least 1.18.12 for 22.03 so that at least users can use my temporary workaround for reconnects?
@lynxthecat I was talking about https://github.com/openwrt/packages/pull/19648.
@nemesisdesign who should we ping to try to move the needle on this one?
One more major issue is that the umbim package does not support proxy mode, which is required by a lot of LTE/5G devices.
There's a commit in the umbim repo adding it, however that feature does not exist in OpenWrt 22.03.5:
https://git.openwrt.org/?p=project/umbim.git;a=commit;h=ff8d35615153086f4b89443e511907f10ff059de
One more major issue is the package umbim does not support proxy mode which is required by lot of lte/5G devices.
Please note the support for using the mbim-proxy is not related in any way to a requirement from the LTE/5G device. That commit adding support in umbim to use the mbim-proxy is so that you can run umbim commands while ModemManager is managing the device at the same time.
Has someone tried to work on the watcher process logic as suggested in https://github.com/openwrt/packages/issues/19794#issuecomment-1303601631?
A year later, in the latest OpenWrt 23.05 and on mainline, there is still no change in ModemManager: the connection is not re-established automatically after the link goes down. You have to manually click Reconnect on the network interface page for it to work properly again. Is anyone willing to think of a solution?
@feckert or @osedl, might either of you be interested in trying out an implementation for:
https://github.com/openwrt/packages/issues/19794#issuecomment-1576338502
or an alternative technique for handling reconnection?
At present, say on an ISP-initiated disconnect (not uncommon for several providers that disconnect users every 24/48 hours or otherwise), modemmanager will report to netifd and bring the interface down, but then nothing happens and internet connectivity is lost absent manual user intervention.
But naturally most users will want the interface to be brought back up on such occasions with as little interruption to connectivity as possible.
I have found an implementation from @dangowrt addressing this issue. See https://github.com/openwrt/packages/pull/20760/commits/c646b6f34b43619811597f635cc657dbb26ecf33, which might fix this issue.
But since this is a source change in ModemManager, we need to get it upstream into @aleksander0m's ModemManager repository first, so that we do not carry an out-of-tree fix for this issue.
@desolated40 it seems to me like a reasonable temporary fix is to replace the existing '10-report-down' script with something like the following:
root@OpenWrt:~# cat /usr/lib/ModemManager/connection.d/10-report-down-and-reconnect
#!/bin/sh
# Automatically report to netifd that the underlying modem
# is really disconnected and reconnect if interface was up
# require program name and at least 4 arguments
[ $# -lt 4 ] && exit 1
MODEM_PATH="$1"
BEARER_PATH="$2"
INTERFACE="$3"
STATE="$4"
[ "${STATE}" = "disconnected" ] || exit 0
. /usr/share/ModemManager/modemmanager.common
. /lib/netifd/netifd-proto.sh
INCLUDE_ONLY=1 . /lib/netifd/proto/modemmanager.sh
MODEM_STATUS=$(mmcli --modem="${MODEM_PATH}" --output-keyvalue)
[ -n "${MODEM_STATUS}" ] || exit 1
MODEM_DEVICE=$(modemmanager_get_field "${MODEM_STATUS}" "modem.generic.device")
[ -n "${MODEM_DEVICE}" ] || exit 2
CFG=$(mm_get_modem_config "${MODEM_DEVICE}")
[ -n "${CFG}" ] || exit 3
IFUP=$(ifstatus ${CFG} | jsonfilter -e '@.up')
logger -t "modemmanager" "interface ${CFG} (network device ${INTERFACE}) ${STATE}"
proto_init_update $INTERFACE 0
proto_send_update $CFG
[ "${IFUP}" = "true" ] && ifup ${CFG}
exit 0
Works for me in terms of immediately reconnecting post disconnect.
This does not work for me.
@maskimthedog you probably ought to elaborate on this if you can. What do you see in the system log?
We are using an MT7688 with BG96 over QMI, OpenWRT with Modem Manager 1.20.6 with the 10-report-down-and-reconnect script. Below quote is from my colleague performing signal degradation and modem recovery tests.
"when I add too much attenuation or disconnect the antenna completely, modem losts the connections after installing back the antenna, modem quite quickly backs to good CSQ and state "registered", but openWrt cannot recover internet connection after that.. watchcat need to trigger restart (unless I send "ifup wan") "
In what logs are you interested @lynxthecat?
The entries in 'logread' around the disconnection/reconnection attempt events.
@lynxthecat my colleague is repeating his test to get one of our routers in to this state and dumping the log. Will provide this afternoon. Watchcat rebooted the router on the last iteration...we disabled it for this one.
Good. This is a hack (hence the outstanding issue), but I think it is preferable to watchcat and should replace it rather than complement it. Polling with pings and reconnecting on loss of ping responses seems to me less desirable than acting upon disconnection events intercepted by ModemManager.
Looks like mwan3 could be interfering?