Problem with hanging reset.py and stacktrace with mmeowlink-any-pump-comms.py

Open PieterGit opened this issue 7 years ago • 6 comments

In some cases reset.py hangs. I don't yet know when it happens or what triggers it.

glucose.json newer than pumphistory: 
Error, retrying
Listening: ........................................................................................................................................................................................................Starting pump-loop at Wed Apr 19 15:19:00 CEST 2017:
Traceback (most recent call last):
  File "/usr/local/bin/mmeowlink-any-pump-comms.py", line 15, in <module>
    app.run(None)
  File "/usr/local/lib/python2.7/dist-packages/decocare/helpers/cli.py", line 113, in run
    self.prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/any_pump_comms_app.py", line 28, in prelude
    super(AnyPumpCommsApp, self).prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/base_mmeowlink_app.py", line 26, in prelude
    self.link = link = LinkBuilder().build(args.radio_type, port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/link_builder.py", line 16, in build
    return SubgRfspyLink(port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 55, in __init__
    self.open()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_interface.py", line 28, in open
    self.check_setup()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 72, in check_setup
    self.serial_rf_spy.sync()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_rf_spy.py", line 121, in sync
    raise CommsException("Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?")
mmeowlink.exceptions.CommsException: Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?
+ echo

+ echo The CC111x is located at /dev/spidev5.1
The CC111x is located at /dev/spidev5.1
+ cd /root/src/subg_rfspy/tools
+ case "$2" in
+ ./reset.py /dev/spidev5.1
2017-04-19 15:19:34,063 ERROR TimeoutExpired. Killing process
Starting pump-loop at Wed Apr 19 15:25:12 CEST 2017:
Traceback (most recent call last):
  File "/usr/local/bin/mmeowlink-any-pump-comms.py", line 15, in <module>
    app.run(None)
  File "/usr/local/lib/python2.7/dist-packages/decocare/helpers/cli.py", line 113, in run
    self.prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/any_pump_comms_app.py", line 28, in prelude
    super(AnyPumpCommsApp, self).prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/base_mmeowlink_app.py", line 26, in prelude
    self.link = link = LinkBuilder().build(args.radio_type, port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/link_builder.py", line 16, in build
    return SubgRfspyLink(port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 55, in __init__
    self.open()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_interface.py", line 28, in open
    self.check_setup()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 72, in check_setup
    self.serial_rf_spy.sync()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_rf_spy.py", line 121, in sync
    raise CommsException("Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?")
mmeowlink.exceptions.CommsException: Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?
+ echo

+ echo The CC111x is located at /dev/spidev5.1
The CC111x is located at /dev/spidev5.1
+ cd /root/src/subg_rfspy/tools
+ case "$2" in
+ ./reset.py /dev/spidev5.1
2017-04-19 15:25:45,870 ERROR TimeoutExpired. Killing process
retry 0 
... retry 0 is repeated more than 450 times (until killall kicks in)
retry 0 
Starting pump-loop at Wed Apr 19 15:40:17 CEST 2017:
Traceback (most recent call last):
  File "/usr/local/bin/mmeowlink-any-pump-comms.py", line 15, in <module>
    app.run(None)
  File "/usr/local/lib/python2.7/dist-packages/decocare/helpers/cli.py", line 113, in run
    self.prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/any_pump_comms_app.py", line 28, in prelude
    super(AnyPumpCommsApp, self).prelude(args)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/cli/base_mmeowlink_app.py", line 26, in prelude
    self.link = link = LinkBuilder().build(args.radio_type, port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/link_builder.py", line 16, in build
    return SubgRfspyLink(port)
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 55, in __init__
    self.open()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_interface.py", line 28, in open
    self.check_setup()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/subg_rfspy_link.py", line 72, in check_setup
    self.serial_rf_spy.sync()
  File "/usr/local/lib/python2.7/dist-packages/mmeowlink/vendors/serial_rf_spy.py", line 121, in sync
    raise CommsException("Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?")
mmeowlink.exceptions.CommsException: Could not get subg_rfspy state or version. Have you got the right port/device and radio_type?
+ echo

+ echo The CC111x is located at /dev/spidev5.1
The CC111x is located at /dev/spidev5.1
+ cd /root/src/subg_rfspy/tools
+ case "$2" in
+ ./reset.py /dev/spidev5.1
2017-04-19 15:40:51,314 ERROR TimeoutExpired. Killing process
retry 0 

PieterGit avatar Apr 19 '17 19:04 PieterGit

This happens with a WW pump and an Explorer board running oref0 dev. It seems that rig-pump communication stops working, which causes reset.py to hang. Rebooting the system seems to solve the issue.

@oskarpearson : can you explain what's happening with the first mmeowlink-any-pump-comms.py run and why reset.py does not finish by itself?

PieterGit avatar Apr 19 '17 19:04 PieterGit

Might this be the same issue that we worked around with https://github.com/openaps/oref0/pull/411 for non-WW pumps?

scottleibrand avatar Apr 19 '17 19:04 scottleibrand

I probably fixed the bug where oref0_subg_ww_radio_parameters.py did not kill reset.py. Still, I would rather not have to kill processes if this can be fixed within mmeowlink somehow. See https://github.com/openaps/oref0/pull/445 for a workaround (and a fix for killing a hanging process during WW initialization).
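
For illustration, the kill-on-timeout pattern that the `TimeoutExpired. Killing process` log line points to can be sketched as below. This is a minimal sketch assuming Python 3's standard `subprocess.run`; it is not the code from the pull request, and the helper name `run_with_timeout` and the 30-second value are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it had to be killed.

    Hypothetical helper: subprocess.run kills the child and waits for it
    before re-raising TimeoutExpired, so no zombie process is left behind.
    Grandchildren spawned by the child are NOT killed by this approach.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a hanging reset script: a child that sleeps too long.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    result = run_with_timeout(hanging, timeout_s=1.0)
    print("killed after timeout" if result is None else "exit code %d" % result)
```

On a rig the command would be something like `["./reset.py", "/dev/spidev5.1"]` (taken from the log above) with a timeout such as 30 seconds; that value is an assumption, not a tested setting.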

PieterGit avatar Apr 19 '17 22:04 PieterGit

@scottleibrand Yes, this might be the same issue as https://github.com/openaps/oref0/pull/411, and even the TI USB problems I had with a WW pump on a Raspberry Pi 3 seem related (both hang on reset.py). On the Pi 3, --ww_ti_usb_reset (which resets the USB subsystem) was also a workaround for this issue.

PieterGit avatar Apr 19 '17 23:04 PieterGit

@oskarpearson (or others) can you help me and explain how I can debug this mmeowlink issue? I would like to fix the root cause, so that resetting the USB subsystem or even rebooting the complete rig becomes unnecessary.

PieterGit avatar May 02 '17 22:05 PieterGit

Currently on dev there are pump-rig communication errors that don't go away after a while. @kdsimone wrote a workaround script, see https://gitter.im/nightscout/intend-to-bolus?at=5947348502c480e672598dd9

I think we must try to make mmeowlink and the pump loop more robust for such corner cases, but I need help in debugging the inner workings of mmeowlink.
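
One generic way to harden a loop against a failing call is a bounded retry with exponential backoff, instead of an unbounded `retry 0` loop like the one in the log above. The sketch below is hypothetical: `retry_with_backoff` and its parameters are illustrative names, not part of mmeowlink or oref0, and real code would catch a specific exception such as `mmeowlink.exceptions.CommsException` rather than `Exception`.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay_s=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Hypothetical helper. The delay doubles after each failure
    (base, 2*base, 4*base, ...). The last error is re-raised once the
    attempts are exhausted, so the caller can escalate (for example to a
    radio reset) instead of looping forever.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as error:  # assumption: narrow this in real code
            last_error = error
            if attempt < max_attempts - 1:
                sleep(base_delay_s * (2 ** attempt))
    raise last_error
```

The `sleep` parameter exists only so the delay can be replaced in tests; by default the helper uses `time.sleep`.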

@oskarpearson can you help look into these rig-pump communication problems?

PieterGit avatar Jun 19 '17 07:06 PieterGit