Poor handling of serial I/O failure with Astrohaven dome

Open jamessynge opened this issue 7 years ago • 4 comments

pocs_shell output:

POCS > run_pocs
Starting messaging
Command publisher started on port 6500
Command subscriber started on port 6501
Message subscriber started on port 6511
Message publisher started on port 6510
Starting POCS - Press Ctrl-c to interrupt
POCS stopped.
POCS > open_dome
Problem opening the dome: read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
POCS > exit
Shutting down POCS instance, please wait
Traceback (most recent call last):
  File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/site-packages/serial/serialposix.py", line 501, in read
    'device reports readiness to read but returned no data '
serial.serialutil.SerialException: device reports readiness to read but returned no data (device disconnected or multiple access on port?)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bin/pocs_shell", line 899, in <module>
    PocsShell().cmdloop()
  File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/cmd.py", line 138, in cmdloop
    stop = self.onecmd(line)
  File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/cmd.py", line 217, in onecmd
    return func(arg)
  File "bin/pocs_shell", line 256, in do_exit
    self.do_power_down()
  File "bin/pocs_shell", line 332, in do_power_down
    self.pocs.power_down()
  File "/var/panoptes/POCS/pocs/core.py", line 227, in power_down
    if not self.observatory.close_dome():
  File "/var/panoptes/POCS/pocs/observatory.py", line 535, in close_dome
    if not self.dome.is_closed:
  File "/var/panoptes/POCS/pocs/dome/astrohaven.py", line 86, in is_closed
    v = self._read_latest_state()
  File "/var/panoptes/POCS/pocs/dome/astrohaven.py", line 128, in _read_latest_state
    data = self.serial.read_bytes(size=1)
  File "/var/panoptes/POCS/pocs/utils/rs232.py", line 189, in read_bytes
    return self.ser.read(size=size)
  File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/site-packages/serial/serialposix.py", line 509, in read
    raise SerialException('read failed: {}'.format(e))
serial.serialutil.SerialException: read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)

pocs_shell-all.log:

D0226 23:23:03.693 machine.py:209       Checking safety for get_ready
D0226 23:23:03.693 machine.py:214       Always safe to move to get_ready
D0226 23:23:03.693 machine.py:253       Before calling get_ready from sleeping state
I0226 23:23:10.865 messaging.py:134     PANCHAT Ok, I'm all set up and ready to go!
D0226 23:23:10.865 abstract_serial_dome.py:70 Already connected to dome
I0226 23:23:13.866 observatory.py:522   Opening dome
W0226 23:23:13.974 machine.py:134       Problem going from ready to ready, exiting loop [SerialException('read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)',)]
I0226 23:23:13.979 machine.py:181       Stopping POCS states
D0226 23:40:17.235 core.py:351          Checking weather safety
D0226 23:40:17.240 core.py:371          Weather Safety: True [26 sec old - 2018-02-26 23:39:50.972000]
D0226 23:40:17.240 abstract_serial_dome.py:70 Already connected to dome
I0226 23:40:45.160 messaging.py:134     PANCHAT I'm powering down
I0226 23:40:45.160 core.py:225          Shutting down POCS State Machine: , please be patient and allow for exit.
D0226 23:40:45.161 abstract_serial_dome.py:70 Already connected to dome

jamessynge avatar Feb 26 '18 23:02 jamessynge

FYI, the USB serial adapter was present in Linux, but no I/O was possible. I suspect that I needed the ability to reset the specific USB device, but didn't have a program at hand for that (I now do). So, I rebooted the computer, after which POCS started up automatically and was able to open the dome.

We MAY want to add the ability to detect when a failing serial device is a USB device, and if so disconnect, reset the underlying device, and reconnect. I suppose that dmesg might be used to look for errors.
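A minimal sketch of the disconnect, reset, and reconnect idea, not POCS code. It assumes a pyserial-style port object with `read`, `close`, and `open` methods, and relies on pyserial's `SerialException` deriving from `IOError` (so `OSError` catches it). The function name `read_with_recovery` and the `reset_device` callback are hypothetical.

```python
import time


def read_with_recovery(port, size=1, reset_device=None, retries=2, delay=0.5):
    """Read `size` bytes; on an I/O error, close, optionally reset, and reopen.

    `port` is assumed to behave like serial.Serial. `reset_device` is a
    hypothetical callback that resets the underlying USB device, if any.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return port.read(size)
        except OSError as e:  # also catches pyserial's SerialException
            last_error = e
            if attempt == retries:
                break
            try:
                port.close()
            except OSError:
                pass  # the port may already be unusable; keep going
            if reset_device is not None:
                reset_device()
            time.sleep(delay)
            port.open()
    raise last_error
```

With this shape, a caller such as a dome driver still sees the original exception if every attempt fails, so the failure is never silently swallowed.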

jamessynge avatar Feb 28 '18 21:02 jamessynge

What did you use to reset the USB device? I had a little utility (I think usbreset.c) that I had used but it didn't seem to always do what I wanted. This was on PAN001.
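For reference, a sketch of what small USB reset utilities on Linux commonly do: issue the `USBDEVFS_RESET` ioctl on the device node under `/dev/bus/usb`. This is an assumption about the general technique, not a description of the specific tool used here. The function name is hypothetical, the call is Linux-only, and it usually needs root.

```python
import fcntl
import os

# _IO('U', 20) from <linux/usbdevice_fs.h>
USBDEVFS_RESET = (ord("U") << 8) | 20


def reset_usb_device(dev_path):
    """Ask the kernel to reset the USB device at dev_path.

    dev_path is a node such as /dev/bus/usb/001/004 (illustrative values;
    the real bus and device numbers can be found with lsusb).
    """
    fd = os.open(dev_path, os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)
```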

wtgee avatar Feb 28 '18 21:02 wtgee

I checked on PAN006 this morning, and found that it had left the clamshell dome half open (i.e. one of the two shutters was open). The log showed that it was encountering an error while trying to read the feedback from the dome controller (i.e. getting an empty array back instead of an array with a byte in it). I hit Ctrl-C in pocs_shell to break out of the loop, then entered the close_dome command. The dome didn't close. I then entered the exit command, at which point it did close the dome. Hmm.

It isn't clear from reading the logs that the problem this time was the same as previously reported; instead it appears that the controller is not reporting back the values that I was anticipating, or at least the timing is wrong. I may need to add the ability to record the responses and their timings in order to be able to properly debug this.
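One possible way to record responses and their timings is a thin wrapper around the port's `read` method. This is only a sketch: `RecordingReader` is a hypothetical name, and the wrapped object is assumed to expose a pyserial-style `read(size)`.

```python
import time


class RecordingReader:
    """Wrap a port and log every read as (seconds since start, bytes)."""

    def __init__(self, port, clock=time.monotonic):
        self.port = port
        self.clock = clock
        self.records = []
        self._start = clock()

    def read(self, size=1):
        data = self.port.read(size)
        # Empty reads are recorded too, since they are the symptom of interest.
        self.records.append((self.clock() - self._start, bytes(data)))
        return data
```

Dumping `records` after a failed open or close would show both what the controller sent and how the replies were spaced.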

jamessynge avatar Mar 31 '18 13:03 jamessynge

Update: I'm now thinking that this problem is due to a problem with the shutter B position sensor(s). It appears that it is not reporting when the shutter has closed, so we keep sending the signal to close until the timeout is reached (thank goodness for that). I chatted with Bob Moore, who works with the same type of Astrohaven 7' dome, and he said they had the same problem because the sensor is secured with double-sided tape that fails.
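The behavior described above, repeating the close signal until the sensor reports closed or a timeout expires, can be sketched roughly as follows. This is not the actual POCS driver: `send_close` and `is_closed` are hypothetical callbacks standing in for the real serial command and sensor feedback, and the timeout values are arbitrary.

```python
import time


def close_with_timeout(send_close, is_closed, timeout=10.0, interval=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Repeat the close signal until the sensor reports closed or time runs out.

    Returns True if the sensor reported closed, False on timeout.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if is_closed():
            return True
        send_close()
        sleep(interval)
    return is_closed()
```

If the position sensor has failed, `is_closed` never returns True, so the loop ends only because of the deadline and the caller gets False even if the shutter is physically closed.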

jamessynge avatar Apr 08 '18 18:04 jamessynge

Closing as stale. Re-open as needed.

wtgee avatar Mar 28 '24 23:03 wtgee