POCS
Poor handling of serial I/O failure with Astrohaven dome
pocs_shell output:
POCS > run_pocs
Starting messaging
Command publisher started on port 6500
Command subscriber started on port 6501
Message subscriber started on port 6511
Message publisher started on port 6510
Starting POCS - Press Ctrl-c to interrupt
POCS stopped.
POCS > open_dome
Problem opening the dome: read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
POCS > exit
Shutting down POCS instance, please wait
Traceback (most recent call last):
File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/site-packages/serial/serialposix.py", line 501, in read
'device reports readiness to read but returned no data '
serial.serialutil.SerialException: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "bin/pocs_shell", line 899, in <module>
PocsShell().cmdloop()
File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/cmd.py", line 138, in cmdloop
stop = self.onecmd(line)
File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/cmd.py", line 217, in onecmd
return func(arg)
File "bin/pocs_shell", line 256, in do_exit
self.do_power_down()
File "bin/pocs_shell", line 332, in do_power_down
self.pocs.power_down()
File "/var/panoptes/POCS/pocs/core.py", line 227, in power_down
if not self.observatory.close_dome():
File "/var/panoptes/POCS/pocs/observatory.py", line 535, in close_dome
if not self.dome.is_closed:
File "/var/panoptes/POCS/pocs/dome/astrohaven.py", line 86, in is_closed
v = self._read_latest_state()
File "/var/panoptes/POCS/pocs/dome/astrohaven.py", line 128, in _read_latest_state
data = self.serial.read_bytes(size=1)
File "/var/panoptes/POCS/pocs/utils/rs232.py", line 189, in read_bytes
return self.ser.read(size=size)
File "/home/panoptes/anaconda3/envs/panoptes-env/lib/python3.6/site-packages/serial/serialposix.py", line 509, in read
raise SerialException('read failed: {}'.format(e))
serial.serialutil.SerialException: read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
pocs_shell-all.log:
D0226 23:23:03.693 machine.py:209 Checking safety for get_ready
D0226 23:23:03.693 machine.py:214 Always safe to move to get_ready
D0226 23:23:03.693 machine.py:253 Before calling get_ready from sleeping state
I0226 23:23:10.865 messaging.py:134 PANCHAT Ok, I'm all set up and ready to go!
D0226 23:23:10.865 abstract_serial_dome.py:70 Already connected to dome
I0226 23:23:13.866 observatory.py:522 Opening dome
W0226 23:23:13.974 machine.py:134 Problem going from ready to ready, exiting loop [SerialException('read failed: device reports readiness to read but returned no data (device disconnected or multiple access on port?)',)]
I0226 23:23:13.979 machine.py:181 Stopping POCS states
D0226 23:40:17.235 core.py:351 Checking weather safety
D0226 23:40:17.240 core.py:371 Weather Safety: True [26 sec old - 2018-02-26 23:39:50.972000]
D0226 23:40:17.240 abstract_serial_dome.py:70 Already connected to dome
I0226 23:40:45.160 messaging.py:134 PANCHAT I'm powering down
I0226 23:40:45.160 core.py:225 Shutting down POCS State Machine: , please be patient and allow for exit.
D0226 23:40:45.161 abstract_serial_dome.py:70 Already connected to dome
FYI, the USB serial adapter was present in Linux, but no I/O was possible. I suspect that I needed the ability to reset the specific USB device, but didn't have a program at hand for that (I now do). So, I rebooted the computer, after which POCS started up automatically and was able to open the dome.
We MAY want to add the ability to detect when a failing serial device is a USB device and, if so, disconnect, reset the underlying device, and reconnect. I suppose that dmesg might be used to look for errors.
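A minimal sketch of the disconnect, reset, and reconnect idea, not the POCS implementation. It assumes the port object has read(size), close() and open() methods, which is the shape of pyserial's serial.Serial, and it relies on pyserial's SerialException deriving from IOError (OSError in Python 3). The names read_with_reconnect and reset_hook are hypothetical.

```python
import time


def read_with_reconnect(port, size=1, retries=2, delay=0.5, reset_hook=None):
    """Read `size` bytes, reconnecting the port whenever a read raises OSError.

    `port` is any object with read(size), close() and open() methods.
    `reset_hook` is a hypothetical callable that resets the underlying
    USB device between close() and open(); it is optional.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return port.read(size)
        except OSError as e:  # serial.SerialException is a subclass of OSError
            last_error = e
            if attempt == retries:
                break
            try:
                port.close()
            except OSError:
                pass  # the port may already be unusable; keep going
            if reset_hook is not None:
                reset_hook()
            time.sleep(delay)
            port.open()  # an error here propagates to the caller
    raise last_error
```

If every attempt fails, the last error is re-raised, so the caller can still decide whether to stop the state machine or report the fault.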
What did you use to reset the USB device? I had a little utility (I think usbreset.c) that I had used, but it didn't always seem to do what I wanted. This was on PAN001.
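For reference, the usual usbreset.c approach is a single USBDEVFS_RESET ioctl on the device node, and the same call can be sketched from Python. This is an illustration only: the function name reset_usb_device is hypothetical, the constant below is the _IO('U', 20) value as encoded on x86 and ARM Linux, the caller needs write permission on the node, and the example path in the comment is made up.

```python
import fcntl
import os

# USBDEVFS_RESET from <linux/usbdevice_fs.h>, i.e. _IO('U', 20).
# This encoding (0x5514) holds on x86 and ARM; other architectures may differ.
USBDEVFS_RESET = (ord("U") << 8) | 20


def reset_usb_device(device_path):
    """Ask the kernel to reset one USB device.

    `device_path` is the device node, for example /dev/bus/usb/001/004
    (a made-up example; look up the real bus and device numbers with lsusb).
    """
    fd = os.open(device_path, os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)
```

After a reset the adapter may re-enumerate, so the serial port should be closed before the call and reopened afterwards.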
I checked on PAN006 this morning, and found that it had left the clamshell dome half open (i.e. one of the two shutters was open). The log showed that it was encountering an error while trying to read the feedback from the dome controller (i.e. getting an empty array back instead of an array with a byte in it). I hit Ctrl-C in pocs_shell to break out of the loop, then entered the close_dome command. The dome didn't close. I then entered the exit command, at which point it did close the dome. Hmm.
It isn't clear from reading the logs that the problem this time was the same as the one previously reported; instead it appears that the controller is not reporting back the values that I was anticipating, or at least the timing is wrong. I may need to add the ability to record the responses and their timings in order to debug this properly.
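One way to record responses and their timings is a thin wrapper around the read call. This is a sketch under assumptions, not existing POCS code: TimedReadRecorder is a hypothetical name, and read_func stands in for whatever function actually reads from the dome controller.

```python
import time
from collections import deque


class TimedReadRecorder:
    """Wrap a read function and keep a rolling record of its responses.

    Each record is (seconds since start, seconds the read took, data).
    `read_func` is a stand-in for the real serial read (hypothetical).
    """

    def __init__(self, read_func, maxlen=1000, clock=time.monotonic):
        self._read = read_func
        self._clock = clock
        self._start = clock()
        self.records = deque(maxlen=maxlen)  # oldest entries drop off first

    def read(self, size=1):
        began = self._clock()
        data = self._read(size)
        ended = self._clock()
        self.records.append((began - self._start, ended - began, data))
        return data

    def dump(self):
        """Return the records as printable lines, oldest first."""
        return ["t=+{:.3f}s took={:.3f}s data={!r}".format(t, d, data)
                for t, d, data in self.records]
```

Empty reads are recorded too, so a run of empty responses and the gaps between them would show up directly in the dump.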
Update: I now think this is caused by a fault in the shutter B position sensor(s). It appears that the sensor is not reporting when the shutter has closed, so we keep sending the close signal until the timeout is reached (thank goodness for that). I chatted with Bob Moore, who works with the same type of Astrohaven 7' dome, and he said they had the same problem because the sensor is secured with double-sided tape that fails.
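The behaviour described above (keep sending the close signal until the sensor reports closed or a timeout is reached) can be sketched as below. This is an illustration, not the code in astrohaven.py: move_until, send_command and is_done are hypothetical names, and the timeout and interval values are arbitrary.

```python
import time


def move_until(send_command, is_done, timeout=10.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Repeat send_command() until is_done() is true or `timeout` seconds pass.

    Returns True if is_done() reported success, False if the timeout was hit,
    which is what happens when a position sensor never reports "closed".
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if is_done():
            return True
        send_command()
        sleep(interval)
    return is_done()  # one last check after the deadline
```

A False return is the case worth surfacing loudly, since with a failed sensor the shutter may in fact be closed even though the software cannot confirm it.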
Closing as stale. Re-open as needed.