ophyd
ophyd copied to clipboard
Add way to escape 'Another set in progress'.
See #757. This is one possible approach.
TO DO
- [x] Needs broad input about whether this is the right way to address #757
- [ ] Needs tests
- [x] As written this leaves a polling thread dangling. To fix that properly, we'd need
set_and_wait
to accept a poison pillthreading.Event
or some other means of signaling that it should stop waiting.
So far, this PR makes a clean way to dump the thread in the same way we do now. We need to terminate those dangling threads. As is, console session does not exit cleanly, requiring one or more ^C to fully exit the process.
Good point, @prjemian. I took it further in this next commit. I tested against this pathological caproto IOC which takes forever to set:
#!/usr/bin/env python3
from caproto.server import pvproperty, PVGroup, ioc_arg_parser, run
class Stubborn(PVGroup):
"""
When a PV is written to, write the new value into a file as a string.
"""
async def my_write(self, instance, value):
print(f"Ignored request to write {value}")
return 0
A = pvproperty(put=my_write, value=0)
B = pvproperty(put=my_write, value=0)
if __name__ == '__main__':
ioc_options, run_options = ioc_arg_parser(
default_prefix='stubborn:',
desc='Run an IOC with PVs that ignores write requests')
ioc = Stubborn(**ioc_options)
run(ioc.pvdb, **run_options)
Before, console does not exit cleanly:
In [1]: from ophyd import EpicsSignal
In [2]: s = EpicsSignal("stubborn:A", name='s')
In [3]: s.set(1)
Out[3]: Status(obj=EpicsSignal(read_pv='stubborn:A', name='s', timestamp=1598986036.873889, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False), done=False, success=False)
In [4]: _.done
Out[4]: False
In [5]: s.set(2)
<snipped>
RuntimeError: Another set() call is still in progress for s. If this is due to some transient failure, verify that the device is configured the way you expect, and use clear_set() to ignore and abandon the previous set() operation.
In [6]: s.clear_set()
/home/dallan/Repos/bnl/ophyd/ophyd/signal.py:318: UserWarning: A previous set() operaiton is being ignored. Only this do this when debugging or recovering from a hardware failure.
"A previous set() operaiton is being ignored. Only this do this "
In [7]: exit
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 269, in set_thread
rtol=self.rtolerance)
File "/home/dallan/Repos/bnl/ophyd/ophyd/utils/epics_pvs.py", line 263, in set_and_wait
current_value = signal.get()
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 1108, in get
self._read_pv, timeout, connection_timeout, as_string, form)
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 1066, in _get_with_timeout
info = pv.get_with_metadata(as_string=as_string, form=form, timeout=timeout)
File "/home/dallan/.conda/envs/py37/lib/python3.7/site-packages/epics/pv.py", line 46, in wrapped
raise RuntimeError('Expected CA context is unset')
RuntimeError: Expected CA context is unset
After, clear_set()
causes an AbandonedSet
exception to be raised in the in the background thread and waits for the thread to join()
before returning, so now the console exits cleanly.
In [1]: from ophyd import EpicsSignal
In [2]: s = EpicsSignal("stubborn:A", name='s')
In [3]: s.set(1)
Out[3]: Status(obj=EpicsSignal(read_pv='stubborn:A', name='s', value=1, timestamp=1598986283.0511467, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False), done=False, success=False)
In [4]: _.done
Out[4]: False
In [5]: s.set(2)
<snipped>
RuntimeError: Another set() call is still in progress for s. If this is due to some transient failure, verify that the device is configured the way you expect, and use clear_set() to ignore and abandon the previous set() operation.
In [6]: s.clear_set()
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 272, in set_thread
rtol=self.rtolerance, poison_pill=poison_pill)
File "/home/dallan/Repos/bnl/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet
ophyd.utils.epics_pvs.AbandonedSet
/home/dallan/Repos/bnl/ophyd/ophyd/signal.py:328: UserWarning: A previous set() operaiton is being ignored. Only this do this when debugging or recovering from a hardware failure.
"A previous set() operaiton is being ignored. Only this do this "
In [7]: exit # no errors at exit
I think this kind of event wait poison pill approach is great for killing these background set threads. We've run into this a few times at SLAC, but in our case this indicated a problem with either the IOC or the device class- can't quite remember the details.
Do you think signals should automatically call this at python exit? It might smooth out the shutdown of some of the ipython sessions.
We've run into this a few times at SLAC, but in our case this indicated a problem with either the IOC or the device class-
Yes, I think that will be the case in general. My only reservation about adding this is that it might be used routinely, rather than as intended role: allowing you to salvage the ophyd object instance but prompting you to look into the device or IOC issue. But, the current situation, where if you have bad failed set you have to remake your instance or (as is common in practice) restart your whole process, seem worse.
Do you think signals should automatically call this at python exit? It might smooth out the shutdown of some of the ipython sessions.
Great suggestion. Thanks. Will add now....
Will add now....
... realizing that the current implementation will be changed from the current technique of putting black tape over the blinking red light. (That is, it sets the object reference to None
.)
Before, unclean exit because _set_thread
is dangling:
In [1]: from ophyd import EpicsSignal
In [2]: s = EpicsSignal("stubborn:A", name='s')
In [3]: s.set(1)
Out[3]: Status(obj=EpicsSignal(read_pv='stubborn:A', name='s', timestamp=1599074325.10357, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False), done=False, success=False)
In [4]: _.done
Out[4]: False
In [5]: exit
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 272, in set_thread
rtol=self.rtolerance, poison_pill=poison_pill)
File "/home/dallan/Repos/bnl/ophyd/ophyd/utils/epics_pvs.py", line 274, in set_and_wait
current_value = signal.get()
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 1117, in get
self._read_pv, timeout, connection_timeout, as_string, form)
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 1075, in _get_with_timeout
info = pv.get_with_metadata(as_string=as_string, form=form, timeout=timeout)
File "/home/dallan/.conda/envs/py37/lib/python3.7/site-packages/epics/pv.py", line 46, in wrapped
raise RuntimeError('Expected CA context is unset')
RuntimeError: Expected CA context is unset
After, clear_set
called at exit by a weakref.finalizer
gives a clear error message:
In [5]: exit
Traceback (most recent call last):
File "/home/dallan/Repos/bnl/ophyd/ophyd/signal.py", line 274, in set_thread
rtol=self.rtolerance, poison_pill=poison_pill)
File "/home/dallan/Repos/bnl/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1599075154.647881, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
My only reservation about adding this is that it might be used routinely
This is a very fair concern. In the past we've talked a lot about not leaving traps for our users.
I think this PR does a good job of making the failure at least very verbose with the warnings and exceptions raised, hopefully verbose enough that someone investigates the root cause of the issue rather than patching over it indefinitely.
In the context of "make it work for the user experiment" though, I think it's important to provide tools like this, else we end up doing some sort of kluge that makes the experiment work but is otherwise an accrual of technical debt. I'll take a standard bandaid over roll-your-own bandaid any day of the week.
Well said.
@danielballan : Your comment above shows exactly the same sort of error I was seeing recently. In that comment from USAXS, there is an attached text file with the thousands of lines output from the error dump on exiting the console session.
In a loop (with bps.repeat), I ran a plan that included six different bp.scan runs. Four of the scans involve a motor with the .STUP
field as an added signal:
class GuardSlitMotor(EpicsMotor):
status_update = Component(EpicsSignal, ".STUP")
To clear the problems previously reported, I include at the end of a set of scans a call to the yield from bps.abs_set(MOTOR.status_update, 1)
. The EPICS motor record receives that, does the status update in the motor record, and then sets .STUP
back to zero. Ophyd is left waiting on the status_update
signal so the status object for the .set()
is waiting in a thread. To kill this thread, I call MOTOR.status_update._set_thread = None
on each of these (4) motors. After some repetitions of this set of 6 scans, the time between starts of the set becomes noticeably longer as well as start times of each scan become longer, too. Observe in a plot that the longer this problem went, the slower the scan repeated itself.
I'm concerned that the fix of MOTOR.status_update._set_thread = None
is not enough. I assume that what I see shown in the plot (cited in the details) as increasingly longer times to repeat the same plan is due to accumulated resource leakage with these abandoned threads.
@prjemian unless I'm misreading something here, this PR ensures that the set thread will die using a threading.Event
to send it a kill signal, either when requested by the user or upon finalization. So I think this fix as implemented will help with console exit error dump quite a bit.
Right. In the earliest draft of this PR I just dropped the reference to the thread (“put black tape over the red warning light”) to prompt discussion on the question “Do we even want to expose this API?” But in the current draft I first ensure the thread will stop, as @ZLLentz describes, and only then drop the reference.
Ah, I see the additional code now. Should I patch locally and give it a try? Seems I have a stress test demonstrated above.
That would be great!
Ok, NeXus conferences are done, back in the Bluesky saddle again. Tested with caproto IOC per above using this test code:
from ophyd import EpicsSignal
s = EpicsSignal("stubborn:A", name='s')
st = s.set(1)
st.done
st = s.set(2)
st.done
exit
- Verified same result when using ophyd/master branch (1.5.4) that same result was received:
- Using
danielballan:escape-another-set-in-progress
branch, did not get a clean exit:
(bluesky_2020_9) prjemian@poof ~/.../Bluesky/ophyd $ git branch
* escape-another-set-in-progress
master
(bluesky_2020_9) prjemian@poof ~/.../Bluesky/ophyd $ ipython
Python 3.8.2 (default, Mar 26 2020, 15:53:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.16.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from ophyd import EpicsSignal
...:
...: s = EpicsSignal("stubborn:A", name='s')
...: st = s.set(1)
...: st.done
...: st = s.set(2)
...: st.done
...: exit
...:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-1-548872e6ae6b> in <module>
4 st = s.set(1)
5 st.done
----> 6 st = s.set(2)
7 st.done
8 exit
~/Documents/projects/Bluesky/ophyd/ophyd/signal.py in set(self, value, timeout, settle_time)
1706 '''
1707 if not self._put_complete:
-> 1708 return super().set(value, timeout=timeout, settle_time=settle_time)
1709
1710 # using put completion:
~/Documents/projects/Bluesky/ophyd/ophyd/signal.py in set(self, value, timeout, settle_time)
305
306 if self._set_thread is not None:
--> 307 raise RuntimeError(
308 f"Another set() call is still in progress for {self.name}. "
309 "If this is due to some transient failure, verify that the "
RuntimeError: Another set() call is still in progress for s. If this is due to some transient failure, verify that the device is configured the way you expect, and use clear_set() to ignore and abandon the previous set() operation.
In [2]: st.done
...: exit
...:
Out[2]: <IPython.core.autocall.ExitAutocall at 0x7f2acb4969a0>
In [3]: exit
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1603738132.896743, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
(bluesky_2020_9) prjemian@poof ~/.../Bluesky/ophyd $
FYI
(bluesky_2020_9) prjemian@poof ~/.../Bluesky/ophyd $ conda list ophyd
# packages in environment at /home/prjemian/Apps/anaconda/envs/bluesky_2020_9:
#
# Name Version Build Channel
ophyd 1.5.3.post4+g14a4fd4 dev_0 <develop>
But, adding the call to clear_set()
after the failed set()
allows clean exit:
~/Documents/projects/Bluesky/ophyd/ophyd/signal.py in set(self, value, timeout, settle_time)
305
306 if self._set_thread is not None:
--> 307 raise RuntimeError(
308 f"Another set() call is still in progress for {self.name}. "
309 "If this is due to some transient failure, verify that the "
RuntimeError: Another set() call is still in progress for s. If this is due to some transient failure, verify that the device is configured the way you expect, and use clear_set() to ignore and abandon the previous set() operation.
In [2]: s.clear_set()
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1603738591.861509, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py:334: UserWarning: A previous set() operation is being ignored. Only this do this when debugging or recovering from a hardware failure.
warnings.warn(
In [3]: exit
So, this code is noisy (can't suppress the comments) and it goes against the advice provided, BUT it continues to operate past the reported problems:
from ophyd import EpicsSignal
from ophyd.utils.epics_pvs import AbandonedSet
s = EpicsSignal("stubborn:A", name='s')
for cycle in range(3):
st = s.set(1)
print(f"cycle {cycle}, status: {st.done}")
s.clear_set()
print(f"cycle {cycle}, status: {st.done}")
s.clear_set()
exit()
ipython session:
(bluesky_2020_9) prjemian@poof /tmp $ ipython
Python 3.8.2 (default, Mar 26 2020, 15:53:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.16.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from ophyd import EpicsSignal
...: from ophyd.utils.epics_pvs import AbandonedSet
...:
...: s = EpicsSignal("stubborn:A", name='s')
...:
...: for cycle in range(3):
...: st = s.set(1)
...: print(f"cycle {cycle}, status: {st.done}")
...:
...: s.clear_set()
...: print(f"cycle {cycle}, status: {st.done}")
...:
...: s.clear_set()
...:
...: exit()
...:
cycle 0, status: False
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1603747329.535178, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py:334: UserWarning: A previous set() operation is being ignored. Only this do this when debugging or recovering from a hardware failure.
warnings.warn(
cycle 0, status: True
cycle 1, status: False
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1603747329.537657, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
cycle 1, status: False
cycle 2, status: False
set_and_wait(value=1, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='stubborn:A', name='s', value=0, timestamp=1603747329.539203, auto_monitor=False, string=False, write_pv='stubborn:A', limits=False, put_complete=False) was set to 1 but it does not seem to have finished. We are no longer watching for it.
cycle 2, status: False
(bluesky_2020_9) prjemian@poof /tmp $
Now for the stress test.
Code
#!/usr/bin/env python3
from ophyd import EpicsSignal
from ophyd.utils.epics_pvs import AbandonedSet
import time
s = EpicsSignal("sky:m1.STUP", name='s', string=True)
for cycle in range(3):
print(">"*40)
st = s.set("ON")
print(f"cycle {cycle}, status: {st.done}")
time.sleep(0.05)
s.clear_set()
print(f"cycle {cycle}, status: {st.done}")
print("<"*40)
s.clear_set()
exit()
without s.clear_set()
(bluesky_2020_9) prjemian@poof /tmp $ ./test_motor.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cycle 0, status: False
cycle 0, status: False
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Traceback (most recent call last):
File "./test_motor.py", line 11, in <module>
st = s.set("ON")
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 1708, in set
return super().set(value, timeout=timeout, settle_time=settle_time)
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 307, in set
raise RuntimeError(
RuntimeError: Another set() call is still in progress for s. If this is due to some transient failure, verify that the device is configured the way you expect, and use clear_set() to ignore and abandon the previous set() operation.
set_and_wait(value=ON, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='sky:m1.STUP', name='s', value='OFF', timestamp=1603748190.873643, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False) was set to ON but it does not seem to have finished. We are no longer watching for it.
with s.clear_set()
(bluesky_2020_9) prjemian@poof /tmp $ ./test_motor.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cycle 0, status: False
set_and_wait(value=ON, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='sky:m1.STUP', name='s', value='OFF', timestamp=1603748220.059571, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False) was set to ON but it does not seem to have finished. We are no longer watching for it.
/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py:334: UserWarning: A previous set() operation is being ignored. Only this do this when debugging or recovering from a hardware failure.
warnings.warn(
cycle 0, status: False
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cycle 1, status: False
set_and_wait(value=ON, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='sky:m1.STUP', name='s', value='BUSY', timestamp=1603748220.081061, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False) was set to ON but it does not seem to have finished. We are no longer watching for it.
cycle 1, status: False
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
cycle 2, status: False
CA.Client.Exception...............................................
Warning: "Channel write request failed"
Context: "op=1, channel=sky:m1.STUP, type=DBR_ENUM, count=1, ctx="sky:m1""
Source File: ../oldChannelNotify.cpp line 158
Current Time: Mon Oct 26 2020 16:37:00.132979121
..................................................................
set_and_wait(value=ON, timeout=None, atol=None, rtol=None)
Traceback (most recent call last):
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/signal.py", line 273, in set_thread
set_and_wait(self, value, timeout=timeout, atol=self.tolerance,
File "/home/prjemian/Documents/projects/Bluesky/ophyd/ophyd/utils/epics_pvs.py", line 271, in set_and_wait
raise AbandonedSet(
ophyd.utils.epics_pvs.AbandonedSet: The signal EpicsSignal(read_pv='sky:m1.STUP', name='s', value='BUSY', timestamp=1603748220.081061, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False) was set to ON but it does not seem to have finished. We are no longer watching for it.
cycle 2, status: True
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Conclusion
The stress test passes with this addition. It is definitely noisy (lots of unpreventable console output). But the primary problem is solved.
Calling set()
with an explicit timeout is less noisy:
In [30]: st = s.set("ON", timeout=0.1)
set_and_wait(value=ON, timeout=0.1, atol=None, rtol=None)
In [31]: st = s.set("ON", timeout=0.1)
set_and_wait(value=ON, timeout=0.1, atol=None, rtol=None)
In [32]: st = s.set("ON", timeout=0.1)
set_and_wait(value=ON, timeout=0.1, atol=None, rtol=None)
In the why didn't I think of that sooner category, isn't setting a timeout parameter the right way? Can the timeout=0.05
kwarg be used with bps.mv
?
def plan9(signal):
for _itr in range(3):
try:
yield from bps.mv(signal, "ON", timeout=0.01)
except FailedStatus:
pass
Yup, it works (if the exception is trapped):
In [13]: RE(plan9(s1))
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
CA.Client.Exception...............................................
Warning: "Channel write request failed"
Context: "op=1, channel=sky:m1.STUP, type=DBR_ENUM, count=1, ctx="sky:m1""
Source File: ../oldChannelNotify.cpp line 158
Current Time: Mon Oct 26 2020 17:22:34.108184434
..................................................................
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
Out[13]: ()
The result can be quieter by suppressing the warning-level messages from the signal:
In [24]: s1.log.setLevel("ERROR")
In [25]: RE(plan9(s1))
CA.Client.Exception...............................................
Warning: "Channel write request failed"
Context: "op=1, channel=sky:m1.STUP, type=DBR_ENUM, count=1, ctx="sky:m1""
Source File: ../oldChannelNotify.cpp line 158
Current Time: Mon Oct 26 2020 17:28:45.578939646
..................................................................
Out[25]: ()
In [26]: s1.log.setLevel("WARNING")
In [27]: RE(plan9(s1))
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
CA.Client.Exception...............................................
Warning: "Channel write request failed"
Context: "op=1, channel=sky:m1.STUP, type=DBR_ENUM, count=1, ctx="sky:m1""
Source File: ../oldChannelNotify.cpp line 158
Current Time: Mon Oct 26 2020 17:28:56.273481707
..................................................................
set_and_wait(value=ON, timeout=0.01, atol=None, rtol=None)
Out[27]: ()
In [28]:
@danielballan @tacaswell
I'm sensing victory here by adding a timeout
kwarg to the bps.mv()
calls for metadata. We can suppress warnings (via signal.log.setLevel()
) if they become a nuisance.
But now I ask the question why the timeout
kwarg in EpicsSignal .set()
method does not default to the .timeout
value? Looking here: https://github.com/bluesky/ophyd/blob/3dddf111946c1eb7200361d2158445db2acfb76d/ophyd/signal.py#L1719
This .set()
will default the timeout
kwarg to None
. Since I've set the default timeout for the signal to be 10 s, I was expecting s.timeout
to be 10 (and it is).
(bluesky_2020_9) prjemian@poof /tmp $ ipython
Python 3.8.2 (default, Mar 26 2020, 15:53:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.16.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from ophyd.signal import EpicsSignalBase
In [2]: EpicsSignalBase.set_default_timeout(timeout=10, connection_timeout=5)
In [3]: from ophyd import EpicsSignal
In [4]: s = EpicsSignal("sky:m1.STUP", name='s', string=True)
In [5]: s.timeout
Out[5]: 10
Now, try st = s.set("ON")
and wait 15 seconds (5 seconds beyond timeout) before printing the status. Expecting it be done.
import time
st = s.set("ON")
time.sleep(15)
print(st)
Expecting done
, but no.
In [6]: import time
...: st = s.set("ON")
...: time.sleep(15)
...: print(st)
...:
Status(obj=EpicsSignal(read_pv='sky:m1.STUP', name='s', value='OFF', timestamp=1603771932.302131, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False), done=False, success=False)
Repeat but use a timeout=10
kwarg with .set()
in a new ipython session:
In [1]: from ophyd import EpicsSignal
...: import time
...: s = EpicsSignal("sky:m1.STUP", name='s', string=True)
...: st = s.set("ON", timeout=10)
...: time.sleep(15)
...: print(st)
...:
set_and_wait(value=ON, timeout=10, atol=None, rtol=None)
Status(obj=EpicsSignal(read_pv='sky:m1.STUP', name='s', value='OFF', timestamp=1603772109.341146, auto_monitor=False, string=True, write_pv='sky:m1.STUP', limits=False, put_complete=False), done=True, success=False)
This seems :+1: to me. It adds a bunch of complexity, but I think that this is the minimum complexity we can have if we are going to have background status threads like this.
This needs a rebase, and tests. (Note to self.) Otherwise it seems there is consensus for merging.
"Another set()
call is still in progress problem" in EpicsSignal.set
function seems to occur especially when it is a transition type signal.
If 1 is set to xxx:m1.STUP
, it changes in the following order: 0(OFF) -> 1 (On) -> 2 (BUSY) -> 0 (OFF)
.
xxx:m1.STUP 2021-01-14 17:10:52.570557 OFF
xxx:m1.STUP 2021-01-14 17:10:52.570557 ON
xxx:m1.STUP 2021-01-14 17:11:01.923591 BUSY
xxx:m1.STUP 2021-01-14 17:11:01.923837 OFF
Another transition type signal may change in the order of 0 -> 1 -> 0
.
test:signal 2021-01-14 16:43:21.773892 0
test:signal 2021-01-14 16:43:23.363370 1
test:signal 2021-01-14 16:43:23.364933 0
In this case, the transition time from 1 to 0 is only 1.56 ms. The while loop in the ophyd.utils.epics_pvs.set_and_wait
function seems difficult to detect this change.
On camonitor
, the set was successful, but the EpicsSignal.set
function only sees 0 values after the transition ends. Therefore, I think this type of EpicsSignal should be handled specially.
Could you please review how this solution is?
https://github.com/physwkim/ophyd/commit/2a3b1c09c62ebf9fa5eb4807ec9e32221276edf6
It would be better to use the put_complete option for the transition signal.
signal = EpicsSignal("test:signal", put_complete=True)
Thanks for pushing on this @prjemian .
I'm stuck trying to trigger AbandonedSet raised in ophyd.utils.epics_pvs.set_and_wait
Ideas?
On Thu, Jun 10, 2021, 12:12 PM Dan Allan @.***> wrote:
Thanks for pushing on this @prjemian https://github.com/prjemian .
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/pull/901#issuecomment-858801584, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARMUMHKNIOYPOTWTP37XEDTSDW6DANCNFSM4QR7ZBJQ .
I'm confused, .set()
defines the poison_pill
while .put()
does not. Will there be troubles with .put()
?
Why is the CI taking 6 hours to run?
Will there be troubles with
.put()
?
No, this is semi-internal. It operates one layer below Status objects so it should not have anything to do with "poison pill".