ophyd icon indicating copy to clipboard operation
ophyd copied to clipboard

Define last_status_object property in Signal class

Open prjemian opened this issue 3 years ago • 15 comments

FIX #993 by adding a last_status_object property (to provide access to the internal _status attribute, if it exists).

prjemian avatar Jun 04 '21 02:06 prjemian

APS USAXS is interested in this PR.

prjemian avatar Jun 04 '21 02:06 prjemian

It's a "semi-private" attribute that isn't really intended to be accessed external to the signal.

Is there a way your external code could be restructured to instead use the return value of set()?

klauer avatar Jun 04 '21 16:06 klauer

Another call to set() in progress is the underlying issue, so no, unless there is a way to resolve that problem by other means. This has been the recommended path so far.

Today, with the USAXS instrument, we discovered that a default of Status() is better than None, and then call .set_finished() so it appears to have been successful.

The alternative is that we define this immediately after defining any Signal (or subclass) objects where we expect to find _status.

On 6/4/2021 11:06 AM, Ken Lauer wrote:

It's a "semi-private" attribute that isn't /really/ intended to be accessed external to the signal.

Is there a way your external code could be restructured to instead use the return value of |set()|?

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/pull/994#issuecomment-854839736, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARMUMGMYBU4LZE55BQNJ2TTRD2WTANCNFSM46B4KAFQ.

prjemian avatar Jun 04 '21 16:06 prjemian

Some related thoughts (no real opinion on the PR itself):

I think I'd find a public api to get the "most recent status" returned by a settable device useful.

Does calling set_finished on ._status "unstick" a device that is stuck with "Another call to set() in progress", including cleaning up the thread? That's a useful trick to know if so. Maybe there should also be a public API for cancelling a move? Personally I've switched to always defining short timeouts instead, but I understand this isn't possible for all use cases.

ZLLentz avatar Jun 04 '21 16:06 ZLLentz

In this case, we are not certain of the underlying cause as it only seems to present the Another call to set() in progress exception after a long sequence of repeated scans; otherwise identical scans at that. (I'm suspicious of some unreported resource consumption.) I do not believe that calling the set_finished() method of a Status() object will clear the timeout after the timeout has elapsed. With research, I can dig up the issues where we explored the timeout situation here in ophyd related to Another call to set() in progress.

While it is a "private" attribute, here is an example of beam time lost since it was defined only because set() was called. What would be the intent of making this status object an attribute of the class: https://github.com/bluesky/ophyd/blob/89baea23125939b55cf20046fd5337fcb3b73fbd/ophyd/signal.py#L324 if such external access was not intended? The caller receives this object: https://github.com/bluesky/ophyd/blob/89baea23125939b55cf20046fd5337fcb3b73fbd/ophyd/signal.py#L328

prjemian avatar Jun 04 '21 18:06 prjemian

@ZLLentz : In this case, the use is to watch for completion of a fly scan operated by an EPICS database. self.flying is the ophyd.Signal, set to True when pushing the "Go" button (PV) and then set to to False when EPICS says the fly scan is complete.

The fly scan takes a user-selectable time (60-240 s) and is allowed an additional 100s if a new trajectory needs to be loaded from EPICS to the motor controller (which is the low-level controller conducting the fly scan).

prjemian avatar Jun 04 '21 18:06 prjemian

There is a significant difference between making something part of the public API in terms of attribute names, and returning the object itself in a method. Lost beam time is also an entirely different story.

._status should not be relied upon externally. Adding a get_last_status_object() or a .last_status_object property to the supported API would be a better start to such a discussion.

In your current case, the burden is on the caller to ensure that the internals are as expected. That is to say, your code should just read getattr(obj, "_status", None).

Maybe there should also be a public API for cancelling a move? Personally I've switched to always defining short timeouts instead, but I understand this isn't possible for all use cases.

I think the original intention was .stop() could be used even in non-positioners. Perhaps we should be implementing it here.

klauer avatar Jun 04 '21 18:06 klauer

I'm good with exposing this via a public property. I'll modify the PR.

prjemian avatar Jun 04 '21 19:06 prjemian

Lots of TimeoutError exceptions raised for area detector PVs for py36 testing: not related to this PR.

@klauer : This is ready to review again.

prjemian avatar Jun 07 '21 22:06 prjemian

Note, only the py36 (push) tests failed, not the py36(pull_request) tests. Re-running all CI jobs, just in case.

prjemian avatar Jun 07 '21 22:06 prjemian

This time on py38 (push) only. Inconsistency is on the CI server? Run again.

prjemian avatar Jun 07 '21 23:06 prjemian

Continued CI failures due to TimeoutError and EPICS area detector features -- not related to this PR.

prjemian avatar Jun 08 '21 02:06 prjemian

I’m late to this discussion, but in October we (@prjemian, @ZLLentz, and I) developed a solution to “Another set in progress” that has been approved. It just needs a rebase and a test.

https://github.com/bluesky/ophyd/pull/901#issuecomment-759019053

Reaching into the status object and calling set_finished is a clever solution, but I prefer where we landed in #901: a new method clear_set() that explicitly says “we’re canceling this because the part of the code that was supposed to mark it as either finished or errored never did so.” I like that it can issue a warning with specifics, so that if it happens often and perhaps one or two layers down in the code, there is a way for controls staff to become aware of a recurring issue.

If we could nudge #901 across the finish line, would it solve your problem, @prjemian?

danielballan avatar Jun 08 '21 11:06 danielballan

I agree with @danielballan , #901 is a better solution to this problem and points people at the actual problem (that a set method is broken) while still providing a "get-on-with-life" escape hatch.

tacaswell avatar Jun 08 '21 14:06 tacaswell

@danielballan Thanks, big time, for picking that up. I'll pivot to #901. If that resolves the OP (https://github.com/APS-USAXS/ipython-usaxs/issues/501, where a short-term work-around was implemented), then I can drop this PR.

prjemian avatar Jun 08 '21 14:06 prjemian

dropping this branch & PR

prjemian avatar Mar 29 '23 21:03 prjemian