cdo-bindings icon indicating copy to clipboard operation
cdo-bindings copied to clipboard

Fix signal interruption issues and related minor changes

Open lukelbd opened this issue 2 years ago • 13 comments

This seems to fix the issue where kill/interrupt signals outside of the cdo binding are suppressed mentioned in #26 and #43.

It also fixed a few quick-to-repair issues I came across during debugging:

  • I fixed a basic usage issue with recent versions of cdo (version 2.0.0+?). It seems some versions of cdo print the -V version info to stdout rather than stderr, which triggered an error when trying to initialize Cdo() because the binding was trying to parse the empty stderr output proc.communicate()[1]. I fixed this for all versions by always sending the output to stdout (using stderr=subprocess.STDOUT instead of stderr=subprocess.PIPE).
  • I fixed #38 so that help info is shown (seems __doc__ has to be overridden on the instance-level rather than the class-level). I also removed the unused (outdated?) auto_doc function and added __name__ = method_name so that the help prompt for e.g. help(cdo.merge) shows merge(...) rather than <cdo.Cdo.__getattr__.<locals>.Operator object>.
  • I fixed subcommand autocompletion, which wasn't working before. Now when I type cdo.mer, rather than nothing happening, the error message is AttributeError: Unknown operator 'mer'! Did you mean: meravg, merge, mergegrid, mergetime, merkurt, mermax, mermean, mermedian, mermin, merpctl, merrange, merskew, merstd, merstd1, mersum, mervar, mervar1?.

Here's a basic example, hitting Ctrl+C after a couple seconds:

In [1]: import time
   ...: from cdo import Cdo
   ...:
   ...: #cdo = Cdo()
   ...: while True:
   ...:     print("hi")
   ...:     time.sleep(1)
hi
hi
^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Input In [1], in <module>
      5 while True:
      6     print("hi")
----> 7     time.sleep(1)

KeyboardInterrupt:

Here's after initializing the binding, before this PR:

In [3]: import time
   ...: from cdo import Cdo
   ...:
   ...: cdo = Cdo()
   ...: while True:
   ...:     print("hi")
   ...:     time.sleep(1)
hi
^Ccaught signal <cdo.Cdo object at 0x10e6f88b0> 2 <frame at 0x7fcca6031670, file '<ipython-input-3-873ff4417816>', line 7, code <module>>
hi
^Ccaught signal <cdo.Cdo object at 0x10e6f88b0> 2 <frame at 0x7fcca6031670, file '<ipython-input-3-873ff4417816>', line 7, code <module>>
hi
^Ccaught signal <cdo.Cdo object at 0x10e6f88b0> 2 <frame at 0x7fcca6031670, file '<ipython-input-3-873ff4417816>', line 7, code <module>>

Here's after initializing the binding, after this PR:

In [1]: import time
   ...: from cdo import Cdo
   ...: cdo = Cdo()
   ...: while True:
   ...:     print("hi")
   ...:     time.sleep(1)
hi
hi
^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Input In [1], in <module>
      4 while True:
      5     print("hi")
----> 6     time.sleep(1)

File ~/forks/cdo-bindings/python/cdo.py:664, in Cdo.__catch__(self, signum, frame, default)
    662 self.tempStore.__del__()
    663 if callable(default):
--> 664     default()
    665 else:
    666     print("caught signal", self, signum, frame)

KeyboardInterrupt:

lukelbd avatar Apr 27 '22 23:04 lukelbd

Closes #26, closes #38, closes #43

lukelbd avatar Apr 27 '22 23:04 lukelbd

I guess this also just re-implements the stdout fix mentioned in #39 on the master branch.

lukelbd avatar Apr 28 '22 01:04 lukelbd

I've also added an auto-adjusting __qualname__ so that e.g. typing cdo.merge in the repr shows <cdo.Cdo.merge at 0x1795d0d00>, typing cdo.copy.merge shows <cdo.Cdo.copy.merge at 0x17975f910>, etc. Previously typing either of these in the repr showed something like <cdo.Cdo.__getattr__.<locals>.Operator at 0x1796ed250>.

lukelbd avatar May 03 '22 04:05 lukelbd

I was running into some strange errors when __catch__ tried to call the default signal handler inside of chained operators, so have decided to move the signal handling from Cdo to CdoTempfileStore and add an optional tempStore keyword to Cdo. Now when Operator.__get__ is invoked the same tempStore initially generated by the Cdo instance is used. I also removed the (it seems to me) unnecessary __tempdirs object in order to simplify CdoTempfileStore.__del__.

I think moving the signal.signal(...) stuff to the temp file handler and calling it only once per Cdo session should eliminate the errors I was getting. But haven't been able to reproduce them so can't be sure.

lukelbd avatar May 04 '22 01:05 lukelbd

And here's a final style tweak that removes alignment-style whitespace in compliance with pep8 and simplifies syntax by removing unnecessary parentheses arounf if statements and removing unnecessary invocation of dict.keys() (since by default things like in dict and for x in dict already operate on dictionary keys).

lukelbd avatar May 04 '22 01:05 lukelbd

So I've been working from this branch for my research and haven't run into any problems since that signal change. @Try2Code do you have any comments / anything I should change before merging? Also sorry for lumping in a few different things in one PR I was just trying to get these done quickly, it was a bit easier to use one branch.

lukelbd avatar May 04 '22 22:05 lukelbd

hi! do you use the master branch for your stuff?

Try2Code avatar May 05 '22 07:05 Try2Code

You mean the cdo-bindings/master branch? Yep I do! This branch is forked from master.

lukelbd avatar May 05 '22 16:05 lukelbd

@Try2Code Not sure if I understood your question correctly? Let me know if there's anything else I should change. Have been using this branch without issues so far.

lukelbd avatar May 22 '22 00:05 lukelbd

my point is that the master branch currently does not represent the official release. Some concepts are in the master, which I need to work on further more. I need to know where to put your change - that's why I was asking what u use

Try2Code avatar May 23 '22 07:05 Try2Code

Gotcha -- yep these changes are solely on top of your master branch, so they belong in master.

lukelbd avatar May 23 '22 21:05 lukelbd

I created this separate branch fix-signals off of the most recent version of your master branch so that it describes the purpose of this PR, in accordance with the standard github fork-pull request workflow.

lukelbd avatar May 23 '22 21:05 lukelbd

@Try2Code Just checking in, any interest in still merging this or things I should add?

lukelbd avatar Aug 10 '22 22:08 lukelbd

hi @lukelbd !

sorry for not responding for such a long time. I started a merge of master and the maintenance-1.5.x branch to get rid of these two development lanes in the summer, but did not manage to finish it, yet.

You provide a lot of interesting fixes and the behavior seem to be a lot better with them. I have to admit, that I am still not sure about the usefulness of something like cdo.copy.merge().... - it looks good, but combining operators with different number of input and output streams is not easily covered by that syntax.

My biggest concern is the movement of the signal handler to the tempstore. The handling of tempfiles has essentially nothing to do with signal handling. It's only useful property is that there is only a single tempstore. I believe this is the main thing needed for proper signal handling - do it once, only. Class methods might be better for that :thinking:

I need to test the handling with jupyterhub, too. In this environment ppl might have jobs running without knowing so that a queue manager like cron or slurm might end up canceling them.

Try2Code avatar Feb 21 '23 11:02 Try2Code

No worries! That's how open source goes sometimes (I'm often just as delayed).

Re: chained operators, note that statements like cdo.copy.merge() or cdo.copy.add() were already implemented -- I didn't add functionality. I just made it so that typing subcommands / chained commands into a REPL shows the subcommand or chained command name instead of a strange / uninformative default object name. So now we have:

>>> cdo.copy.add
<cdo.Cdo.copy.add at 0x17975f910>

...instead of:

>>> cdo.copy.add
<cdo.Cdo.__getattr__.<locals>.Operator at 0x1796ed250>

Re: moving signal handler to tempstore -- my main motivation is that the only function of cdo's signal handling seems to be removing tempstore files, so it made sense to me to move it there. Also, previously, a CdoTempfileStore was created every time a nested command was called and instantiated (since __init__ did not accept a tempStore keyword)...

https://github.com/Try2Code/cdo-bindings/blob/ca5928bb9b37083c48694773a3e1bc476fa89c17/python/cdo.py#L572-L579

https://github.com/Try2Code/cdo-bindings/blob/ca5928bb9b37083c48694773a3e1bc476fa89c17/python/cdo.py#L214-L236

...so before, there were many tempStore instances. I think temporary file deletion might have still worked because the same tempdir was used with each one, although the signal handler was overwritten every time a cdo object was created, which IIRC caused other problems when trying to debug this issue (hence the change).

Anyway is that justification valid? I think the main idea is we want to both 1) create a temporary file-handling class tempStore only once, and 2) implement temporary file-handling signal overrides only once... and do both of these only when the first/main Cdo instance is created. Which this PR seems to accomplish. Not sure I understand the threading concerns though... isn't this still ok since I still check for current_thread() is main_thread()?

lukelbd avatar Mar 12 '23 21:03 lukelbd

@lukelbd thx for the explanations and your indulgence with me being always late

Try2Code avatar Mar 29 '23 10:03 Try2Code