metakernel icon indicating copy to clipboard operation
metakernel copied to clipboard

Error method is noncompliant to Jupyter protocol

Open JMurph2015 opened this issue 7 years ago • 39 comments

I posted this issue on the octave_kernel repo, but after digging into the source, I discovered it may be generic to all kernels based on Metakernel. So here's the issue description again: So when a kernel errors out, you should be sending back an execute-reply like this:

{
    #...
    'content':{
        'status':'error',
        'ename':<your_error_name>,
        'evalue':<your_error_data>,
        'traceback':<your_traceback>
    }
    #...
}

That way Jupyter can process that as a proper error and not just assume it is any old STDOUT. Right now it would be a the easiest (?) fix to just rename what you are already returning to 'traceback' and insert some dummy values for the fields that you can't get easily.

It would be really awesome if this were fixed because it's breaking my use case!

JMurph2015 avatar Jul 11 '17 23:07 JMurph2015

Thanks for the report! Can you give a bit more detail on your suggested fix? What do you mean by renaming "what you are already returning"? Any chance of actually making a PR?

dsblank avatar Jul 12 '17 13:07 dsblank

See also fixes and discussion here: https://github.com/Calysto/metakernel/issues/147

dsblank avatar Jul 12 '17 13:07 dsblank

Oops, I saw that #147 was actually closed, but please take a look at the comments there.

TL;DR Metakernel needs to either modify pexpect some more, or use the fd oriented class of pexpects. This is really really breaking to me, pls help.

JMurph2015 avatar Jul 12 '17 16:07 JMurph2015

Ok, let's try to nail down exactly what the issue is, and what should be done. Are you saying that if a kernel prints on stderr, should that be interpreted and returned as an error reply?

dsblank avatar Jul 12 '17 16:07 dsblank

This is a fundamental limitation of using pexpect: pty does not distinguish between stderr and stdout. Using raw fds is a road fraught with peril due to locking. I don't see a way to support this use case.

blink1073 avatar Jul 12 '17 17:07 blink1073

That's the idea. At least add that option to Metakernel and flag octave_kernel to use that option. I can't exactly ensure that if any kernel prints to stderr, that it will always be a halting error, but octave at the very least will print r".+ error: .+" if there is a halting error. (sorry that regex may be a little wrong).

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

And it's not a use case. It's a fundamental feature of being a Jupyter kernel. (that also happens to be my use case.)

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

I understand, but it is also a fundamental limitation of talking to a spawned process from python. The bash kernel has the same limitation.

blink1073 avatar Jul 12 '17 17:07 blink1073

It seems like we could at least make any pexpect kernel print out a special identifier that would signal that an error has occurred, right?

dsblank avatar Jul 12 '17 17:07 dsblank

Can Pexpect be modified to distinguish between the two (at least optionally)?

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

I've thought about some kind of API that would allow pexpect send back image data direct through some kind of mark-up codes. Seems like the same could be done to signal an error.

dsblank avatar Jul 12 '17 17:07 dsblank

And I guess the problem is that Metakernel at the moment can't even distinguish if there was an error because it's using Pexpect that presently doesn't give information about which stream some info came from. If it did that, I think the proper place for the rest of the necessary modifications would be in Metakernel.

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

@dsblank, it could be possible to thread something like that through. @JMurph2015, the pty module in core python is the one with the limitation.

blink1073 avatar Jul 12 '17 17:07 blink1073

hmmm, but Popen processes can have their streams distinguished iirc. Of course you might not be remotely based on it, but how do they do it?

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

https://docs.python.org/3/library/subprocess.html#subprocess.CompletedProcess.stderr

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

Yes, it is easy to read it after the fact. Reading from both stderr and stdout incrementally is the problem.

blink1073 avatar Jul 12 '17 17:07 blink1073

Also, without a pty you can't interrupt the Octave process, so you'd be losing that capability of the kernel as well.

blink1073 avatar Jul 12 '17 17:07 blink1073

So this may be a python core issue (assuming pty is an absolute dependency) but there shouldn't be a super good reason something generically communicating with another process shouldn't expose all streams (since they are all just file descriptors anyway)

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

/ the cpython pty module is like 170 lines long. It wouldn't be incredibly difficult for someone (that someone might be me today) modify it and make a new module, perhaps superpty (or maybe pity 😆 ) that supports segregating stdout and stderr. Edit: typo Update: Would anyone actually use said modifications? Because if Pexpect doesn't at least optionally enable it, then it's kinda pointless for me to make the module.

JMurph2015 avatar Jul 12 '17 17:07 JMurph2015

Hi! Bump on that question of whether or not the stack could pull support down from a hypothetical segregated fd pty module.

JMurph2015 avatar Jul 12 '17 20:07 JMurph2015

That would mean that pexpect would have to depend on an extension module, or we'd also have to replicate much of what pexpect is doing and depend on an extension module. I am :-1: on the idea.

blink1073 avatar Jul 12 '17 20:07 blink1073

It might perhaps depend on how much code we'd have to rely on that was outside of the standard libraries. But like @blink1073 suggests, it would not be good to fork a big chunk of code. Can be done by subclassing? Is it smallish in size? Is it something that python standard library would want to include?

dsblank avatar Jul 12 '17 20:07 dsblank

200 lines of code presently. It's unclear how much of Pexpect would need refactoring. The options I put in are entirely optional arguments, but to actually use segregated streams would definitely change at least a few things on Pexpect's side.

JMurph2015 avatar Jul 12 '17 20:07 JMurph2015

The pty core module itself is relatively tiny at 170 lines total because it is at could be described as some convenience functions that just garble the stdin, stdout, and stderr file descriptors together.

JMurph2015 avatar Jul 12 '17 20:07 JMurph2015

I'd say make a PR and let's check it out.

dsblank avatar Jul 12 '17 21:07 dsblank

@blink1073 what are the odds something like a ptyplus module could find a home in pexpect, which would do this segregated reading so that any code pexpect depends on is either in base or in pexpect?

JMurph2015 avatar Jul 12 '17 23:07 JMurph2015

/ sorry for the continued discussion here, but I'm still formulating who, what, and where these changes are going to be made (best to go in with a plan, right?). I think I can even enable pexpect's standard "read all" mode by using some trickery with piping things written onto a separate stderr to a "unified" fd that combines the three just like old times. The tricky part then is say you tell pexpect to read until EOF on the "unified" fd, then how does one know how much of that came from the stderr fd and how much of that came from the stdout fd. So I'm about halfway to a comprehensive solution. Also there are other ways the reading problem could be addressed, like reading each incrementally and waiting for both to hit EOF or some combination of conditions.

JMurph2015 avatar Jul 12 '17 23:07 JMurph2015

Or you could have it read from the unified stream and check if each chunk it reads is the next thing on the top of a stream-specific fd. edit: clarity

JMurph2015 avatar Jul 12 '17 23:07 JMurph2015

pexpect would have to be refactored to add an optional handling of stderr if that enhanced module were available. I'm not sure how it would interact with the .expect() method. We'd be expecting a prompt, but would need to handle an out-of-band stderr and potentially bail.

blink1073 avatar Jul 13 '17 12:07 blink1073

@JMurph2015 Are you sure that this is the right approach? What about a pexpect-based kernel that doesn't write to stderr? It still needs a method to signal an error. I'm still wondering about a special text signal that could signal an error, and even an API for routing text to stderr in a jupyter frontend, even if it didn't come from stderr in the external process.

dsblank avatar Jul 13 '17 13:07 dsblank