
nbval and ipyparallel

Open FabioLuporini opened this issue 6 years ago • 8 comments

Hi, I wonder whether it's possible to use nbval for a jupyter notebook that exploits ipyparallel in combination with MPI (mpi4py).

This is the notebook I'm talking about. It's nothing special -- you can stop reading at cell 2.

  1. I'm seeing failures when running with nbval. I'm not sure yet whether the fault is mine; this still needs proper investigation (the error trace is here, starting at around line 300), so for now you might ignore this point.
  2. How can/should I use things like #NBVAL_IGNORE_OUTPUT in combination with ipyparallel's %%px magic? Both are supposed to appear at the very top of a cell.

Thanks!

FabioLuporini avatar Jun 03 '19 21:06 FabioLuporini

I'm not so sure on the parallel stuff, but the marker comments for nbval can be anywhere in the cell. You can also use cell tags instead of comments: https://nbviewer.jupyter.org/github/computationalmodelling/nbval/blob/0.9.1/docs/source/index.ipynb#Using-tags-instead-of-comments
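A minimal sketch of the two options mentioned above, written as notebook cell JSON built with the standard library only. The cell source is taken from this thread; the tag name `nbval-ignore-output` is the one used in the nbval docs linked above, and the variable names `cell_with_comment` and `cell_with_tag` are purely illustrative.

```python
import json

# Option 1: keep the %%px magic on the first line and put the nbval
# marker comment on a later line of the same cell.
cell_with_comment = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {},
    "outputs": [],
    "source": (
        "%%px --block --group-outputs=engine\n"
        "# NBVAL_IGNORE_OUTPUT\n"
        "u.data\n"
    ),
}

# Option 2: leave the cell source untouched and attach a cell tag instead
# (tag name as given in the nbval documentation).
cell_with_tag = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {"tags": ["nbval-ignore-output"]},
    "outputs": [],
    "source": "%%px --block --group-outputs=engine\nu.data\n",
}

# Show what the tagged cell looks like inside the .ipynb file.
print(json.dumps(cell_with_tag, indent=1))
```

Tags can usually also be set from the notebook UI rather than by editing the JSON, which avoids any question about where the comment has to sit relative to the cell magic.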

takluyver avatar Jun 03 '19 21:06 takluyver

thanks. I'll try this and will keep digging. Gimme another couple of days before closing the issue alright? Maybe I can report more

FabioLuporini avatar Jun 04 '19 08:06 FabioLuporini

I'm closing this for now. Thanks!

FabioLuporini avatar Jun 05 '19 11:06 FabioLuporini

Sorry, I feel like I have to reopen this issue because I don't really know how to fix it.

I keep seeing this kind of error from random cells:

Input:
%%px --block --group-outputs=engine
u.data[0, 1:-1, 1:-1] = 1.
u.data

Traceback:
Unexpected output fields from running code: {'stdout'}

Sometimes our CI is green, sometimes it's red due to one random cell failing as shown above. The traceback is always the same. This happens even in cells which are not supposed to print anything to stdout (e.g., cells only changing entries in a dictionary).

When exactly does nbval check the output of a cell? Is it possible that nbval performs the output check when one process has returned while the others have not yet, or something along these lines? I'm really at a loss. At this point, any sort of information would be greatly appreciated.

FabioLuporini avatar Jun 12 '19 21:06 FabioLuporini

nbval checks the output when the cell has finished running. This usually means that the execute_reply message has been sent on the shell channel and an idle status message has been sent on the iopub channel. It doesn't know anything specific about ipyparallel - it sends that cell to the kernel, where ipyparallel processes the %%px cell magic and does whatever it needs to do with that.
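The "collect until idle" behaviour described above can be sketched roughly as follows. This is not nbval's actual code: `collect_outputs` and `make_msg` are hypothetical helpers, the messages are hand-made dictionaries that only mimic the shape of Jupyter iopub messages, and the stream texts are invented for illustration.

```python
def collect_outputs(iopub_messages, msg_id):
    """Gather output messages for one execution until the kernel reports idle."""
    outputs = []
    for msg in iopub_messages:
        # Skip messages that belong to a different execute_request.
        if msg["parent_header"].get("msg_id") != msg_id:
            continue
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        if msg_type == "status":
            if content["execution_state"] == "idle":
                break  # the cell is considered finished here
            continue
        if msg_type in ("stream", "execute_result", "display_data", "error"):
            outputs.append((msg_type, content))
    return outputs


def make_msg(msg_type, parent_id, **content):
    """Build a dictionary shaped like a (heavily simplified) iopub message."""
    return {
        "header": {"msg_type": msg_type},
        "parent_header": {"msg_id": parent_id},
        "content": content,
    }


# Hand-made message sequence for one cell; the texts are made up.
messages = [
    make_msg("status", "abc", execution_state="busy"),
    make_msg("execute_input", "abc", code="u.data"),
    make_msg("stream", "abc", name="stdout", text="some engine output\n"),
    make_msg("execute_result", "abc", data={"text/plain": "Data(...)"}),
    make_msg("status", "abc", execution_state="idle"),
    make_msg("stream", "abc", name="stdout", text="arrives after idle\n"),
]

outputs = collect_outputs(messages, "abc")
stream_names = {content["name"] for kind, content in outputs if kind == "stream"}
print(stream_names)
```

In this sketch, any stream message that shares the cell's parent `msg_id` and arrives before the idle status counts as output of that cell, whatever produced it. A stored notebook without a stdout entry for that cell would then no longer match.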

I can't see any obvious reason why that cell would behave randomly. But I'm not super familiar with ipyparallel.

takluyver avatar Jun 13 '19 08:06 takluyver

I'm still investigating the issue.

After forking ipyparallel and nbval, I found out that the (randomly) failing cell is getting an unexpected message of type stream from the ipython kernel.

These are the messages received on iopub while processing the failing cell; the third one is the "unexpected" one.

I have no idea why sometimes this bug appears and sometimes not.

I should add that it always seems to be the same cells that cause the failure. What they have in common is that some custom __setitem__ is being executed; see the 2nd message in the link above (note that u.data is not a numpy array, but rather a custom subclass).

Also, I can't reproduce this on my local machine (which makes debugging horribly painful); this only appears on our CI system (Azure Pipelines). I don't know if there's a timing issue somehow.

EDIT: I wonder whether this might be relevant...

FabioLuporini avatar Jun 18 '19 10:06 FabioLuporini

That issue does look potentially relevant. "got unknown result" is a message from ipyparallel when it gets a reply to a message ID which is not in self.outstanding:

https://github.com/ipython/ipyparallel/blob/6.2.4/ipyparallel/client/client.py#L766
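The bookkeeping behind that message can be illustrated with a small sketch. This is not ipyparallel's code: `ResultTracker` is a hypothetical class that only mirrors the idea of a set of outstanding message IDs, and the duplicate reply is just one assumed way a reply could end up "unknown".

```python
class ResultTracker:
    """Toy model of a client that tracks which requests still await a reply."""

    def __init__(self):
        self.outstanding = set()  # message IDs sent but not yet answered
        self.unknown = []         # replies that matched no outstanding ID

    def submit(self, msg_id):
        """Record that a request with this ID has been sent."""
        self.outstanding.add(msg_id)

    def handle_reply(self, msg_id):
        """Return True if the reply was expected, False if it was unknown."""
        if msg_id not in self.outstanding:
            # This is the situation where a real client might report
            # something like "got unknown result".
            self.unknown.append(msg_id)
            return False
        self.outstanding.remove(msg_id)
        return True


tracker = ResultTracker()
tracker.submit("req-1")
first = tracker.handle_reply("req-1")      # expected reply
duplicate = tracker.handle_reply("req-1")  # same ID again: no longer outstanding
print(first, duplicate, tracker.unknown)
```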

takluyver avatar Jun 18 '19 10:06 takluyver

yes I saw that. Just can't figure out why it sometimes appears, and sometimes not

FabioLuporini avatar Jun 18 '19 11:06 FabioLuporini