A few questions/issues
First of all, thanks for the great library. This has been great to use so far. When I was checking it out I ran into several situations, in which I wonder if they're bugs or if it's my limited experience with this library. I thought I'd put them in one issue here, but I can turn them into different issues if desired.
The most important one was a case which I ran into where I was posting callbacks on an event loop in another thread. This just hang my whole client after a while. I tried creating a minimally reproducing example which I attached here.
Minimal example test_client.zip
It's just a very simple client/server. But the client will hang on the last request to:
# run something on the pycapnp thread
# now this one will hang
s.run_on_loop(run_test_on_loop(s.tif, 'test01'))
Probably there's something weird going on here with the threading, but I made very sure that everything capnp related is only done in one single thread. That's why it's posting callbacks on the event loop. So I'm a bit at a loss here what's causing this. When I terminate the client (when it hangs), this is the error:
2020-04-15 21:50:46,012 [client ] [Thread-1 ] background thread started
2020-04-15 21:50:46,013 [asyncio ] [Thread-1 ] Using selector: EpollSelector
2020-04-15 21:50:46,049 [client ] [Thread-1 ] after login ( test = <external capability> )
2020-04-15 21:50:46,068 [client ] [Thread-1 ] onEvent [<test_if_capnp:TestStruct reader (a = 1, b = ["test1", "ef", "test1"], c = 2, d = 3, e = vala)>]
2020-04-15 21:50:46,084 [client ] [Thread-1 ] onEvent [<test_if_capnp:TestStruct reader (a = 1, b = ["test1", "ef", "test1"], c = 2, d = 3, e = vala)>, <test_if_capnp:TestStruct reader (a = 2, b = ["test2", "ef", "test2"], c = 2, d = 3, e = vala)>]
2020-04-15 21:50:46,528 [client ] [Thread-1 ] onEvent [<test_if_capnp:TestStruct reader (a = 1, b = ["test1", "ef", "test1"], c = 2, d = 3, e = vala)>, <test_if_capnp:TestStruct reader (a = 2, b = ["test2", "ef", "test2"], c = 2, d = 3, e = vala)>, <test_if_capnp:TestStruct reader (a = 3, b = ["test00", "ef", "test00"], c = 2, d = 3, e = vala)>]
Traceback (most recent call last):
File "/home/florisvannee/workspace/optibook/examples/test_client.py", line 120, in <module>
s.run_on_loop(run_test_on_loop(s.tif, 'test01'))
File "/home/florisvannee/workspace/optibook/examples/test_client.py", line 85, in run_on_loop
ret = fut.result()
File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 434, in result
self._condition.wait(timeout)
File "/usr/local/lib/python3.8/threading.py", line 302, in wait
waiter.acquire()
KeyboardInterrupt
terminate called after throwing an instance of 'kj::ExceptionImpl'
what(): kj/async.c++:317: failed: expected head == nullptr; EventLoop destroyed with events still in the queue. Memory leak?; head->trace() = kj::_::ChainPromiseNode
kj::_::TransformPromiseNode<kj::Promise<void>, kj::_::Void, capnp::TwoPartyVatNetwork::OutgoingMessageImpl::send()::{lambda()#1}, kj::_::PropagateException>
kj::_::EagerPromiseNode<kj::_::Void>
stack: 7f686fcf479f 7f686fb27cf5 44844d 4634c4 473524 504478 519bfb 42c46c 7f687a740504 42b263
Second, some small issues/questions:
- In callbacks, the parameters that are passed only have the lifetime of that method. That's why, in the example I provided in
TestFeed, I do acopy.deepcopybefore I store them in a list. However, iscopy.deepcopysupposed to work all the time? I've had cases where doing the deepcopy would throw, or where the deep copy would succeed, but accessing the object later would throw. Is there any way to just turn the capnp object into a pure Python object for further processing / storing? Unfortunately I don't have a small reproducible example of this. If I can get one I'll post it. - In callbacks (like the one in
TestFeed), if something throws, the error is not logged in the client (funnily enough, it is sent to the server which reports it). Ideally, I'd want exceptions in the callbacks to just throw (or at least log) exceptions in the client. Is this possible? - Callbacks (or any server implementation methods) are by default synchronous. It's not possible to make them
async. I think this would be a nice feature to have. Eg. intest_server.pyI now have aasyncio.tasks.ensure_futurecall which would be able with anawaitif it was anasyncmethod itself.
Thanks for the effort put into making this library what it is! :-)
I gave this a try with the gdb debugger but can't figure out what's going wrong. It looks like it's reaching the capnp code where it sends the last message in:
RemotePromise<DynamicStruct> Request<DynamicStruct, DynamicStruct>::send()
But somehow this still never gets properly sent, because the server never receives this message. I'm a bit at a loss unfortunately.
I'm not the original author, but I have seen some discussions around related to both pycapnp and capnp that indicate deepcopy (and copying objects) may not behave as you'd expect.
I'm conjecturing, but there's likely a bunch of state included with the datastructures. So if you want to keep it around you'll want to copy the data itself rather than the object. There might be an easier way to do this though.
Thanks! I believe I can answer some of these questions myself now:
- While
deepcopyindeed doesn't work very well, there is actually ato_dictmethod on the reader objects that convert them to plain Python dictionaries. However, the implementation has a bug that leads to an infinite recursive call. I'll create a PR that fixes this. - The hanging when doing the callback on a different thread, was because the result of the call was still an internal library object and this result was passed back to the calling thread. This is something tha the library cannot handle as somehow copying/destruction or so is not handled well in this case. This leads to a promise never resolving. It's fixed by not passing the capnp result, but instead the raw dictionary obtained from the
to_dictcall.
Please have a look at https://github.com/capnproto/pycapnp/pull/213 if you time