[Bug] RPC Server Python Exception Can't Be Sent to RPC Client
After upgrading to v0.15.0, we found a bug in RPC: if the RPC server raises an exception, the error message is not visible in the RPC client.
After some investigation, I found that this happens only when the exception is raised by a Python packed function. In the experiment below, the C++ exception is thrown to the Python side correctly, but when Python re-raises it, the error message is lost: the RPC client still receives an exception, but the error message in it is empty.
If the exception occurs in a pure C++ remote packed function, the error message is sent back to the RPC client correctly.
How to reproduce?
Hard-code a raise "xxxxx" at the beginning of load_module in python/tvm/rpc/server.py, or in any other Python packed function, and then call that function from the RPC client.
@tqchen @Lunderberg Could you check whether this is related to the changes in https://github.com/apache/tvm/pull/15596? Fixing this issue is important for us because the Q1 release is coming. Thanks.
Is it relevant to this change?
This may indeed be related to #15596. @Lunderberg, it seems we need to stringify Python errors if they are caught by RPC.
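To make the "stringify" idea concrete, here is a minimal standard-library sketch, not TVM's actual RPC code. The names stringify_exception and call_and_stringify are hypothetical, and load_module here is only a stand-in for the Python packed function from the reproduction steps. It assumes the server can catch the Python exception before the reply is written.

```python
import traceback


def stringify_exception(err: BaseException) -> str:
    """Flatten an exception (type, message, stack trace) into one string."""
    return "".join(traceback.format_exception(type(err), err, err.__traceback__))


def call_and_stringify(func, *args):
    """Hypothetical server-side wrapper: returns (ok, payload).

    On failure the payload is the stringified error, which is plain text
    and therefore safe to send over any transport.
    """
    try:
        return True, func(*args)
    except Exception as err:  # broad on purpose: everything must reach the client
        return False, stringify_exception(err)


def load_module(name):
    # Stand-in for a Python packed function that fails on the server.
    raise RuntimeError("xxxxx")


ok, payload = call_and_stringify(load_module, "dummy.so")
print(payload)
```

The trade-off is that the client only gets text back, so nested errors are harder to pick apart than with a real exception object.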
Agreed. The full stack trace is in the python object, so we should be able to serialize it for RPC use. Prior to #15596, the full stack trace was embedded into the string error message, which worked with RPC transfers, but made it quite difficult to track nested levels of error messages.
I'm thinking that RPC's serialization will be the reverse of the parsing that occurs here, so that the stack trace objects can be rebuilt when received. (Though, missing the full python variable information available within a non-RPC err.__traceback__.)
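As a rough illustration of a serialize-then-rebuild round trip, the sketch below uses JSON as the wire format. This is an assumption for illustration only: it does not reproduce TVM's actual error format or parser, and serialize_error, deserialize_error, and RemoteError are hypothetical names. As noted above, local variable information from err.__traceback__ is not carried across.

```python
import json
import traceback


def serialize_error(err: BaseException) -> str:
    """Server side: encode error type, message, and stack frames as JSON."""
    frames = [
        {"filename": f.filename, "lineno": f.lineno, "name": f.name}
        for f in traceback.extract_tb(err.__traceback__)
    ]
    return json.dumps(
        {"kind": type(err).__name__, "message": str(err), "frames": frames}
    )


class RemoteError(RuntimeError):
    """Hypothetical client-side error that carries the remote frames."""

    def __init__(self, kind, message, frames):
        super().__init__(f"{kind}: {message}")
        self.kind = kind
        self.frames = frames  # list of (filename, lineno, function name)


def deserialize_error(payload: str) -> RemoteError:
    """Client side: reverse of serialize_error."""
    data = json.loads(payload)
    frames = [(f["filename"], f["lineno"], f["name"]) for f in data["frames"]]
    return RemoteError(data["kind"], data["message"], frames)


def remote_func():
    # Stand-in for a failing Python packed function on the server.
    raise ValueError("xxxxx")


try:
    remote_func()
except ValueError as err:
    wire = serialize_error(err)

rebuilt = deserialize_error(wire)
print(rebuilt, rebuilt.frames[-1])
```

Keeping the frames as structured data rather than one flat string is what would let the receiving side rebuild nested error levels.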
@Lunderberg can you help with this one?
Thank you for the ping, and I probably can in a few weeks, but am currently low on available time.