storm
storm copied to clipboard
[STORM-3702] Change thrift exception classes to contain getMessage() method
What is the purpose of the change
*Make TException classes consistent with Exception class and eventually remove Wrapped exception classes.
How was the change tested
Recompile and run tests
Since it is possible to break backwards compatibility, I am more conservative on this and would like more tests to be done.
(1) Did we get a chance to test with old clients?
(2) And did we get a chance to test python code (client and topology code written in python) since this PR added storm_exception.py?
(3) Will it break backwards compatibility for server-side plugins if they invoke old get_msg
method?
(4) Or is it possible that some of the topology code invokes some "exception.get_msg" method in bolt/spout, and when the topology code runs on the upgraded cluster, throwing NoSuchMethod error?
For (3-4), my question is more like: if there is any case like these cases. I am currently not sure.
Thanks
(1) No new tests have been added. Following code that uses get_msg() was changed as part of this change - java: https://github.com/apache/storm/pull/3337/commits/685c81743bc4feb7a11fcbe94d1cd79e0228fd65 and - clojure (test): https://github.com/apache/storm/pull/3337/commits/5ea0ac6558244a49bf935c00c7d6d1930dbd7525 (2) No new python testing was done - other than what is already part of the test suite (if any) (3) Not sure I understand what server-side plugin is. (4) Any topology (old code) should have its own (presumably old) version of TException class from its storm-client jar. Which will have get_msg() method. The new class and the old class are binary compatible (at thrift level). So that code "should" work.
I can do a more comprehensive search of get_msg() usage and see if something was overlooked - especially python code that may not cause compile or test error without proper code coverage.
(1) For old clients, I am thinking of users using old storm client to submit topologies to upgraded cluster.
(2) For old python code, I am thinking of
- topologies depending on older storm version (with older storm python module)
- older version of python client connecting to upgraded storm cluster
This can't be tested on unit tests.
(3) By server-side plugin, I meant plugins that storm users can implement for their cluster, for example, nimbus.topology.validator
. These are couple of server-side plugins that users can implement. If those plugins invokes get_msg
, it will not continue to work on upgraded cluster.
(4) The topology code depends on storm-client.jar
as provided dependency. But when it is running on the cluster, it uses the cluster provided storm-client.jar
. So a possible issue is that in the topology code, it invokes get_msg
somewhere (since it depends on older storm-client.jar
), but when it is running on upgraded cluster, storm-client.jar
doesn't have get_msg
method, and hence NoSuchMethod error.
Theoretically (3-4) could happen. But what I am currently not sure is that if anyone would implement a server-side plugin that invokes get_msg
somewhere in their code or if anyone would have their topology invokes get_msg
somewhere in their bolt/spout?
I am more confident with it if we can do comprehensive tests first. But this is not urgent. And I will also try to find some time to do some tests too. Thanks
Thanks for the explanation. Only way to check item (4) is to do a "binary" check on the topology jar to test for incompatible use (i.e. use of these specific exception.get_msg() call. I need to think about how to do this properly.
- I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
- So this leaves only the user topology jars, which may also include the plugins.
@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under TException
. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.
- I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
- So this leaves only the user topology jars, which may also include the plugins.
Another question I have related to python is
since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it import storm
explicitly which includes exception classes before this PR)
@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under
TException
. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.
I would like to learn more about this approach. The current way of passing an exact exception class (like NotAliveException) seems fine with me
- I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
- So this leaves only the user topology jars, which may also include the plugins.
Another question I have related to python is
since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it
import storm
explicitly which includes exception classes before this PR)
Indeed. Good point. Ideally python import should remain unchanged - i.e. just single import.