storm icon indicating copy to clipboard operation
storm copied to clipboard

[STORM-3702] Change thrift exception classes to contain getMessage() method

Open bipinprasad opened this issue 4 years ago • 9 comments

What is the purpose of the change

*Make TException classes consistent with Exception class and eventually remove Wrapped exception classes.

How was the change tested

Recompile and run tests

bipinprasad avatar Sep 17 '20 20:09 bipinprasad

Since it is possible to break backwards compatibility, I am more conservative on this and would like more tests to be done.

(1) Did we get a chance to test with old clients? (2) And did we get a chance to test python code (client and topology code written in python) since this PR added storm_exception.py? (3) Will it break backwards compatibility for server-side plugins if they invoke old get_msg method? (4) Or is it possible that some of the topology code invokes some "exception.get_msg" method in bolt/spout, and when the topology code runs on the upgraded cluster, throwing NoSuchMethod error?

For (3-4), my question is more like: if there is any case like these cases. I am currently not sure.

Thanks

Ethanlm avatar Sep 18 '20 14:09 Ethanlm

(1) No new tests have been added. Following code that uses get_msg() was changed as part of this change - java: https://github.com/apache/storm/pull/3337/commits/685c81743bc4feb7a11fcbe94d1cd79e0228fd65 and - clojure (test): https://github.com/apache/storm/pull/3337/commits/5ea0ac6558244a49bf935c00c7d6d1930dbd7525 (2) No new python testing was done - other than what is already part of the test suite (if any) (3) Not sure I understand what server-side plugin is. (4) Any topology (old code) should have its own (presumably old) version of TException class from its storm-client jar. Which will have get_msg() method. The new class and the old class are binary compatible (at thrift level). So that code "should" work.

I can do a more comprehensive search of get_msg() usage and see if something was overlooked - especially python code that may not cause compile or test error without proper code coverage.

bipinprasad avatar Sep 18 '20 15:09 bipinprasad

(1) For old clients, I am thinking of users using old storm client to submit topologies to upgraded cluster.

(2) For old python code, I am thinking of

  • topologies depending on older storm version (with older storm python module)
  • older version of python client connecting to upgraded storm cluster

This can't be tested on unit tests.

(3) By server-side plugin, I meant plugins that storm users can implement for their cluster, for example, nimbus.topology.validator. These are couple of server-side plugins that users can implement. If those plugins invokes get_msg, it will not continue to work on upgraded cluster.

(4) The topology code depends on storm-client.jar as provided dependency. But when it is running on the cluster, it uses the cluster provided storm-client.jar. So a possible issue is that in the topology code, it invokes get_msg somewhere (since it depends on older storm-client.jar), but when it is running on upgraded cluster, storm-client.jar doesn't have get_msg method, and hence NoSuchMethod error.

Theoretically (3-4) could happen. But what I am currently not sure is that if anyone would implement a server-side plugin that invokes get_msg somewhere in their code or if anyone would have their topology invokes get_msg somewhere in their bolt/spout?

I am more confident with it if we can do comprehensive tests first. But this is not urgent. And I will also try to find some time to do some tests too. Thanks

Ethanlm avatar Sep 18 '20 15:09 Ethanlm

Thanks for the explanation. Only way to check item (4) is to do a "binary" check on the topology jar to test for incompatible use (i.e. use of these specific exception.get_msg() call. I need to think about how to do this properly.

bipinprasad avatar Sep 18 '20 16:09 bipinprasad

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

bipinprasad avatar Sep 18 '20 20:09 bipinprasad

@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under TException. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.

kishorvpatil avatar Sep 21 '20 13:09 kishorvpatil

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

Another question I have related to python is

since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it import storm explicitly which includes exception classes before this PR)

Ethanlm avatar Sep 22 '20 14:09 Ethanlm

@bipinprasad As discussed, I feel all the thrift API methods thould wrap these exception under TException. So we can consistently ensure message/cause is passed down to the client side. Currently sending just message is not sufficient/not follow typical exception criteria of providing cause.

I would like to learn more about this approach. The current way of passing an exact exception class (like NotAliveException) seems fine with me

Ethanlm avatar Sep 22 '20 14:09 Ethanlm

  • I did comprehensive search: grep get_msg/set_msg on **/.py, **/.java, **/*.clj. All calls are accounted for this change request.
  • So this leaves only the user topology jars, which may also include the plugins.

Another question I have related to python is

since this PR separates out storm_exception/*.py, will code like splitsentence.py https://github.com/apache/storm/blob/master/examples/storm-starter/multilang/resources/splitsentence.py#L16 work if it tries to access these exception classes? Does it need to import storm_exception explicitly? (I see it import storm explicitly which includes exception classes before this PR)

Indeed. Good point. Ideally python import should remain unchanged - i.e. just single import.

bipinprasad avatar Sep 22 '20 14:09 bipinprasad