storm crashes when a print command is issued in a bolt process function

Open sapristi opened this issue 6 years ago • 6 comments

Hello, I noticed that Storm crashes when a print is issued in a bolt's process function (or by any function or object init method called from the process function).

Here is the first line of the error that appears : java.io.IOException: Unexpected character (b) at position 0. at org.apache.storm.multilang.JsonSerializer.readMessage(JsonSerializer.java:172) ~[storm-core-1.2.1.jar:1.2.1]

To replicate the error, just add a print statement in the process function of the wordcount.py bolt in the wordcount quickstart project.

What is strange is that when the error happened to me, the bolt would run multiple times before crashing, seemingly randomly.

Here is the full error log : log.txt

sapristi avatar May 22 '18 19:05 sapristi

Hi @sapristi! The issue here is that Storm uses stdin and stdout to communicate between Java and Python processes. Your print call wrote data to stdout in a format Java does not expect, resulting in the traceback.
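To illustrate why a stray print is fatal, here is a minimal, self-contained simulation. It assumes the multilang framing in which each message is a JSON document followed by a line containing only `end`. The `read_message` helper is hypothetical and only mimics the reading side; it is not Storm's actual JsonSerializer.

```python
import io
import json


def read_message(stream):
    """Collect lines up to the 'end' marker and parse them as one JSON message."""
    lines = []
    for line in stream:
        line = line.rstrip("\n")
        if line == "end":
            break
        lines.append(line)
    return json.loads("\n".join(lines))


# A well-formed message, as the Python side is supposed to emit it.
clean = io.StringIO('{"command": "ack", "id": "1"}\nend\n')

# The same kind of message, preceded by the output of a stray print().
polluted = io.StringIO('building graph...\n{"command": "ack", "id": "2"}\nend\n')

print(read_message(clean))

try:
    read_message(polluted)
except json.JSONDecodeError as exc:
    # The parser chokes on the first character of the printed text,
    # much like the "Unexpected character (b) at position 0" error above.
    print("parse failed:", exc)
```

The reader has no way to tell protocol messages from debug output, so anything written to stdout that is not a valid message breaks the exchange.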

Pystorm (a project Streamparse uses to implement components in Python) attempts to work around the stdout print problem, and that workaround should already have been initialized by the time the process function ran: https://github.com/pystorm/pystorm/blob/3c91e04675b3d2a2d1facb6c19908b82b018086a/pystorm/component.py#L279-L282. What version of Python did you attempt to run on?

In general we recommend you use the self.logger object provided by the spouts and bolts, as seen here: https://github.com/Parsely/streamparse/blob/master/examples/redis/src/bolts.py#L23.

codywilbourn avatar May 22 '18 20:05 codywilbourn

Thanks for your answer. The error arose because I was using personal libraries that issued print statements. I am using Python 3.6.

The error did happen after the bolt successfully processed multiple tuples :

My bolt issued multiple logs of the form 28389 [Thread-56] INFO o.a.s.t.ShellBolt - ShellLog pid:22408, name:graph_bolt 2018-05-22 14:44:40,998 - pystorm.component.graph_bolt - produced graph :

before crashing with 29756 [Thread-56] ERROR o.a.s.t.ShellBolt - Halting process: ShellBolt died. Command: [streamparse_run, -s json bolts.GraphBolt.GraphBolt], ProcessInfo pid:22408, name:graph_bolt exitCode:-1, errorString: java.io.IOException: Unexpected character (n) at position 0.

(this is just for information, I understand the problem will not be fixed now :)

sapristi avatar May 22 '18 20:05 sapristi

Here is a full log if that can help : log.txt

sapristi avatar May 22 '18 20:05 sapristi

@sapristi How did you solve it, please?

tianser avatar Apr 17 '19 10:04 tianser

@tianser I don't remember at all, but from what I read in my code I just had a switch to disable logs. Also codywilbourn seems to point to a possible solution.

sapristi avatar Apr 17 '19 19:04 sapristi

@sapristi Bolts read the stream from the Storm spout, so we had to do it like this:

`# msg_dict = json.loads(word.encode("utf-8"))  # ERROR`

`word_utf = word.encode("utf-8")`
`msg_dict = json.loads(word_utf)`

tianser avatar Apr 18 '19 07:04 tianser