
The logstash 'json' codec still requires a newline '\n' as a delimiter to terminate json logs being sent over a TCP input.

Open · biox opened this issue on Oct 13, 2015 · 9 comments

Description: The logstash 'json' plugin still requires a newline '\n' to terminate json logs being sent over a TCP input. (UDP appears to work fine.)

This contradicts the documentation (https://www.elastic.co/guide/en/logstash/current/plugins-codecs-json.html), which states: "If you are streaming JSON messages delimited by \n then see the json_lines codec."

Reproduction:

  1. Set up a basic TCP input in logstash using the 'json' codec:
   tcp {
     type => 'import_json'
     tags => 'import_json'
     port => 2057
     codec => json
   }
  2. Using a small python script, send a test json log to your logserver:
#!/usr/bin/python

import socket
try:
        import json
except ImportError:
        import simplejson as json

logserver_ip   = '192.168.5.100'
logserver_port = 2057
json_message   = {}

json_message['message']    = 'test'
json_message['sourcetype'] = 'Appl-Test'
json_message['logfile']    = '/tmp/test.log'

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((logserver_ip, logserver_port))
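# Note: these send() calls assume Python 2, where str is bytes; on Python 3
# you must encode first, e.g. s.send((json.dumps(json_message) + '\n').encode('utf-8')).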
# s.send(json.dumps(json_message))

s.send((json.dumps(json_message) + '\n'))
s.close()

NOTE: This version of the script appends a newline to the end of each message, as described above. Substitute your own IP address/port where necessary. Call the script like so: python scriptname.py. Logs arrive properly because the json events are terminated with a newline.

  3. Adjust the script so that the newline is left out:
#!/usr/bin/python

import socket
try:
        import json
except ImportError:
        import simplejson as json

logserver_ip   = '192.168.5.100'
logserver_port = 2057
json_message   = {}

json_message['message']    = 'test'
json_message['sourcetype'] = 'Appl-Test'
json_message['logfile']    = '/tmp/test.log'

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((logserver_ip, logserver_port))
s.send(json.dumps(json_message))
# s.send((json.dumps(json_message) + '\n'))

s.close()

NOTE: Logs no longer arrive properly because they aren't terminated using a newline.

Additional Information:

The UDP input does not display this issue.
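
For comparison, a minimal sketch of the UDP case (assuming a matching udp input with codec => json listening on the placeholder IP/port from the reproduction above). Because each UDP datagram is self-delimiting, no trailing '\n' is needed:

#!/usr/bin/python

import json
import socket

logserver_ip   = '192.168.5.100'
logserver_port = 2057
json_message   = {'message': 'test', 'sourcetype': 'Appl-Test'}

# UDP: each sendto() is one datagram, so the event is framed without a newline.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(json.dumps(json_message).encode('utf-8'), (logserver_ip, logserver_port))
s.close()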

biox avatar Oct 13 '15 21:10 biox

Any update here?

biox avatar Dec 11 '15 19:12 biox

Bump, this is still a problem.

biox avatar Jan 13 '16 21:01 biox

Another vote here from the Java BufferedReader world, trying to use readLine() as in the past...

thealy avatar Dec 20 '16 20:12 thealy

Having this issue, too.

magic69 avatar Jan 26 '17 22:01 magic69

I've been able to work around this issue with the json_lines codec. https://www.elastic.co/guide/en/logstash/current/plugins-codecs-json_lines.html

My use case is a bit strange: I'm sending events from one logstash to another over tcp.

Sample config:

Logstash 1

input {
  beats {
    port => 5044
  }
}

filter {
  # filtery stuff
}

output {
  tcp {
    host => "myhostname@someserver"
    port => 5044
    codec => json_lines
  }
}

Logstash 2

input {
  tcp {
    port => 5044
    codec => json_lines  # codec => json also works, but no codec at all will result in your message field being its own json string, which is nasty
  }
}

filter {
  # moar filtery stuff
}

output {
  stdout {
    codec => json_lines  # for debugging
  }
  file {
    path => "/my/output/path"
    codec => json_lines  # can be whatever you want (I'm reading the json in Spark, so newline delimiting makes life easy)
  }
}

Curiously enough, input configurations using tcp with the standard json codec are automatically switched to the json_lines codec in at least logstash 5.4.x (I suspect someone at Elastic noticed and decided to kind of help out without adding docs). Logstash 2 stdout:

[2017-07-05T12:14:54,986][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>5, "pipeline.batch.size"=>250, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>1250}
[2017-07-05T12:14:54,990][INFO ][logstash.inputs.tcp      ] Automatically switching from json to json_lines codec {:plugin=>"tcp"}
[2017-07-05T12:14:54,993][INFO ][logstash.inputs.tcp      ] Starting tcp input listener {:address=>"0.0.0.0:5044"}
[2017-07-05T12:14:54,994][INFO ][logstash.pipeline        ] Pipeline main started

Unfortunately I also needed json_lines on the tcp output, and that doesn't seem to be switched automagically, so I had to guess that part.

wadejensen avatar Jul 05 '17 12:07 wadejensen

I had this problem too. It took me a few hours to make logstash work. Luckily, I found this issue and the '\n' saved my day!

antunesleo avatar Dec 18 '17 15:12 antunesleo

Having this issue too, although the '\n' made it work.

RRSR avatar May 03 '18 05:05 RRSR

Any news regarding a proper fix to this?

sevdog avatar Jul 23 '18 09:07 sevdog

Maybe this is because the TCP input plugin doesn't generate the event until it receives a newline? From the docs:

Like stdin and file inputs, each event is assumed to be one line of text

So the input generates one event per line of text, meaning that newlines are still required. I also suppose that it strips the trailing newline from each generated line, so the json_lines codec shouldn't work properly with this input either. I haven't tried this setup yet, though.
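
To illustrate the framing (a rough sketch of what the input appears to do, not the plugin's actual code): incoming bytes are buffered until a newline arrives, and only complete lines are handed off as events, with the trailing newline stripped:

import json

buffer = ''

def feed(data):
    # Append received text to the buffer and emit one event per complete line;
    # anything after the last newline stays buffered until more data arrives.
    global buffer
    buffer += data
    events = []
    while '\n' in buffer:
        line, buffer = buffer.split('\n', 1)
        events.append(json.loads(line))  # newline already stripped by split
    return events

print(feed('{"message": "test"}'))     # [] -- no newline yet, so no event
print(feed('\n{"message": "two"}\n'))  # both events emitted once terminated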

skozin avatar Jan 22 '20 13:01 skozin