fluentd icon indicating copy to clipboard operation
fluentd copied to clipboard

Make sure JSON parser returns Hash

Open daipom opened this issue 1 year ago • 6 comments

Which issue(s) this PR fixes: Fixes #4100

What this PR does / why we need it: Make sure JSON parser returns Hash. It is wrong for Parser to return a record that is not Hash. Subsequent processing may result in errors.

The result of parsing JSON can be a value other than Hash. Originally, the data represented by JSON was not limited to Hash.

  • https://www.rfc-editor.org/rfc/rfc8259#section-2

The current implementation of JSONParser does not appear to have sufficiently considered this.

We need to consider carefully the specification of the case where raw data can not be parsed as Hash.

The currently intended usecases appears to be Hash or Array of Hash. We should consider only these 2 patterns as parsable for now.

in_http assumed that Parser could return Array. Therefore, it needed to be fixed along with this.

MessagePackParser appears to have the same issue. (Not fixed in this PR.)

Docs Changes: Not needed.

Release Note: Same as the title.

Difference

Case: No subsequent processing

Config

<source>
  @type tcp
  tag test.tcp
  <parse>
    @type json
  </parse>
</source>

<match test.**>
  @type stdout
</match>

Operation

$ netcat 0.0.0.0 5170
[{"k":"v"}, {"k2":"v2"}]

Result before this fix

2023-03-21 23:30:30.623063804 +0900 test.tcp: [{"k":"v"},{"k2":"v2"}]

Result after this fix

2023-03-21 23:29:20.147728222 +0900 test.tcp: {"k":"v"}
2023-03-21 23:29:20.147754433 +0900 test.tcp: {"k2":"v2"}

Case: With subsequent filter processing

Config

<source>
  @type tcp
  tag test.tcp
  <parse>
    @type json
  </parse>
</source>

<filter test.**>
  @type record_transformer
  enable_ruby true
  <record>
    class ${record.class}
  </record>
</filter>

<match test.**>
  @type stdout
</match>

Operation

$ netcat 0.0.0.0 5170
[{"k":"v"}, {"k2":"v2"}]

Result before this fix

2023-03-21 23:30:51 +0900 [warn]: #0 dump an error event: error_class=NoMethodError error="undefined method `merge!' for [{\"k\"=>\"v\"}, {\"k2\"=>\"v2\"}]:Array" location="/home/daipom/work/fluentd/fluentd/lib/fluent/plugin/filter_record_transformer.rb:135:in `reform'" tag="test.tcp" time=2023-03-21 23:30:51.040692881 +0900 record=[{"k"=>"v"}, {"k2"=>"v2"}]

Result after this fix

2023-03-21 23:29:55.783183244 +0900 test.tcp: {"k":"v","class":"Hash"}
2023-03-21 23:29:55.783215076 +0900 test.tcp: {"k2":"v2","class":"Hash"}

daipom avatar Mar 21 '23 02:03 daipom