graylog2-server

Rule simulation processing sample parsing logic is unclear in Graylog 5.2.4

Open miwent opened this issue 2 years ago • 6 comments

There was a change in how sample messages submitted to the rule simulator text box are parsed by the pipeline rule simulator between 5.1.x and 5.2.4, and the required formatting of JSON messages in 5.2.4 is now very complex.

Expected Behavior

The rule simulation should not require the user to do a lot of formatting work on sample messages, or at least provide an option to let the user specify how the sample message should be processed.

Current Behavior

The pipeline rule simulator has specific requirements for formatting JSON sample messages that are not clear to the user.

I added the same pipeline rule in both 5.2.4 and 5.1.10:

rule "jsonpath"
when
    true
then
    let use_data = first_non_null ( [ $message.message, $message.results ] );
    
    let json = parse_json ( to_string ( use_data ) );

    let jpath = select_jsonpath (
        json: json,
        paths: {
            path1: "$.results..login.username"
            }
        );
   
    set_fields (
        fields: jpath
        );

end

I then provided the following simulation message text to both 5.2.4 and 5.1.10:

{"results":[{"gender":"male","name":{"title":"Mr","first":"Delfim","last":"da Mata"},"location":{"street":{"number":6337,"name":"Rua Amazonas "},"city":"Maricá","state":"Sergipe","country":"Brazil","postcode":26503,"coordinates":{"latitude":"-77.8234","longitude":"-163.0364"},"timezone":{"offset":"-8:00","description":"Pacific Time (US & Canada)"}},"email":"[email protected]","login":{"uuid":"442a18e2-46ea-4adc-abb1-7c387d2c9332","username":"sadbird543","password":"pressure","salt":"rz5MPZU6","md5":"a3af449b0d819686d52d1d1d19929feb","sha1":"de1a9d045e929d90206b8b4db519eaf798f9221d","sha256":"f4f3d147e7a5cacab7468c6a4fa076f1aff6acb6f1d8010760c020c552bfec57"},"dob":{"date":"1989-11-29T13:37:26.553Z","age":34},"registered":{"date":"2006-07-30T05:14:27.041Z","age":17},"phone":"(48) 1851-2196","cell":"(25) 3094-0127","id":{"name":"CPF","value":"186.187.318-05"},"picture":{"large":"https://randomuser.me/api/portraits/men/50.jpg","medium":"https://randomuser.me/api/portraits/med/men/50.jpg","thumbnail":"https://randomuser.me/api/portraits/thumb/men/50.jpg"},"nat":"BR"}],"info":{"seed":"1c15325c7b26a356","results":1,"page":1,"version":"1.4"}}

The simulation provides the expected output (path1:["sadbird543"]) on 5.1.10, but not 5.2.4.
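For reference, the expected result can be reproduced outside Graylog. The following is a minimal Python sketch, not Graylog's implementation: the helper `find_key` is hypothetical and only mimics the recursive-descent part (`..login`) of the path `$.results..login.username`, applied to a trimmed copy of the sample message above.

```python
import json

# Trimmed copy of the sample message; only fields relevant to the path are kept.
sample = (
    '{"results":[{"gender":"male","login":{"uuid":"442a18e2-46ea-4adc-abb1-7c387d2c9332",'
    '"username":"sadbird543"},"nat":"BR"}],"info":{"results":1,"page":1}}'
)


def find_key(node, key):
    """Yield every value stored under `key` at any depth below `node` (hypothetical helper)."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def select_usernames(raw):
    """Approximate `$.results..login.username` using only the standard library."""
    doc = json.loads(raw)
    logins = find_key(doc.get("results"), "login")
    return [login["username"] for login in logins
            if isinstance(login, dict) and "username" in login]


print({"path1": select_usernames(sample)})
```

The sketch only works when the raw text is valid JSON at the point of parsing, which is the precondition the rule relies on as well.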

With 5.2.4, the sample message had to be reformatted before the simulation would provide the expected output. I had to escape the JSON quotation marks and embed the whole message as a string inside a parent JSON field named message. The following sample message worked in the 5.2.4 rule simulator:

{"message":"{\"results\":[{\"gender\":\"male\",\"name\":{\"title\":\"Mr\",\"first\":\"Delfim\",\"last\":\"da Mata\"},\"location\":{\"street\":{\"number\":6337,\"name\":\"Rua Amazonas \"},\"city\":\"Maricá\",\"state\":\"Sergipe\",\"country\":\"Brazil\",\"postcode\":26503,\"coordinates\":{\"latitude\":\"-77.8234\",\"longitude\":\"-163.0364\"},\"timezone\":{\"offset\":\"-8:00\",\"description\":\"Pacific Time (US & Canada)\"}},\"email\":\"[email protected]\",\"login\":{\"uuid\":\"442a18e2-46ea-4adc-abb1-7c387d2c9332\",\"username\":\"sadbird543\",\"password\":\"pressure\",\"salt\":\"rz5MPZU6\",\"md5\":\"a3af449b0d819686d52d1d1d19929feb\",\"sha1\":\"de1a9d045e929d90206b8b4db519eaf798f9221d\",\"sha256\":\"f4f3d147e7a5cacab7468c6a4fa076f1aff6acb6f1d8010760c020c552bfec57\"},\"dob\":{\"date\":\"1989-11-29T13:37:26.553Z\",\"age\":34},\"registered\":{\"date\":\"2006-07-30T05:14:27.041Z\",\"age\":17},\"phone\":\"(48) 1851-2196\",\"cell\":\"(25) 3094-0127\",\"id\":{\"name\":\"CPF\",\"value\":\"186.187.318-05\"},\"picture\":{\"large\":\"https://randomuser.me/api/portraits/men/50.jpg\",\"medium\":\"https://randomuser.me/api/portraits/med/men/50.jpg\",\"thumbnail\":\"https://randomuser.me/api/portraits/thumb/men/50.jpg\"},\"nat\":\"BR\"}],\"info\":{\"seed\":\"1c15325c7b26a356\",\"results\":1,\"page\":1,\"version\":\"1.4\"}}"}

The actual pipeline processing is unchanged: the JSON parsing rules that worked in 5.1.10 still function the same in 5.2.4. This appears to be related only to the simulation functionality in the pipeline rule source code editor.

Possible Solution

  • Make the sample message processing in the rule simulator more flexible
  • Allow the user to specify what type of sample message is being submitted

Steps to Reproduce (for bugs)

  1. Steps and sample data provided above

Context

This makes testing JSON-related pipeline rules with the pipeline rule editor very difficult.

Your Environment

  • Graylog Version: 5.2.4
  • Java Version: openjdk version "17.0.9" 2023-10-17 OpenJDK Runtime Environment (build 17.0.9+9-Ubuntu-120.04) OpenJDK 64-Bit Server VM (build 17.0.9+9-Ubuntu-120.04, mixed mode, sharing)
  • OpenSearch Version: 1.3.7
  • MongoDB Version:
  • Operating System: Ubuntu 20.04.6 LTS
  • Browser version: Firefox 122.0.1

miwent avatar Feb 22 '24 14:02 miwent

Hello, I can confirm that I am experiencing the exact same issue as described above. I hope this will be fixed soon :-)

Astorias96 avatar Mar 01 '24 14:03 Astorias96

> Hello, I can confirm that I am experiencing the exact same issue as described above. I hope this will be fixed soon :-)

In the meantime, you can use the command line tool jq to convert the JSON into something that should work:

echo '{"json_key_1":"json_value1","json_key_2":"json_value2"}' | jq -c --raw-input '{message: .}'

Just replace the JSON text in the echo command with your JSON message. To use a different field name, change the message key in the jq filter. The output of this command should work in the 5.2.4 rule simulator.
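If jq is not available, the same wrapping can be sketched in Python. This is an illustrative equivalent under stated assumptions, not something taken from Graylog: the function name `wrap_for_simulator` and its `field` parameter are hypothetical.

```python
import json


def wrap_for_simulator(raw, field="message"):
    """Embed the raw text as an escaped JSON string under a single key."""
    # ensure_ascii=False keeps characters such as "á" readable instead of \u escapes.
    return json.dumps({field: raw}, ensure_ascii=False, separators=(",", ":"))


raw = '{"json_key_1":"json_value1","json_key_2":"json_value2"}'
wrapped = wrap_for_simulator(raw)
print(wrapped)

# Round trip: the inner JSON is recoverable unchanged from the wrapper.
assert json.loads(json.loads(wrapped)["message"]) == json.loads(raw)
```

Unlike `jq --raw-input`, which processes its input line by line, this treats the whole text as one string, so multi-line JSON is wrapped as a single value.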

miwent avatar Mar 01 '24 22:03 miwent

Tested in Beta-3. The issue still exists.

jivepig avatar Mar 27 '24 17:03 jivepig

In 5.2 we enhanced the simulator input box to accept either:

  • the message field, expressed as a string
  • a complete log message, expressed as key-value pairs or JSON

Unfortunately, this results in the input being interpreted as JSON or KVP, when the user intended it to be a simple message field, with unexpected results.

I think we need to disambiguate this. E.g. there could be separate (mutually exclusive) input boxes for message field versus full messages.

patrickmann avatar Apr 02 '24 10:04 patrickmann

image

as discussed

tellistone avatar Apr 24 '24 08:04 tellistone

The issue is due to #17464, which introduces "smart" parsing and escaping of KVPs. This looks like it is entirely in the frontend (FE); backend (BE) handling for rule simulation hasn't changed since 5.1.

Assuming we add a UI toggle, we should

  • use the legacy behavior when the user checks the simple message string option for the rule simulator
  • use the current behavior when the user selects the JSON or KVP options
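A rough sketch of how such a toggle could dispatch, written in Python for brevity. The real change would live in the frontend; the mode names and the function `build_simulation_message` are hypothetical, and the key-value parsing is deliberately naive (whitespace-separated `key=value` pairs, no quoting).

```python
import json


def build_simulation_message(text, mode):
    """Turn simulator input into a dict of message fields, based on an explicit mode."""
    if mode == "simple":
        # Legacy behavior: the whole input becomes the `message` field, untouched.
        return {"message": text}
    if mode == "json":
        fields = json.loads(text)
        if not isinstance(fields, dict):
            raise ValueError("JSON input must be an object of message fields")
        return fields
    if mode == "kvp":
        # Naive parsing: split on whitespace, then on the first "=" of each pair.
        pairs = (item.split("=", 1) for item in text.split() if "=" in item)
        return {key: value for key, value in pairs}
    raise ValueError(f"unknown mode: {mode}")


raw = '{"results":[{"login":{"username":"sadbird543"}}]}'
print(build_simulation_message(raw, "simple"))  # one field: message
print(build_simulation_message(raw, "json"))    # one field: results
```

In "simple" mode the raw JSON stays a string in the message field, which is what the sample rule in this issue expected from 5.1.10. In "json" mode the same text is split into top-level fields instead, so the choice has to be explicit rather than guessed from the input.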

patrickmann avatar Apr 26 '24 13:04 patrickmann

This was intended for 6.0.1 but was not backported. Re-opening for backport.

patrickmann avatar May 14 '24 13:05 patrickmann

This can be closed again then, with the backport merged, right?

mako42 avatar May 28 '24 11:05 mako42

> This can be closed again then, with the backport merged, right?

Correct, I will close it.

gally47 avatar May 28 '24 11:05 gally47