Make nested fields from Grok patterns that use square bracket syntax
In the reference Grok implementation that comes from Logstash, fields referenced in patterns may use a square bracket notation to indicate when a nested field should be created to hold a parsed value. This is not yet supported in Zed's grok() function.
Details
The Elastic documentation for the Logstash Grok filter plugin shows, in passing, a syntax that uses square brackets to store a parsed value in a nested field (e.g., %{GREEDYDATA:[nested][field][test]} stores the value in {"nested": {"field": {"test": ... }}}). This turns out to be Logstash's general syntax for referencing nested fields, i.e., it is not Grok-specific. However, the syntax appears in some of the examples on the Internet that users may find as they learn Grok with the intent of applying it in Zed.
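To make the mapping concrete, here is a minimal Python sketch of how a bracketed reference expands into a nested record. The helper name nest_bracket_ref is hypothetical and is not part of Logstash or Zed; it only illustrates the intended result, not how either tool implements it.

```python
import re


def nest_bracket_ref(ref: str, value):
    """Expand a reference like "[a][b][c]" into {"a": {"b": {"c": value}}}.

    Hypothetical helper for illustration only; not a Logstash or Zed API.
    """
    keys = re.findall(r"\[([^\[\]]+)\]", ref)
    # Reject anything that is not purely a sequence of [key] segments.
    if not keys or "".join(f"[{k}]" for k in keys) != ref:
        raise ValueError(f"not a bracketed field reference: {ref!r}")
    nested = value
    # Wrap from the innermost key outward.
    for key in reversed(keys):
        nested = {key: nested}
    return nested


print(nest_bracket_ref("[nested][field][test]", "Hello world!"))
# {'nested': {'field': {'test': 'Hello world!'}}}
```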
As of commit 48021d7, Zed's grok() function keeps dotted names from the pattern as literal field names, so a downstream call to nest_dotted() can turn them into nested fields if desired, e.g.,
$ zq -version
Version: v1.17.0-35-g48021d77
$ echo '"2024-08-14T19:12:51.123456789Z Hello world!"' |
zq -Z 'yield grok("%{TIMESTAMP_ISO8601:t.timestamp} %{GREEDYDATA:nested.field}", this)' -
{
    "t.timestamp": "2024-08-14T19:12:51.123456789Z",
    "nested.field": "Hello world!"
}
$ echo '"2024-08-14T19:12:51.123456789Z Hello world!"' |
zq -Z 'yield grok("%{TIMESTAMP_ISO8601:t.timestamp} %{GREEDYDATA:nested.field}", this) | yield nest_dotted(this)' -
{
    t: {
        timestamp: "2024-08-14T19:12:51.123456789Z"
    },
    nested: {
        field: "Hello world!"
    }
}
However, if we want to make it easier for users to repurpose Logstash Grok configs they find via web searches, we may want to add support for the bracket syntax.
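In the meantime, a possible workaround is to rewrite bracketed references into dotted names before handing the pattern to grok(), then apply nest_dotted() as shown above. The following Python sketch does that rewrite. The function name brackets_to_dots is hypothetical, and it assumes key names contain no dots, brackets, braces, or colons. Any third :type component is passed through untouched; whether Zed's grok() accepts such a suffix is not something this sketch assumes.

```python
import re

# Matches %{SYNTAX:[k1][k2]...} with an optional trailing :type component.
_BRACKET_REF = re.compile(r"%\{(\w+):((?:\[[^\[\]{}:]+\])+)((?::\w+)?)\}")


def brackets_to_dots(pattern: str) -> str:
    """Rewrite Logstash-style [a][b] field references in a Grok pattern to a.b.

    Hypothetical preprocessing helper; not part of Zed or Logstash.
    """
    def repl(m: re.Match) -> str:
        keys = re.findall(r"\[([^\[\]]+)\]", m.group(2))
        return "%{" + m.group(1) + ":" + ".".join(keys) + m.group(3) + "}"

    # References without brackets do not match and are left unchanged.
    return _BRACKET_REF.sub(repl, pattern)


print(brackets_to_dots(
    "%{TIMESTAMP_ISO8601:[t][timestamp]} %{GREEDYDATA:[nested][field]}"
))
# %{TIMESTAMP_ISO8601:t.timestamp} %{GREEDYDATA:nested.field}
```

The rewritten pattern is the same one used in the zq examples above, so the rest of the pipeline would not need to change.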