graylog2-server

Dots in field names are replaced silently

Open mpfz0r opened this issue 3 years ago • 11 comments

The fact that we don't support dots in field names is nothing new. This code has been there for ages, silently replacing dots with underscores once a Message gets written to ES/OS:

https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L404-L406

The problem is that we neither document this behavior nor log a warning when we replace dots.

Because the field renaming happens very late in processing, it can cause confusion when writing pipeline rules or extractors. The ingested Message shows up in the search with underscores in its field names, but within processing you have to refer to the same fields by their dotted names.
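To make the timing concrete, here is a minimal Java sketch. It is not the code from Message.java; the class `DotRenameSketch` and the method `toIndexedFields` are hypothetical names used only for illustration. It assumes the rename is a plain dot-to-underscore replacement applied while building the map that goes to ES/OS.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and not Graylog's real API.
public class DotRenameSketch {

    // Builds the map that would be written to the index.
    // Dots in keys are replaced with underscores; the input map is left untouched.
    public static Map<String, Object> toIndexedFields(Map<String, Object> fields) {
        Map<String, Object> indexed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            indexed.put(entry.getKey().replace('.', '_'), entry.getValue());
        }
        return indexed;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("http.status", 200);

        // During processing (rules, extractors) the dotted name is the one that exists.
        System.out.println("processing sees http.status: " + fields.containsKey("http.status"));

        // Only the late conversion produces the underscore name that shows up in search.
        System.out.println("indexed field names: " + toIndexedFields(fields).keySet());
    }
}
```

In this sketch a rule that checks for `http_status` during processing would find nothing, even though that is the name shown in the search afterwards.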

Possible solutions

  • Add documentation
  • Show a system notification if we detect that field names had to be changed. (This needs to be done efficiently so that it does not degrade performance.)
  • ~Change the field names directly when assigning them~ (This will break existing rules and might confuse users who are already accustomed to the existing behavior)
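One cheap way to detect renames without paying for a notification on every message is a one-shot flag. The sketch below is only an assumption about how this could look: `RenameNotifierSketch`, `fieldRenamed` and the `Runnable` callback are hypothetical and do not correspond to Graylog's notification API.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; the callback stands in for whatever publishes a system notification.
public class RenameNotifierSketch {
    private final AtomicBoolean notified = new AtomicBoolean(false);
    private final AtomicLong renamedFields = new AtomicLong();
    private final Runnable publishNotification;

    public RenameNotifierSketch(Runnable publishNotification) {
        this.publishNotification = publishNotification;
    }

    // Called on the hot path whenever a field name had to be changed.
    public void fieldRenamed() {
        renamedFields.incrementAndGet();
        // Only the first caller wins the compare-and-set, so the
        // (potentially expensive) notification is published once.
        if (notified.compareAndSet(false, true)) {
            publishNotification.run();
        }
    }

    // Total number of renames seen, e.g. for display in the notification details.
    public long renamedCount() {
        return renamedFields.get();
    }
}
```

After the first rename, each further call costs one atomic increment and one failed compare-and-set, which should be negligible next to indexing a message.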

Related problems

Slash characters in field names are a similar case: they are not allowed in ES/OS. Here we behave differently and drop the field whose name contains a slash, while ingesting the rest of the message. The problem was mentioned in #12990, where we introduced rate-limited logging at INFO level to report dropped fields. It seems that we need a single issue covering all problems related to special characters in field names, and this issue has been chosen for that.
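For comparison with the dot case, the slash behavior described above could be sketched roughly as follows. This is not the implementation from #12990; `SlashFieldFilterSketch`, the injected clock, and the interval parameter are assumptions made for illustration, and `java.util.logging` stands in for the real logger.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.logging.Logger;

// Hypothetical sketch; names and the rate-limiting scheme are illustrative only.
public class SlashFieldFilterSketch {
    private static final Logger LOG = Logger.getLogger(SlashFieldFilterSketch.class.getName());

    private final LongSupplier clockMillis;   // injected so the interval can be tested
    private final long minIntervalMillis;     // minimum gap between two log lines
    private boolean loggedBefore = false;
    private long lastLogMillis = 0L;
    private int logLines = 0;

    public SlashFieldFilterSketch(LongSupplier clockMillis, long minIntervalMillis) {
        this.clockMillis = clockMillis;
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns a copy of the fields without those whose name contains '/'.
    // The rest of the message is kept.
    public synchronized Map<String, Object> dropSlashFields(Map<String, Object> fields) {
        Map<String, Object> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (entry.getKey().indexOf('/') >= 0) {
                logRateLimited(entry.getKey());
            } else {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    // Logs at INFO level, but at most once per minIntervalMillis.
    private void logRateLimited(String fieldName) {
        long now = clockMillis.getAsLong();
        if (!loggedBefore || now - lastLogMillis >= minIntervalMillis) {
            loggedBefore = true;
            lastLogMillis = now;
            logLines++;
            LOG.info("Dropping field with '/' in its name: " + fieldName);
        }
    }

    // Number of log lines actually written so far.
    public synchronized int logLines() {
        return logLines;
    }
}
```

The contrast with the dot case is that here the user loses data but gets a log line, while with dots the data is kept under a different name and nothing is logged.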

Refs: https://github.com/Graylog2/graylog2-server/issues/12990 https://github.com/Graylog2/graylog2-server/issues/6588 https://github.com/Graylog2/graylog2-server/pull/5983

https://github.com/Graylog2/graylog2-server/issues/4583 https://github.com/elastic/elasticsearch/issues/15951

mpfz0r avatar Jul 13 '22 13:07 mpfz0r

[ HS #972457853 ]

mpfz0r avatar Jul 13 '22 13:07 mpfz0r

Mentioned on the case, but being able to debug($message) (or equivalent; giving a point-in-time view of the Message) would've helped greatly in figuring out what was going on.

coffee-squirrel avatar Jul 13 '22 13:07 coffee-squirrel

@coffee-squirrel Like this? https://github.com/Graylog2/graylog2-server/pull/13178

mpfz0r avatar Aug 02 '22 13:08 mpfz0r

@mpfz0r Nice; that'll be useful 👍 Not a big deal from our perspective, but some might look for being able to pass debug() a message parameter like certain other functions (e.g. has_field()).

coffee-squirrel avatar Aug 02 '22 14:08 coffee-squirrel

@coffee-squirrel Yeah, makes sense. I didn't do that in the beginning, because it required some changes to the rule parser. But it should work now.

mpfz0r avatar Aug 03 '22 08:08 mpfz0r

Can this get fixed? Newer versions of Elastic use dots in field names as standard now.

OzzyKampha avatar Feb 06 '24 20:02 OzzyKampha