Allow slash characters in Message keys

Open mpfz0r opened this issue 3 years ago • 1 comments

What?

Messages currently accept only keys that match this regex:

https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L195

Anything else will be silently discarded. This by itself is a separate problem.
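To make the behaviour concrete, here is a minimal sketch of how such a key check ends up dropping fields. The pattern below (word characters, `.`, `-`, `@`) is an assumption about what the linked line contains and should be checked against the source; `KeyFilterSketch` and `keepValid` are hypothetical names, not Graylog API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class KeyFilterSketch {
    // Assumption: approximates the linked pattern (word chars, '.', '-', '@').
    static final Pattern VALID_KEY_CHARS = Pattern.compile("^[\\w.\\-@]*$");

    static boolean isValidKey(String key) {
        return VALID_KEY_CHARS.matcher(key).matches();
    }

    // Hypothetical helper: keeps valid keys and drops the rest without any notice.
    static Map<String, Object> keepValid(Map<String, Object> fields) {
        Map<String, Object> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (isValidKey(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("source", "pod-1");
        fields.put("app.kubernetes.io/name", "nginx");
        // The key containing '/' does not match and is dropped.
        System.out.println(keepValid(fields).keySet());
    }
}
```

Under that assumed pattern, only `source` survives; the sender gets no indication that the other field was dropped.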

Why?

Some logs contain / in their keys. A prominent example is Kubernetes, e.g. app.kubernetes.io/name.

There is also no pipeline rule that can easily replace characters on all fields.
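As an illustration of what such a bulk replacement would need to do, the following sketch rewrites every key in a field map, replacing disallowed characters with an underscore. It is plain Java rather than a pipeline rule; the allowed character set is an assumption, and `KeySanitizerSketch` and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeySanitizerSketch {
    // Assumption: allowed characters are word chars, '.', '-' and '@'.
    static String sanitizeKey(String key) {
        return key.replaceAll("[^\\w.\\-@]", "_");
    }

    // Rewrites all keys. If two keys collapse to the same name, the later value wins.
    static Map<String, Object> sanitizeKeys(Map<String, Object> fields) {
        Map<String, Object> out = new LinkedHashMap<>();
        fields.forEach((k, v) -> out.put(sanitizeKey(k), v));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("app.kubernetes.io/name", "nginx");
        System.out.println(sanitizeKeys(fields));
    }
}
```

Something like this would have to run before the message reaches the key check, for example in a log shipper, and the renaming loses the distinction between `a/b` and `a_b`.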

As far as I can tell, there shouldn't be a problem with / field names in ES or Lucene queries either.

mpfz0r avatar Jul 01 '22 13:07 mpfz0r

[ HS# 966769785 ]

mpfz0r avatar Jul 01 '22 13:07 mpfz0r

Hi @mpfz0r , I just wanted to mention that by default (with standard analyzer in ES), slash will be treated as token separator.

GET _analyze
{
  "analyzer": "standard",
  "text": ["app.kubernetes.io/name"]
}

{
  "tokens": [
    {
      "token": "app.kubernetes.io",
      "start_offset": 0,
      "end_offset": 17,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "name",
      "start_offset": 18,
      "end_offset": 22,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

Are we sure we are OK with that?

luk-kaminski avatar Oct 20 '23 07:10 luk-kaminski

@luk-kaminski

I just wanted to mention that by default (with standard analyzer in ES), slash will be treated as token separator.

But we are talking about field names. Are those analyzed?

mpfz0r avatar Oct 20 '23 07:10 mpfz0r

Oh, sorry, it was my misunderstanding...

Then, based on https://www.elastic.co/guide/en/ecs/current/ecs-guidelines.html#_guidelines_for_field_names, namely "No special characters except underscore", there is a problem with / in field names in ES.

luk-kaminski avatar Oct 20 '23 07:10 luk-kaminski

As far as I can tell, there shouldn't be a problem with / field names in ES or Lucene queries either.

It's been too long; maybe I tested this. The alternative would be to somehow inform the user that they can't use special chars in field names.

mpfz0r avatar Oct 20 '23 08:10 mpfz0r

Notes from a talk with Marco: if we do not find a clever solution, we could at least log messages at INFO level here: https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L567, limiting the rate with com.swrve.ratelimitedlogger.RateLimitedLog. With this approach, clients would at least be aware of what is going on.
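The idea of rate limiting can be sketched with the standard library alone. The class below is not the com.swrve.ratelimitedlogger API and not what was merged; it is a hypothetical fixed-window stand-in that lets at most N messages through per window and reports how many were suppressed once the next window starts.

```java
import java.time.Duration;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

final class RateLimitedWarnerSketch {
    private final int maxPerWindow;
    private final long windowNanos;
    private final LongSupplier clock;    // nanosecond clock, injectable for tests
    private final Consumer<String> sink; // where messages go, e.g. a logger's info method
    private long windowStart;
    private int emitted;
    private long suppressed;

    RateLimitedWarnerSketch(int maxPerWindow, Duration window,
                            LongSupplier clock, Consumer<String> sink) {
        this.maxPerWindow = maxPerWindow;
        this.windowNanos = window.toNanos();
        this.clock = clock;
        this.sink = sink;
        this.windowStart = clock.getAsLong();
    }

    synchronized void log(String msg) {
        long now = clock.getAsLong();
        if (now - windowStart >= windowNanos) {
            // A new window begins: report what was dropped, then reset the counters.
            if (suppressed > 0) {
                sink.accept("(suppressed " + suppressed + " similar messages)");
            }
            windowStart = now;
            emitted = 0;
            suppressed = 0;
        }
        if (emitted < maxPerWindow) {
            emitted++;
            sink.accept(msg);
        } else {
            suppressed++;
        }
    }

    public static void main(String[] args) {
        RateLimitedWarnerSketch warner = new RateLimitedWarnerSketch(
                2, Duration.ofSeconds(10), System::nanoTime, System.out::println);
        for (int i = 0; i < 5; i++) {
            warner.log("Ignoring invalid key: app.kubernetes.io/name");
        }
    }
}
```

Run as-is, the loop prints the message twice and swallows the remaining three, so a flood of invalid keys cannot flood the server log.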

luk-kaminski avatar Oct 20 '23 08:10 luk-kaminski

The main problem will be solved (or at least considered) as part of #13043. For now, the rate-limited log is the only improvement that has been introduced here.

luk-kaminski avatar Jan 16 '24 09:01 luk-kaminski