Allow slash characters in Message keys
What?
Messages currently only accept keys that match this regex:
https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L195
Anything else is silently discarded. That by itself is a separate problem.
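To make the effect concrete, here is a minimal sketch of such a key check. The pattern `^[\w\.\-@]*$` is an assumption for illustration (the linked line in Message.java is authoritative), and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class KeyCheckSketch {
    // Assumed pattern: word characters, dot, hyphen and '@' only.
    private static final Pattern VALID_KEY_CHARS = Pattern.compile("^[\\w\\.\\-@]*$");

    // Returns true only if every character of the key is in the allowed set.
    static boolean isValidKey(String key) {
        return VALID_KEY_CHARS.matcher(key).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidKey("app.kubernetes.io/name")); // false, '/' is not allowed
        System.out.println(isValidKey("app_kubernetes_io_name")); // true
    }
}
```

Under that assumed pattern, a field named app.kubernetes.io/name would be rejected, which matches the behaviour described above.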
Why?
Some logs contain / in their keys. A prominent example is Kubernetes.
E.g. app.kubernetes.io/name
There is also no pipeline rule that can easily replace characters on all fields.
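For reference, the kind of bulk replacement meant here could look roughly like the following. This is plain Java, not an existing pipeline function. The allowed character set is the same assumption as for the key regex, and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeySanitizerSketch {
    // Replace every character outside the assumed allowed set with '_'.
    static String sanitizeKey(String key) {
        return key.replaceAll("[^\\w.\\-@]", "_");
    }

    // Apply the replacement to all field names of a message-like map.
    static Map<String, Object> sanitizeFields(Map<String, Object> fields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            // If two keys collapse to the same name, the later one wins.
            result.put(sanitizeKey(e.getKey()), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("app.kubernetes.io/name", "nginx");
        fields.put("source", "pod-1");
        System.out.println(sanitizeFields(fields));
        // prints {app.kubernetes.io_name=nginx, source=pod-1}
    }
}
```

Note that such a rewrite is lossy: distinct keys can end up with the same sanitized name, so it is a workaround rather than a fix.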
As far as I can tell, there shouldn't be a problem with / field names in ES or Lucene queries either.
[ HS# 966769785 ]
Hi @mpfz0r , I just wanted to mention that by default (with standard analyzer in ES), slash will be treated as token separator.
GET _analyze
{
  "analyzer": "standard",
  "text": ["app.kubernetes.io/name"]
}

{
  "tokens": [
    {
      "token": "app.kubernetes.io",
      "start_offset": 0,
      "end_offset": 17,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "name",
      "start_offset": 18,
      "end_offset": 22,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
Are we sure we are OK with that?
@luk-kaminski
I just wanted to mention that by default (with standard analyzer in ES), slash will be treated as token separator.
But we are talking about field names. Are those analyzed?
Oh, sorry, it was my misunderstanding...
Then, based on https://www.elastic.co/guide/en/ecs/current/ecs-guidelines.html#_guidelines_for_field_names, namely "No special characters except underscore", there is a problem with / in field names in ES.
As far as I can tell, there shouldn't be a problem with / field names in ES or Lucene queries either.
It's been too long; maybe I tested this. The alternative would be to somehow inform the user that they can't use special characters in field names.
Notes from a talk with Marco.
If we do not find a clever solution, we could at least emit a log message at INFO level here:
https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L567,
limiting the rate with com.swrve.ratelimitedlogger.RateLimitedLog.
With this approach, the clients would at least be aware of what is going on.
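A rough sketch of the rate-limiting idea, to show the intended behaviour. It deliberately does not use com.swrve.ratelimitedlogger.RateLimitedLog; it is a hand-rolled fixed-window limiter using only the JDK, and the class name, message text and limits are all hypothetical. A list stands in for the real logger so the behaviour is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

final class RateLimitedInfoSketch {
    private final int maxPerWindow;
    private final long windowMillis;
    private final LongSupplier clock;   // injectable so the sketch can be tested
    private long windowStart;
    private int emitted;
    private long suppressed;
    final List<String> output = new ArrayList<>(); // stands in for an INFO logger

    RateLimitedInfoSketch(int maxPerWindow, long windowMillis, LongSupplier clock) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Record an "invalid key" message, but at most maxPerWindow times per window.
    synchronized void invalidKey(String key) {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            // New window: report how many messages were dropped, then reset.
            if (suppressed > 0) {
                output.add("(suppressed " + suppressed + " similar messages)");
            }
            windowStart = now;
            emitted = 0;
            suppressed = 0;
        }
        if (emitted < maxPerWindow) {
            emitted++;
            output.add("Ignoring invalid key <" + key + ">");
        } else {
            suppressed++;
        }
    }

    public static void main(String[] args) {
        long[] now = {0};
        RateLimitedInfoSketch log = new RateLimitedInfoSketch(2, 1000, () -> now[0]);
        for (int i = 0; i < 5; i++) {
            log.invalidKey("app.kubernetes.io/name"); // only the first two are recorded
        }
        now[0] = 1000;                                // next window starts
        log.invalidKey("a/b");
        log.output.forEach(System.out::println);
    }
}
```

The point of the limiter is that a noisy input producing thousands of invalid keys per second results in only a handful of log lines, while still telling the operator that fields are being dropped.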
The main problem will be solved (or at least considered) as part of #13043. For now, the rate-limited log message is the only improvement that has been introduced here.