[ML] Switch to ECS Grok patterns in text structure finder and categorization

Open droberts195 opened this issue 4 years ago • 1 comments

#76885 introduced the possibility of using ECS Grok patterns instead of the legacy ones.

We should switch to using these in the text structure plugin and for the Grok patterns we add to categorization results.

It looks like some new date formats exist in the latest set of Grok patterns - certainly for newer versions of Tomcat and Catalina, possibly others - we should add those to the timestamp format finder too.

  • [x] Add ecs_compatibility option to _text_structure/find_structure endpoint, default disabled, and change that endpoint to use ECS Grok patterns if it's set to v1. This may also necessitate making the timestamp format finder aware of two different Grok patterns per timestamp format, and then having it use the appropriate one depending on whether ECS Grok patterns are in use (investigation required).
  • [ ] Change UI to set ecs_compatibility to v1 when calling _text_structure/find_structure. https://github.com/elastic/kibana/issues/138428
  • [x] Have a look through the ECS Grok patterns that were added in #76885 and see if there are any new timestamp formats that didn't exist in the original Grok patterns. Maybe Tomcat and Catalina have some new ones, maybe others. If any are found add configs for them to the timestamp format finder in _text_structure/find_structure.
  • [ ] Change the Grok pattern creator for _ml/anomaly_detectors/<job_id>/results/categories to always use ECS Grok patterns - this change can be made unconditionally without keeping a BWC option for the old Grok patterns, as the functionality is experimental.
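For the first two tasks, a rough sketch of how a client might opt in to ECS Grok patterns when calling the endpoint. It only builds the HTTP request and does not send it. The helper name `build_find_structure_request`, the base URL, and the `text/plain` content type are assumptions made for illustration; the `ecs_compatibility` parameter and its `v1` value come from the task list above, and `disabled` is assumed to be the name of the default value.

```python
from urllib.parse import urlencode
from urllib.request import Request

# "v1" is named in the task list; "disabled" is assumed to be the default's name.
VALID_ECS_MODES = ("disabled", "v1")


def build_find_structure_request(base_url, sample_text, ecs_compatibility="disabled"):
    """Build (but do not send) a POST to _text_structure/find_structure.

    Hypothetical helper: the name and the Content-Type header are
    illustrative assumptions, not taken from the issue.
    """
    if ecs_compatibility not in VALID_ECS_MODES:
        raise ValueError(f"unsupported ecs_compatibility: {ecs_compatibility!r}")
    query = urlencode({"ecs_compatibility": ecs_compatibility})
    url = f"{base_url.rstrip('/')}/_text_structure/find_structure?{query}"
    return Request(
        url,
        data=sample_text.encode("utf-8"),  # the sample text to analyse
        headers={"Content-Type": "text/plain"},  # assumed content type
        method="POST",
    )


# Usage: what the UI change would amount to, i.e. always passing "v1".
request = build_find_structure_request(
    "http://localhost:9200", "some log line\n", ecs_compatibility="v1"
)
print(request.full_url)
```

Leaving the default as `disabled` in the helper mirrors the endpoint's opt-in behaviour, so existing callers keep getting the legacy Grok patterns unless they ask for `v1`.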

droberts195, Aug 31 '21 13:08

Pinging @elastic/ml-core (Team:ML)

elasticmachine, Aug 31 '21 13:08

All tasks are complete now, so closing.

droberts195, Feb 15 '23 09:02