rule_tree error handling is missing a logger and has an unhelpful error message
It is possible to run into an exception when a rule_tree is misconfigured.
Expected behavior
Logprep complains at startup that the rule_tree_config is not correct, and the rule_tree has a logger in case other unforeseen errors happen.
Current behavior
In an older Logprep version, the following log line appeared if the rule was not configured as a separate file but defined directly inside the pipeline.yaml:
RuleTree WARNING : Error parsing rule "None.yml": TypeError: '<' not supported between instances of 'int' and 'str'. Ignore and continue with next rule.
This is not very helpful, as it is hard to tell which rule or rule_tree produces the error. In the current main branch, this warning is not even emitted anymore, because the rule_tree has no logger, which results in:
[...]
  File "[...]/logprep/framework/rule_tree/rule_tree.py", line 125, in add_rule
    logger.warning(
    ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'warning'
Steps to reproduce
Create the following pipeline.yml:
input:
  dummy:
    type: dummy_input
    documents: []
output:
  dummy:
    type: dummy_output
pipeline:
  - grokker:
      type: grokker
      specific_rules:
        - filter: 'tags: "foo bar" AND message: bi'
          grokker:
            mapping:
              message: "%{GREEDYDATA:foo}"
      generic_rules: []
      tree_config: tconfig.json
Create the following tree_config named tconfig.json (as referenced inside the pipeline.yml):
{
  "priority_dict": {
    "tag": 1,
    "message": 2
  },
  "tag_map": {
    "field_name_to_check_for_in_rule": "TAG-TO-CHECK-IF-IN-EVENT"
  }
}
Notice that the priority_dict uses integer values instead of string values.
Run logprep with:
logprep run pipeline.yml
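The TypeError itself can be reproduced outside of Logprep. The following is a minimal sketch, not Logprep's code. It assumes (the issue does not confirm this) that the rule tree sorts filter fields by their priority and falls back to the field name when a field is missing from priority_dict. The names `sort_key` and `try_sort` are hypothetical. With the config above, "message" gets the integer 2 while "tags" (not present in priority_dict) falls back to a string, so the sort has to compare an int with a str.

```python
# Priorities as loaded from tconfig.json: integer values instead of strings.
priority_dict = {"tag": 1, "message": 2}

# Fields used by the rule filter 'tags: "foo bar" AND message: bi'.
rule_fields = ["tags", "message"]


def sort_key(field_name):
    # Assumption: fields without a configured priority fall back to their name.
    return priority_dict.get(field_name, field_name)


def try_sort(fields):
    """Return the sorted fields, or the TypeError raised while sorting."""
    try:
        return sorted(fields, key=sort_key)
    except TypeError as error:
        return error


# Sorting mixes the int key for "message" with the str key for "tags".
result = try_sort(rule_fields)
print(type(result).__name__, result)
```

If the assumption holds, string priorities would avoid the mixed comparison, which is why validating the value types at startup would catch this early.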
Environment
Logprep version: 856ceaf8
Python version: 3.11.0
Possible solution
- Implement the tree config as an attrs class with well-defined expected types, and load it on Logprep startup.
- Add a logger to the rule_tree
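
A rough sketch of both points, under stated assumptions: the class name `TreeConfig`, the helper `_check_str_mapping`, and the function `load_tree_config` are hypothetical and not part of Logprep. To stay runnable without third-party packages, the sketch uses a stdlib dataclass with a `__post_init__` check in place of the proposed attrs class; with attrs, the same checks would be expressed as field validators. The module-level logger illustrates a logger that can never be None.

```python
import json
import logging
from dataclasses import dataclass, field

# Module-level logger: always available, so a warning call cannot hit None.
logger = logging.getLogger("RuleTree")


def _check_str_mapping(name: str, mapping: object) -> None:
    """Raise TypeError unless mapping is a dict with str keys and str values."""
    if not isinstance(mapping, dict):
        raise TypeError(f"{name} must be a mapping, got {type(mapping).__name__}")
    for key, value in mapping.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(
                f"{name}[{key!r}] must map str to str, "
                f"got {type(value).__name__} value {value!r}"
            )


@dataclass(frozen=True)
class TreeConfig:  # hypothetical name, stands in for the proposed attrs class
    priority_dict: dict = field(default_factory=dict)
    tag_map: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Validate types on construction, so a bad config fails at load time.
        _check_str_mapping("priority_dict", self.priority_dict)
        _check_str_mapping("tag_map", self.tag_map)


def load_tree_config(text: str, source: str = "<tree_config>") -> TreeConfig:
    """Parse and validate a tree config; intended to run once at startup."""
    try:
        return TreeConfig(**json.loads(text))
    except TypeError as error:
        # Name the offending config so the message is actionable, then re-raise.
        logger.error("Invalid tree config %s: %s", source, error)
        raise
```

Loading the tconfig.json from the reproduction steps through `load_tree_config` would raise a TypeError that names `priority_dict['tag']` and the integer value, instead of failing later while rules are added.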