REGEX Expression Evaluation errors can flood the logs
Version
4.x
What happened?
User report on the mailing list - https://lists.apache.org/thread/4kfpx9wpm38p3kbdh78dmgpoz2ckm4s1
A query with a faulty REGEX on a large dataset resulted in massive log output because every single attempt to evaluate the expression produces an ExprEvalException, and QueryIterFilterExpr logs every single one of those.
The workaround suggested on the list is to disable the offending logger, but since this logger is enabled by default, it effectively provides a potential DoS vector against Jena-based systems. Logging could be made more intelligent in several ways:
- Suppressing duplicate messages
- Not logging specific classes of expression evaluation failures
- Logging at a lower level that would not be output by default
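As a rough illustration of the first option (suppressing duplicate messages), here is a minimal sketch in plain Java. DuplicateLogSuppressor and its methods are hypothetical names invented for this example; nothing like it is claimed to exist in Jena, and the cap per message is an assumed policy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Jena: decides whether a repeated
// message should still be written to the log.
final class DuplicateLogSuppressor {
    private final long maxPerMessage;
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    DuplicateLogSuppressor(long maxPerMessage) {
        this.maxPerMessage = maxPerMessage;
    }

    /** Returns true while this exact message has been seen at most maxPerMessage times. */
    boolean shouldLog(String message) {
        long seen = counts.computeIfAbsent(message, k -> new AtomicLong()).incrementAndGet();
        return seen <= maxPerMessage;
    }

    /** Number of occurrences of this message that were not logged. */
    long suppressedCount(String message) {
        AtomicLong seen = counts.get(message);
        return seen == null ? 0 : Math.max(0, seen.get() - maxPerMessage);
    }
}
```

A caller would check shouldLog(message) before emitting the warning. Note that this sketch keys on the exact message text and never evicts entries, so a real implementation would need to bound the map, especially if the message embeds a pattern that changes on every evaluation.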
Relevant output and stacktrace
See mailing list thread - https://lists.apache.org/thread/91qvxshm4njnd657g966yrbq8kmsy9ok
Are you interested in making a pull request?
Yes
See mailing list. User query is suspect.
This is because the regex pattern is in a variable (and wrong) and may change on every evaluation, so some patterns can be right and some can be wrong.
Usually, regex patterns are constants and compiled once. Here the pattern has to be handled on every evaluation, and so there is one log message per evaluation.
We could consider not outputting the stack trace, but we should be careful about changing away from ExprException because of the more important static case.
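The difference between a constant pattern and a pattern held in a variable can be sketched with plain java.util.regex. This is an illustration only and does not reproduce Jena's expression evaluator; RegexFailureCount and its method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustration only: plain java.util.regex, not Jena's expression evaluator.
final class RegexFailureCount {

    /** Constant pattern: compiled once up front, so a bad pattern fails once. */
    static int failuresWithConstantPattern(String pattern, List<String> rows) {
        Pattern compiled;
        try {
            compiled = Pattern.compile(pattern);
        } catch (PatternSyntaxException e) {
            return 1; // a single failure, regardless of how many rows there are
        }
        for (String row : rows) {
            compiled.matcher(row).find(); // matching itself does not throw here
        }
        return 0;
    }

    /** Pattern supplied per row: compiled per evaluation, so each bad pattern fails separately. */
    static int failuresWithPerRowPattern(List<String> patterns, String text) {
        int failures = 0;
        for (String pattern : patterns) {
            try {
                Pattern.compile(pattern).matcher(text).find();
            } catch (PatternSyntaxException e) {
                failures++; // one failure per evaluation with a bad pattern
            }
        }
        return failures;
    }
}
```

With a constant pattern the failure count does not depend on the dataset size, whereas with per-row patterns it grows with the number of rows that carry a bad pattern, which matches the log flooding described in this issue.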