
REGEX Expression Evaluation errors can flood the logs

Open: rvesse opened this issue 1 year ago • 1 comment

Version

4.x

What happened?

User report on the mailing list - https://lists.apache.org/thread/4kfpx9wpm38p3kbdh78dmgpoz2ckm4s1

A query with a faulty REGEX on a large dataset resulted in massive log output: every attempt to evaluate the expression produces an ExprEvalException, and QueryIterFilterExpr logs each one.

The workaround suggested on the list is to disable the offending logger. However, since this logger is enabled by default, the behaviour is effectively a potential DoS vector against Jena-based systems. Logging could be made more intelligent in several ways:

  • Suppressing duplicate messages
  • Not logging specific classes of expression evaluation failures
  • Logging at a lower level that would not be output by default
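As a rough illustration of the first option, here is a minimal sketch of duplicate suppression. `DedupLogger` and its methods are hypothetical names, not part of Jena or any logging framework; the sketch only assumes that repeated failures produce identical message strings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not a Jena class): forwards each distinct message
 * to the sink once, and only counts the repeats.
 */
final class DedupLogger {
    // Message text -> number of times it has been reported.
    private final Map<String, AtomicLong> seen = new ConcurrentHashMap<>();
    // Where first occurrences are sent, e.g. a real logger's warn method.
    private final Consumer<String> sink;

    DedupLogger(Consumer<String> sink) {
        this.sink = sink;
    }

    /** Returns true if the message was forwarded, false if it was suppressed. */
    boolean warn(String message) {
        AtomicLong count = seen.computeIfAbsent(message, k -> new AtomicLong());
        if (count.incrementAndGet() == 1) {
            sink.accept(message);
            return true;
        }
        return false;
    }

    /** Number of times the message has been reported, including suppressed repeats. */
    long occurrences(String message) {
        AtomicLong count = seen.get(message);
        return count == null ? 0 : count.get();
    }
}
```

Something like `new DedupLogger(System.err::println)` would print a repeated failure once and keep a count of the rest. The map here is unbounded, so a real implementation would probably need a size limit or a per-query scope.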

Relevant output and stacktrace

See mailing list thread - https://lists.apache.org/thread/91qvxshm4njnd657g966yrbq8kmsy9ok

Are you interested in making a pull request?

Yes

rvesse, Mar 29 '23 12:03

See mailing list. User query is suspect.

This is because the regex is held in a variable (and is wrong) and may change on every evaluation, so some can be right and some can be wrong.

Usually, regex patterns are constants and compiled once. And so one log message per evaluation.
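To illustrate the difference between the two cases, here is a small sketch using plain `java.util.regex`. It does not reproduce Jena's internals; `RegexFailureCount` and its methods are hypothetical names, and the row values are made up.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical illustration, not Jena code. */
final class RegexFailureCount {

    /**
     * Pattern supplied per row (the "regex in a variable" case): each row
     * compiles its own pattern, so every bad pattern fails separately.
     */
    static int failuresWithPerRowPattern(List<String> patterns) {
        int failures = 0;
        for (String p : patterns) {
            try {
                Pattern.compile(p);
            } catch (PatternSyntaxException e) {
                failures++; // one failure per bad row
            }
        }
        return failures;
    }

    /**
     * Constant pattern: compiled once before the rows are processed, so a
     * bad pattern fails once no matter how many rows there are.
     */
    static int failuresWithConstantPattern(String pattern, int rows) {
        try {
            Pattern compiled = Pattern.compile(pattern);
            for (int i = 0; i < rows; i++) {
                compiled.matcher("row" + i).find();
            }
            return 0;
        } catch (PatternSyntaxException e) {
            return 1; // single failure at compile time
        }
    }
}
```

With a per-row pattern, the number of failures grows with the number of bad rows, which is the situation that flooded the logs in the user report.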

We could consider not outputting the stacktrace, but we should be careful about changing away from ExprException because of the more important static case.

afs, Mar 29 '23 12:03