couchbase-lite-core CBL-5522: Port - N1QL Parser has exponential slowdown for redundant parentheses

Mar 18 '24 14:03 callumbirks

Context of why the change fixes something?

Mar 21 '24 01:03 borrrden

@borrrden For expressions with many redundant layers of parentheses (i.e. (((expr)))) N1QL parser performance decreases exponentially, the more levels of parentheses. An expression that takes 60ms with 1 pair of parentheses, takes several minutes with 5. The reason is because the parser operates recursively, and previously the only expression which captured parentheses is at the very end of that recursive chain, at _baseExpr. So for every level of parentheses, the parser would go through expr9, 8, 7, etc until reaching _baseExpr, strip one level of parentheses, then start at the top again.

This change strips all but 1 pair of parentheses from each expression, at the very top-level. So all unnecessary recursion due to parentheses is avoided.

Mar 22 '24 10:03 callumbirks

Have you compared with this test, "N1QL Performance"/"N1QL benchmark"?

Mar 25 '24 17:03 jianminzhao

Have you compared with this test, "N1QL Performance"/"N1QL benchmark"?

@jianminzhao No? I don't believe that's relevant? For the cases of redundant parentheses, I saw an expression like SELECT * FROM _ WHERE ((((((type == '7')))))) reduced from several minutes to milliseconds. FYI this fix is already in 3.1.7 also, this is just a port.

Mar 27 '24 11:03 callumbirks

@jianminzhao See previous PR https://github.com/couchbase/couchbase-lite-core/pull/1983 Jens was also happy with this change.

Mar 27 '24 11:03 callumbirks

Code Coverage Results:

Type	Percentage
branches	69.85
functions	80.42
instantiations	36.84
lines	80.37
regions	76.97

Mar 27 '24 12:03 cbl-bot

@jianminzhao I ran the test out of curiosity. This issue doesn't really show with the small amount (4) nested parentheses in the test. I changed it to 6 nested, I get this in master:

Elapsed time/check time = 2.52361/0.5

And in this branch:

Elapsed time/check time = 0.000412333/0.5

Mar 27 '24 12:03 callumbirks

With 4 nested, in master:

Elapsed time/check time = 0.026721/0.5

In this branch:

Elapsed time/check time = 0.000403917/0.5

Mar 27 '24 12:03 callumbirks

The purpose of that test is to catch regression in performance. PEG has quirky thing that some rules may affect performance gravely. The degradation can show up in as simple as several level of parenthesis. How does your new rule affect that test? Will it be run the test just as fast no matter how many levels of parenthesis?

Mar 27 '24 16:03 jianminzhao

@jianminzhao Exactly yes. Because it removes all the parentheses at the top level, without recursion, the performance impact of the number of parentheses is now negligible.

Mar 27 '24 17:03 callumbirks

That is not the purpose of that test. The purpose of the test is to discover performance degradation of other grammar rules. Now, this test becomes meaningless.

Mar 28 '24 18:03 jianminzhao

The problem is solved though, so what is the issue? If you want to write more tests to try other potentially performance impacting queries that's fine.

Mar 29 '24 02:03 borrrden

The purpose of this commit is to improve the performance by removing unnecessary parenthesis. The purpose of this particular test is to magnify the performance degradation of additional grammar rule. In short, this test can be removed with this commit.

Mar 29 '24 22:03 jianminzhao

couchbase-lite-core couchbase-lite-core copied to clipboard

CBL-5522: Port - N1QL Parser has exponential slowdown for redundant parentheses

couchbase-lite-core
couchbase-lite-core copied to clipboard