[CALCITE-6731] Support bitwise XOR (^) operator in SQL
Adds support for the bitwise XOR (^) operator in SQL expressions.
This includes:
- Introducing BITXOR_OPERATOR to support ^ as a symbolic alias for the existing BITXOR() function
- Extending the parser to recognize the ^ symbol
- Registering the operator in SqlStdOperatorTable
- Adding unit tests to verify parsing and evaluation
This addresses CALCITE-6731.
Note: Calcite uses ^ as a test caret marker to indicate parser positions in SQL test strings.
To avoid conflicts, a whitelist-based workaround is applied in test cases to bypass caret parsing logic when ^ is intended as an operator.
@Dwrite please ensure that the commit message in the PR is consistent with the Jira summary. If the PR also implements "<<", please add a matching description in Jira.
@NobiGo Hi, thanks for your time. I have already added the relevant descriptions to Jira. I have currently implemented support for the ^ (bitwise XOR) and << (bitwise left shift) operators. However, because the testing framework uses {^} as a parser position marker, I have only added a parser test for it, and I still need to find another way to test it. Do you have any suggestions?
I haven't carefully read the implementation logic of ^ in the test cases. I wonder if there are escape characters to distinguish it?
The code that parses the string and its carets is hand-written; the right solution is indeed to adapt it to treat a double caret as an escaped caret (or to use some other similar escaping mechanism).
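To make the escaping idea concrete, here is a minimal sketch of a hand-written scanner in which a lone `^` is a position marker and `^^` stands for one literal caret. The class name `CaretEscapeSketch`, the `Parsed` holder, and the `parse` method are hypothetical names chosen for illustration; this is not Calcite's actual test-framework code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the code used by Calcite's test framework. */
final class CaretEscapeSketch {

  /** SQL text with markers removed, plus the offsets where markers stood. */
  static final class Parsed {
    final String sql;
    final List<Integer> markers;

    Parsed(String sql, List<Integer> markers) {
      this.sql = sql;
      this.markers = markers;
    }
  }

  /** Treats "^^" as an escaped literal caret and a lone "^" as a marker. */
  static Parsed parse(String input) {
    StringBuilder sql = new StringBuilder();
    List<Integer> markers = new ArrayList<>();
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c != '^') {
        sql.append(c);
      } else if (i + 1 < input.length() && input.charAt(i + 1) == '^') {
        sql.append('^'); // escaped caret: emit one literal caret
        i++;             // skip the second caret of the pair
      } else {
        markers.add(sql.length()); // lone caret: record position, emit nothing
      }
    }
    return new Parsed(sql.toString(), markers);
  }

  public static void main(String[] args) {
    Parsed xor = parse("values (2 ^^ 3)");
    System.out.println(xor.sql + " " + xor.markers);

    Parsed error = parse("select ^foo^ from t");
    System.out.println(error.sql + " " + error.markers);
  }
}
```

With this scheme, a test author writes `^^` wherever the SQL itself needs a caret, and the scanner never has to guess whether a single caret is an operator or a marker.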
@NobiGo Is it OK to add some ^ test cases using a whitelist?
What is the status of this PR?
@mihaibudiu Thanks for checking in! The current status is that the PR adds support for the ^, <<, and & operators, as well as the RightShift function. I've added complete test cases for all of them except for ^, which is currently tested using a whitelist-based approach, since writing exhaustive tests for it is relatively complex.
Could you help review whether this approach for ^ is acceptable?
The PR title does not match the issue; please edit the issue. In general, I think it's preferable to solve one problem in one PR, so probably this one should be 3 separate PRs (and 3 issues). But I will review it as it is.
@mihaibudiu
Thanks for the suggestion. I've split this PR as requested.
- This PR now only supports ^ as per CALCITE-6731
Could you also add a few tests in a quidem file (suffix .iq)? The reason I am asking this is because all the tests you wrote involve only constants, so they will be optimized by the compiler. For quidem tests that read from a table the evaluation path is slightly different.
Which file are you referring to? I could not find any file with the .iq suffix.
$ find . -name '*.iq'
./core/src/test/resources/sql/measure-paper.iq
./core/src/test/resources/sql/unnest.iq
./core/src/test/resources/sql/sequence.iq
./core/src/test/resources/sql/planner.iq
./core/src/test/resources/sql/unsigned.iq
./core/src/test/resources/sql/winagg.iq
./core/src/test/resources/sql/agg.iq
...
Thank you for the feedback and for reviewing the PR!
I've added the corresponding .iq test case to demonstrate the issue and verify the fix.
To clarify: the reason idempotency is broken is not due to ^^ sequences — those are already correctly handled by the parser. The issue lies in this specific logic:
else if (secondCaret < 0) {
String sqlSansCaret =
sql.substring(0, firstCaret)
+ sql.substring(firstCaret + 1);
...
}
This block assumes that a single caret ^ always indicates an error marker, and thus removes it unconditionally. However, in valid SQL expressions such as SELECT 2 ^ 3, the caret is a legitimate binary operator (bitwise XOR). As a result, when running QuidemTest, the caret is stripped out, breaking the SQL statement and causing the test to fail on subsequent runs — violating idempotency.
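The effect of that branch can be reproduced in isolation. The sketch below assumes only the behavior shown in the snippet above (one caret found, no second caret, so the caret is cut out); the class and method names are hypothetical and the two-caret case is deliberately left out.

```java
/** Hypothetical sketch of the single-caret branch; not Calcite's actual code. */
final class SingleCaretSketch {

  /** Removes the caret when the string contains exactly one. */
  static String stripSingleCaret(String sql) {
    int firstCaret = sql.indexOf('^');
    if (firstCaret < 0) {
      return sql; // no caret at all: nothing to do
    }
    int secondCaret = sql.indexOf('^', firstCaret + 1);
    if (secondCaret < 0) {
      // Same concatenation as in the snippet above: drop the lone caret.
      return sql.substring(0, firstCaret) + sql.substring(firstCaret + 1);
    }
    return sql; // two or more carets: out of scope for this sketch
  }

  public static void main(String[] args) {
    // The XOR operator is removed, leaving "SELECT 2  3" (two spaces).
    System.out.println(stripSingleCaret("SELECT 2 ^ 3"));
  }
}
```

The resulting text is no longer a valid expression, which matches the failure described for statements that use `^` as an operator.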
The proposed fix adds a minimal check to distinguish between ^ used as part of an error marker and ^ used as a valid SQL character. This ensures tests like SELECT 2 ^ 3 remain valid and idempotent across runs.
While escaping all carets manually (using ^^) would work, it introduces unnecessary complexity and reduces readability — especially for contributors unfamiliar with this nuance. This fix aims for a cleaner and more robust solution that maintains natural SQL syntax in tests.
Thanks again for your time and review — much appreciated!
I think that doubling a caret is something much simpler for contributors to do to write tests than to understand the rules about when a single caret works and when a double one is required.
I agree that doubling carets is simple in principle — and I’ve tried that approach. Unfortunately, it still breaks idempotency in some cases, especially when Quidem interprets and rewrites the input/output. That’s why I went for a solution that preserves the original SQL and ensures the tests remain stable across runs.
I don't understand what Quidem has to do with carets. Maybe I am missing something; can you explain what the problem is with an example? I still don't know what is the idempotency you are mentioning. What is the operation that has to be applied twice and produce the same results?
You're right — Quidem has nothing to do with carets. The issue I mentioned is unrelated to Quidem itself. When using ^^ (double caret), I've seen that repeated evaluation or type checking (e.g. inside f.checkScalar) can produce inconsistent results. This breaks idempotency in tests, since the same expression might yield different results across passes. That's why I avoided using ^^ even though it works syntactically.
Hi mihaibudiu, all checks have passed. May I ask if there's anything missing or any concerns that are blocking approval at this point? I'd be happy to address them. Thanks!
Can you show me a test that has problems with double carets? It would be good to understand what is going on.
java.lang.RuntimeException: Error while parsing query: values (2 ^^ 3)
at org.apache.calcite.sql.test.AbstractSqlTester.parseAndValidate(AbstractSqlTester.java:158)
at org.apache.calcite.sql.test.AbstractSqlTester.check(AbstractSqlTester.java:240)
at org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:158)
at org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkScalar$2(SqlOperatorFixtureImpl.java:240)
at org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(AbstractSqlTester.java:454)
at org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(SqlOperatorFixtureImpl.java:239)
at org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(SqlOperatorFixture.java:232)
at org.apache.calcite.test.SqlOperatorTest.testBitXorOperatorScalarFunc(SqlOperatorTest.java:16468)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
...
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "^ ^" at line 1, column 11.
Hi,
Just to clarify, Quidem replaces ^^ with a single caret ^ before executing the tests. So in test inputs, we need to write ^^ to represent an actual caret.
The issue is that this replacement happens multiple times during test processing, which means the caret count reduces each time the test runs. This breaks idempotency, because the SQL changes after the first run, causing failures or unexpected behavior on subsequent runs.
That’s why relying solely on doubling carets (^^) isn’t an ideal solution. We need a way to handle single carets properly to ensure consistent behavior across multiple executions.
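The shrinking caret count can be illustrated with a small, self-contained sketch. It assumes a single processing pass that turns `^^` into `^` and drops a lone `^` as a marker, then applies that pass twice to the same string; the class and method names are hypothetical, and where exactly the repeated pass happens in the real framework is not shown here.

```java
/** Hypothetical sketch of a caret-processing pass applied more than once. */
final class RepeatedUnescapeSketch {

  /** One pass: "^^" becomes "^", and a lone "^" is dropped as a marker. */
  static String process(String s) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c != '^') {
        out.append(c);
      } else if (i + 1 < s.length() && s.charAt(i + 1) == '^') {
        out.append('^'); // escaped pair collapses to one caret
        i++;
      }
      // A lone caret falls through and is not copied.
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String once = process("values (2 ^^ 3)");
    String twice = process(once);
    System.out.println(once);  // values (2 ^ 3)
    System.out.println(twice); // values (2  3)
  }
}
```

Because the pass is not idempotent, the escaped form is only safe if every code path unescapes the string exactly once.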
I think this signals a bug in the testing framework, I will take a look at this.
The problem is the following: the testing framework uses a "consumer" to check the results, with the same API used for negative and positive test cases. There is an implicit assumption that only negative tests may have carets in the SQL strings. So when a positive test is executed, the code paths for handling escaped carets are never executed. I will see whether I can fix this.
Got it, thanks again for the explanation! Just to confirm — would you suggest I hold off on this PR until the testing framework is updated, or is there anything you'd prefer me to adjust in the meantime?
Let’s wait a bit. Thank you
Sure, thanks! Just to mention — I’ve added the previously added << operator in a separate PR: https://github.com/apache/calcite/pull/4478.
I have pushed an extra commit which should fix the problems of the test framework with carets. It would be nice if someone other than me could review that. It was a bit messy to clean up.
(I have also removed your custom caret parsing function)
Sure. I just asked [xuzifu666](https://github.com/xuzifu666) to review.
I think these changes are fine.
Please squash the commits so we can merge. If it's not too much work, you can leave my changes in a separate commit - but that may be difficult.
Thanks for the review! I've squashed my commits and kept your fixes in a separate commit as suggested.
I think there is a linter rule which will prohibit you from having a commit message starting with "Fix". You can try to run this locally with ./gradlew build.
If you change the commits, maybe you can combine the changes from the first two commits into a single one and use that message.
Sure. Just squash the commits to one commit.