calcite [CALCITE-6731] Support bitwise XOR (^) operator in SQL

Adds support for the bitwise XOR (^) operator in SQL expressions.

This includes:

- Introducing BITXOR_OPERATOR to support ^ as a symbolic alias for the existing BITXOR() function
- Extending the parser to recognize the ^ symbol
- Registering the operator in SqlStdOperatorTable
- Adding unit tests to verify parsing and evaluation
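
For reference, a minimal sketch of the arithmetic the operator is expected to perform, assuming the SQL ^ follows the same integer semantics as Java's ^ (which is how BITXOR is described here). The class and method names are hypothetical, and Calcite's type coercion and NULL handling are not modeled.

```java
// Sketch only: illustrates bitwise XOR on ints, not Calcite's implementation.
final class BitXorSketch {
  /** A result bit is set when exactly one of the two input bits is set. */
  static int bitXor(int a, int b) {
    return a ^ b;
  }

  public static void main(String[] args) {
    // 2 = 0b10 and 3 = 0b11, so the result is 0b01.
    System.out.println(bitXor(2, 3)); // prints 1
    // 6 = 0b110 and 3 = 0b011, so the result is 0b101.
    System.out.println(bitXor(6, 3)); // prints 5
  }
}
```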

Note: Calcite uses ^ as a test caret marker to indicate parser positions in SQL test strings.
To avoid conflicts, a whitelist-based workaround is applied in test cases to bypass caret parsing logic when ^ is intended as an operator.

Apr 27 '25 09:04 Dwrite

@Dwrite please ensure that the Commit info in the PR is consistent with the Jira summary. If the PR has implemented "<<", please add relevant descriptions in Jira.

Apr 28 '25 02:04 NobiGo

@NobiGo Hi, thanks for your time. I have added the relevant descriptions to Jira. I have implemented support for the ^ (bitwise XOR) and << (bitwise left shift) operators. However, because the testing framework uses ^ as a parser position marker, I have only added a parser test for ^ so far, and I need to find another way to test it. Do you have any suggestions?

Apr 28 '25 15:04 Dwrite

I haven't carefully read the implementation logic of ^ in the test cases. I wonder if there are escape characters to distinguish it?

May 10 '25 00:05 NobiGo

The code that parses the string and carets is hand-written. The right solution is indeed to adapt it to treat a double caret as an escaped caret (or to use some similar escaping mechanism).
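
A minimal sketch of what such an escaping rule could look like, assuming a left-to-right scan in which ^^ yields one literal caret and a lone ^ is recorded as a position marker. This is not the Calcite test framework's code; `CaretEscapeSketch`, `Parsed`, and `parse` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical caret scanner with "^^" as an escaped caret.
final class CaretEscapeSketch {
  /** SQL with markers removed, plus the offsets (in that SQL) where markers stood. */
  static final class Parsed {
    final String sql;
    final List<Integer> markers;

    Parsed(String sql, List<Integer> markers) {
      this.sql = sql;
      this.markers = markers;
    }
  }

  static Parsed parse(String input) {
    StringBuilder out = new StringBuilder();
    List<Integer> markers = new ArrayList<>();
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c != '^') {
        out.append(c);
      } else if (i + 1 < input.length() && input.charAt(i + 1) == '^') {
        out.append('^'); // escaped caret: keep one literal '^'
        i++;             // skip the second caret of the pair
      } else {
        markers.add(out.length()); // lone caret: position marker, not SQL text
      }
    }
    return new Parsed(out.toString(), markers);
  }

  public static void main(String[] args) {
    Parsed positive = parse("values (2 ^^ 3)");
    System.out.println(positive.sql);     // prints values (2 ^ 3)
    Parsed negative = parse("select ^foo^ from t");
    System.out.println(negative.sql);     // prints select foo from t
    System.out.println(negative.markers); // prints [7, 10]
  }
}
```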

May 10 '25 00:05 mihaibudiu

@NobiGo Is it OK to add some ^ test cases using a whitelist?

May 31 '25 09:05 Dwrite

What is the status of this PR?

Jun 16 '25 17:06 mihaibudiu

@mihaibudiu Thanks for checking in! The current status is that the PR adds support for the ^, <<, and & operators, as well as the RightShift function. I've added complete test cases for all of them except for ^, which is currently tested using a whitelist-based approach, since writing exhaustive tests for it is relatively complex.

Could you help review whether this approach for ^ is acceptable?

Jun 17 '25 14:06 Dwrite

The PR title does not match the issue; please edit the issue. In general, I think it's preferable to solve one problem in one PR, so probably this one should be 3 separate PRs (and 3 issues). But I will review it as it is.

Jun 18 '25 01:06 mihaibudiu

@mihaibudiu
Thanks for the suggestion. I've split this PR as requested.

This PR now only adds support for ^, as per CALCITE-6731.

Jun 23 '25 08:06 Dwrite

Could you also add a few tests in a quidem file (suffix .iq)? The reason I am asking this is because all the tests you wrote involve only constants, so they will be optimized by the compiler. For quidem tests that read from a table the evaluation path is slightly different.

Which file are you referring to? I could not find a file with the .iq suffix.

Jul 21 '25 12:07 Dwrite

$ find . -name '*.iq'
./core/src/test/resources/sql/measure-paper.iq
./core/src/test/resources/sql/unnest.iq
./core/src/test/resources/sql/sequence.iq
./core/src/test/resources/sql/planner.iq
./core/src/test/resources/sql/unsigned.iq
./core/src/test/resources/sql/winagg.iq
./core/src/test/resources/sql/agg.iq
...

Jul 21 '25 16:07 mihaibudiu

Thank you for the feedback and for reviewing the PR!

I've added the corresponding .iq test case to demonstrate the issue and verify the fix.

To clarify: the reason idempotency is broken is not due to ^^ sequences — those are already correctly handled by the parser. The issue lies in this specific logic:

else if (secondCaret < 0) {
  String sqlSansCaret =
      sql.substring(0, firstCaret)
          + sql.substring(firstCaret + 1);
  ...
}

This block assumes that a single caret ^ always indicates an error marker, and thus removes it unconditionally. However, in valid SQL expressions such as SELECT 2 ^ 3, the caret is a legitimate binary operator (bitwise XOR). As a result, when running QuidemTest, the caret is stripped out, breaking the SQL statement and causing the test to fail on subsequent runs — violating idempotency.
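
To make the effect concrete, here is a simplified reconstruction based only on the excerpt above. The method name `stripSingleCaret` is hypothetical, and the elided parts of the original block, including the two-caret case, are not modeled.

```java
// Sketch only: mirrors the single-caret branch quoted above, nothing more.
final class SingleCaretStripSketch {
  static String stripSingleCaret(String sql) {
    int firstCaret = sql.indexOf('^');
    if (firstCaret < 0) {
      return sql; // no caret at all: nothing to strip
    }
    int secondCaret = sql.indexOf('^', firstCaret + 1);
    if (secondCaret < 0) {
      // Exactly one caret: it is assumed to be a marker and removed.
      return sql.substring(0, firstCaret) + sql.substring(firstCaret + 1);
    }
    return sql; // two-caret handling is elided in the excerpt, so left untouched here
  }

  public static void main(String[] args) {
    // The XOR operator is removed, leaving SQL that no longer parses.
    System.out.println(stripSingleCaret("SELECT 2 ^ 3")); // prints SELECT 2  3
  }
}
```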

The proposed fix adds a minimal check to distinguish between ^ used as part of an error marker and ^ used as a valid SQL character. This ensures tests like SELECT 2 ^ 3 remain valid and idempotent across runs.

While escaping all carets manually (using ^^) would work, it introduces unnecessary complexity and reduces readability — especially for contributors unfamiliar with this nuance. This fix aims for a cleaner and more robust solution that maintains natural SQL syntax in tests.

Thanks again for your time and review — much appreciated!

Jul 22 '25 04:07 Dwrite

I think doubling a caret is much simpler for contributors writing tests than having to understand the rules about when a single caret works and when a double one is required.

Jul 22 '25 22:07 mihaibudiu

I agree that doubling carets is simple in principle — and I’ve tried that approach. Unfortunately, it still breaks idempotency in some cases, especially when Quidem interprets and rewrites the input/output. That’s why I went for a solution that preserves the original SQL and ensures the tests remain stable across runs.

Jul 23 '25 08:07 Dwrite

I don't understand what Quidem has to do with carets. Maybe I am missing something; can you explain the problem with an example? I still don't understand what idempotency you are referring to. What is the operation that has to be applied twice and produce the same results?

Jul 23 '25 16:07 mihaibudiu

You're right — Quidem has nothing to do with carets. The issue I mentioned is unrelated to Quidem itself. When using ^^ (double caret), I've seen that repeated evaluation or type checking (e.g. inside f.checkScalar) can produce inconsistent results. This breaks idempotency in tests, since the same expression might yield different results across passes. That's why I avoided using ^^ even though it works syntactically.

Jul 24 '25 02:07 Dwrite

Hi mihaibudiu, all checks have passed. May I ask if there's anything missing or any concerns that are blocking approval at this point? I'd be happy to address them. Thanks!

Jul 24 '25 02:07 Dwrite

Can you show me a test that has problems with double carets? It would be good to understand what is going on.

Jul 24 '25 03:07 mihaibudiu

java.lang.RuntimeException: Error while parsing query: values (2 ^^ 3)

at org.apache.calcite.sql.test.AbstractSqlTester.parseAndValidate(AbstractSqlTester.java:158)
at org.apache.calcite.sql.test.AbstractSqlTester.check(AbstractSqlTester.java:240)
at org.apache.calcite.sql.test.SqlTester.check(SqlTester.java:158)
at org.apache.calcite.test.SqlOperatorFixtureImpl.lambda$checkScalar$2(SqlOperatorFixtureImpl.java:240)
at org.apache.calcite.sql.test.AbstractSqlTester.forEachQuery(AbstractSqlTester.java:454)
at org.apache.calcite.test.SqlOperatorFixtureImpl.checkScalar(SqlOperatorFixtureImpl.java:239)
at org.apache.calcite.sql.test.SqlOperatorFixture.checkScalar(SqlOperatorFixture.java:232)
at org.apache.calcite.test.SqlOperatorTest.testBitXorOperatorScalarFunc(SqlOperatorTest.java:16468)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at

.... Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "^ ^" at line 1, column 11.

Hi,

Just to clarify, Quidem replaces ^^ with a single caret ^ before executing the tests. So in test inputs, we need to write ^^ to represent an actual caret.

The issue is that this replacement happens multiple times during test processing, which means the caret count reduces each time the test runs. This breaks idempotency, because the SQL changes after the first run, causing failures or unexpected behavior on subsequent runs.

That’s why relying solely on doubling carets (^^) isn’t an ideal solution. We need a way to handle single carets properly to ensure consistent behavior across multiple executions.
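
The concern as described can be illustrated with a small sketch. It assumes a single caret-processing pass that unescapes ^^ and drops lone carets, and then applies that pass twice. `caretPass` is a hypothetical helper; whether the framework really applies such a pass more than once is the open question in this thread, so this shows the claimed failure mode rather than confirmed behavior.

```java
// Sketch only: shows why a caret pass of this shape is not idempotent.
final class RepeatedCaretPassSketch {
  /** One pass: "^^" becomes a literal "^", any remaining lone "^" is dropped as a marker. */
  static String caretPass(String sql) {
    final String placeholder = "\0"; // assumed never to occur in the SQL text
    return sql.replace("^^", placeholder)
        .replace("^", "")
        .replace(placeholder, "^");
  }

  public static void main(String[] args) {
    String once = caretPass("values (2 ^^ 3)");
    String twice = caretPass(once);
    System.out.println(once);  // prints values (2 ^ 3)
    System.out.println(twice); // prints values (2  3)
  }
}
```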

Jul 24 '25 03:07 Dwrite

I think this signals a bug in the testing framework; I will take a look at this.

Jul 24 '25 23:07 mihaibudiu

The problem is the following: the testing framework uses a "consumer" to check the results, with the same API used for negative and positive test cases. There is an implicit assumption that only negative tests may have carets in the SQL strings. So when a positive test is executed, the code paths for handling escaped carets are never executed. I will see whether I can fix this.
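
A rough sketch of that diagnosis, under the assumption that caret handling is tied to the negative-test path. All names here (`sqlSeenByParser`, `withCaretHandling`, the `expectError` flag) are hypothetical and do not correspond to the real framework API.

```java
// Sketch only: models "escaped carets are handled only on the negative path".
final class SharedCheckSketch {
  /** Caret handling: "^^" becomes a literal "^", lone "^" markers are dropped. */
  static String withCaretHandling(String sql) {
    final String placeholder = "\0"; // assumed never to occur in the SQL text
    return sql.replace("^^", placeholder)
        .replace("^", "")
        .replace(placeholder, "^");
  }

  /** Returns the text that would reach the SQL parser for a given kind of test. */
  static String sqlSeenByParser(String sql, boolean expectError) {
    // Implicit assumption being modeled: only negative tests contain carets.
    return expectError ? withCaretHandling(sql) : sql;
  }

  public static void main(String[] args) {
    // Positive test: the doubled caret reaches the parser unchanged, which is
    // consistent with the 'Encountered "^ ^"' parse error reported above.
    System.out.println(sqlSeenByParser("values (2 ^^ 3)", false)); // prints values (2 ^^ 3)
    // Negative test: carets are processed before parsing.
    System.out.println(sqlSeenByParser("values (2 ^^ 3)", true));  // prints values (2 ^ 3)
  }
}
```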

Jul 25 '25 00:07 mihaibudiu

Got it, thanks again for the explanation! Just to confirm — would you suggest I hold off on this PR until the testing framework is updated, or is there anything you'd prefer me to adjust in the meantime?

Jul 25 '25 03:07 Dwrite

Let’s wait a bit. Thank you

Jul 25 '25 05:07 mihaibudiu

Sure, thanks! Just to mention: I've moved the << operator, which was previously part of this PR, into a separate PR: https://github.com/apache/calcite/pull/4478.

Jul 25 '25 10:07 Dwrite

I have pushed an extra commit which should fix the problems of the test framework with carets. It would be nice if someone other than me could review that. It was a bit messy to clean up.

(I have also removed your custom caret parsing function)

Sure. I just asked [xuzifu666](https://github.com/xuzifu666) to review.

Jul 27 '25 04:07 Dwrite

I think these changes are fine.

Jul 28 '25 02:07 xuzifu666

Please squash the commits so we can merge. If it's not too much work, you can leave my changes in a separate commit - but that may be difficult.

Thanks for the review! I've squashed my commits and kept your fixes in a separate commit as suggested.

Jul 30 '25 03:07 Dwrite

I think there is a linter rule which will prohibit you from having a commit message starting with "Fix". You can try to run this locally with ./gradlew build.

Jul 30 '25 16:07 mihaibudiu

If you change the commits, maybe you can combine the changes from the first two commits into a single one and use that message.

Jul 30 '25 16:07 mihaibudiu

Sure. I just squashed the commits into a single commit.

Jul 31 '25 03:07 Dwrite