datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

feat: Support Binary bitwise shift operators (<< and >>)

Open ovr opened this issue 3 years ago • 8 comments

Which issue does this PR close?

This PR introduces support for binary bitwise shift operators like >> and <<.

Are there any user-facing changes?

It's new functionality, and there are no breaking changes.

Thanks!

ovr avatar Aug 04 '22 17:08 ovr

Right now, DF uses Generic Dialect for sqlparser, but it's allowed only for PostgreSqlDialect

Token::ShiftLeft if dialect_of!(self is PostgreSqlDialect) => {
    Some(BinaryOperator::PGBitwiseShiftLeft)
}
Token::ShiftRight if dialect_of!(self is PostgreSqlDialect) => {
    Some(BinaryOperator::PGBitwiseShiftRight)
}

@alamb what should I do with it? Should I Introduce new methods for Dialect to allow these operators with Generic dialect?

Thanks

ovr avatar Aug 04 '22 18:08 ovr

@alamb what should I do with it? Should I Introduce new methods for Dialect to allow these operators with Generic dialect?

What about allowing BinaryOperator::PGBitwiseShiftRight in GenericDialect ? I think the GenericDialect in sqlparser is sort of a "anything goes" dialect so it should be fine to extend it to allow these operators.

I can make another sql-parser release soon if you need one as well

alamb avatar Aug 04 '22 18:08 alamb

@alamb what should I do with it? Should I Introduce new methods for Dialect to allow these operators with Generic dialect?

What about allowing BinaryOperator::PGBitwiseShiftRight in GenericDialect ? I think the GenericDialect in sqlparser is sort of a "anything goes" dialect so it should be fine to extend it to allow these operators.

I can make another sql-parser release soon if you need one as well

LGTM, if we can merge the sql-rs first

liukun4515 avatar Aug 05 '22 09:08 liukun4515

Looks likes I forget about overflowing here. I think I can use overflowing_shl to handle it correctly to the Error https://doc.rust-lang.org/std/primitive.i64.html#method.overflowing_shl

❯ select 2 << 1024;
thread 'main' panicked at 'attempt to shift left with overflow', datafusion/physical-expr/src/expressions/binary.rs:524:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

ovr avatar Aug 05 '22 14:08 ovr

Panic on overflow was fixed in 474ed5a

Without error to be similar with PostgreSQL

image

ovr avatar Aug 05 '22 15:08 ovr

https://crates.io/crates/sqlparser/0.20.0 is released, FYI

alamb avatar Aug 06 '22 14:08 alamb

Awaiting https://github.com/apache/arrow-datafusion/pull/3072

ovr avatar Aug 08 '22 11:08 ovr

Awaiting https://github.com/apache/arrow-datafusion/pull/3072

Is merged 🎉

alamb avatar Aug 08 '22 21:08 alamb

Codecov Report

Merging #3037 (810f4b0) into master (48f9b7a) will decrease coverage by 0.03%. The diff coverage is 63.15%.

@@            Coverage Diff             @@
##           master    #3037      +/-   ##
==========================================
- Coverage   85.95%   85.91%   -0.04%     
==========================================
  Files         291      291              
  Lines       52382    52475      +93     
==========================================
+ Hits        45025    45084      +59     
- Misses       7357     7391      +34     
Impacted Files Coverage Δ
...on/physical-expr/src/expressions/binary/kernels.rs 52.17% <44.44%> (-10.99%) :arrow_down:
datafusion/physical-expr/src/expressions/binary.rs 97.71% <93.10%> (-0.17%) :arrow_down:
datafusion/core/tests/sql/expr.rs 99.84% <100.00%> (+<0.01%) :arrow_up:
datafusion/expr/src/binary_rule.rs 84.61% <100.00%> (ø)
datafusion/expr/src/operator.rs 96.00% <100.00%> (+0.16%) :arrow_up:
datafusion/sql/src/planner.rs 81.96% <100.00%> (+0.01%) :arrow_up:
datafusion/core/src/physical_plan/metrics/value.rs 86.93% <0.00%> (-0.51%) :arrow_down:
datafusion/expr/src/logical_plan/plan.rs 77.60% <0.00%> (-0.18%) :arrow_down:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Aug 11 '22 21:08 codecov-commenter

Marked PR as ready for review ✅

ovr avatar Aug 12 '22 12:08 ovr

Can we update the user guide to show this new functionality? This can be a follow-on issue.

andygrove avatar Aug 15 '22 14:08 andygrove

@andygrove

Can we update the user guide to show this new functionality? This can be a follow-on issue.

https://github.com/apache/arrow-datafusion/blob/master/docs/source/user-guide/sql/sql_status.md#supported-sql

I didn't find any place where I should update the documentation, as I can see there is a small comment about binary expressions, but it doesn't include all binary expressions.

Many mathematical unary and binary expressions such as +, /, sqrt, tan, >=.

I updated the description on this PR to reference https://github.com/apache/arrow-datafusion/issues/1619

ovr avatar Aug 15 '22 14:08 ovr

Benchmark runs are scheduled for baseline = 5ddad472acbbe2073aa6ac901ee5115f5830cc0f and contender = ee55d89cbf20f4a6d17fe399c72d60dca6d67912. ee55d89cbf20f4a6d17fe399c72d60dca6d67912 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. Conbench compare runs links: [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2 [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q Buildkite builds: Supported benchmarks: ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True test-mac-arm: Supported benchmark langs: C++, Python, R ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

ursabot avatar Aug 15 '22 17:08 ursabot