arrow icon indicating copy to clipboard operation
arrow copied to clipboard

ARROW-17980: [C++] As-of-Join Substrait extension

Open rtpsw opened this issue 3 years ago • 7 comments
trafficstars

See https://issues.apache.org/jira/browse/ARROW-17980

rtpsw avatar Oct 12 '22 12:10 rtpsw

cc @icexelloss @westonpace

rtpsw avatar Oct 12 '22 12:10 rtpsw

Note that this PR includes code from https://github.com/apache/arrow/pull/14386 which is currently pending.

rtpsw avatar Oct 12 '22 13:10 rtpsw

@rtpsw Why do we need #14386 for this? Can we separate out these two changes?

icexelloss avatar Oct 12 '22 13:10 icexelloss

@westonpace The changes here depends on your local change in https://github.com/westonpace/arrow/tree/experiment/substrait-extension

How do you want to proceed here? We can either (1) merge changes in experiment/substrait-extension and rebase this PR (2) combine changes in experiment/substrait-extension and the asof join message together in this PR and review together

icexelloss avatar Oct 12 '22 13:10 icexelloss

@rtpsw Why do we need #14386 for this? Can we separate out these two changes?

Technically, we don't, or no longer do. I used #14386 during test cases development of this PR. If #14386 will be rejected, we could remove its code here too. Otherwise, the code can stay here and will merge cleanly with #14386.

rtpsw avatar Oct 12 '22 13:10 rtpsw

@westonpace The changes here depends on your local change in https://github.com/westonpace/arrow/tree/experiment/substrait-extension

The code here uses most of Weston's code, except for mostly that the "dummy" DelayRel was replaced with AsOfJoinRel (including corresponding test code) and that the code was refactored a bit. So, I think the code here should go in without a merge or a rebase.

rtpsw avatar Oct 12 '22 13:10 rtpsw

https://issues.apache.org/jira/browse/ARROW-17980

github-actions[bot] avatar Oct 12 '22 15:10 github-actions[bot]

@westonpace, any idea why some jobs, like this one, failed? It seems to be complaining about missing proto classes; I don't know why the proto classes would get built in some jobs and not in others.

rtpsw avatar Oct 13 '22 09:10 rtpsw

@rtpsw I took one around of review. At the high level I think this makes lot of sense. Left some comments for refinement.

icexelloss avatar Oct 13 '22 14:10 icexelloss

CI failures look unrelated the changes here. @icexelloss, this may be ready for an Arrow review, WDYT?

rtpsw avatar Oct 15 '22 08:10 rtpsw

@rtpsw Looks pretty good! There seems to be many style changes that seems orthogonal. Are those changes automatically made? Is it easy to separate out the style change from the actual change?

icexelloss avatar Oct 17 '22 14:10 icexelloss

@rtpsw Looks pretty good! There seems to be many style changes that seems orthogonal. Are those changes automatically made? Is it easy to separate out the style change from the actual change?

If we insist, we could take these style changes out to a separate PR, get that merged, and then come back here to rebase.

rtpsw avatar Oct 17 '22 15:10 rtpsw

@rtpsw Looks pretty good! There seems to be many style changes that seems orthogonal. Are those changes automatically made? Is it easy to separate out the style change from the actual change?

If we insist, we could take these style changes out to a separate PR, get that merged, and then come back here to rebase.

Anja suggested changing arrow::engine::substrait to arrow::engine::substrait_ext to avoid MSVC name conflicts with symbols in ::substrait. This should allow us to keep using substrait:: instead of ::substrait:: and avoid a large orthogonal change.

rtpsw avatar Oct 19 '22 10:10 rtpsw

@rtpsw Looks pretty good! There seems to be many style changes that seems orthogonal. Are those changes automatically made? Is it easy to separate out the style change from the actual change?

If we insist, we could take these style changes out to a separate PR, get that merged, and then come back here to rebase.

Anja suggested changing arrow::engine::substrait to arrow::engine::substrait_ext to avoid MSVC name conflicts with symbols in ::substrait. This should allow us to keep using substrait:: instead of ::substrait:: and avoid a large orthogonal change.

Does arrow::engine::substrait::extension work?

icexelloss avatar Oct 19 '22 12:10 icexelloss

Does arrow::engine::substrait::extension work?

My guess is this would confuse MSVC just the same because when it sees a substrait::* symbol it would look under arrow::engine::substrait and won't fall back to ::substrait.

rtpsw avatar Oct 19 '22 12:10 rtpsw

@rtpsw Looks pretty good! There seems to be many style changes that seems orthogonal. Are those changes automatically made? Is it easy to separate out the style change from the actual change?

@icexelloss, I managed to minimize the orthogonal change (namespace substrait = ::substrait added to a few files). I think this should be enough to make reviewing convenient.

rtpsw avatar Oct 23 '22 10:10 rtpsw

The commit history got messed up. I'll try to open a fresh PR.

rtpsw avatar Oct 24 '22 10:10 rtpsw

@rtpsw You don't need to open a new PR - you can just force push to your branch ARROW-17980 and fix the commit history

icexelloss avatar Oct 24 '22 12:10 icexelloss