RFC: [C++][Java][FlightRPC] Substrait, transaction, cancellation for Flight SQL
Thanks for opening a pull request!
If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW
Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.
Then could you also rename pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
or
MINOR: [${COMPONENT}] ${SUMMARY}
See also:
Generally seems OK to me.
I've been pushing various tweaks to the spec while implementing it in C++.
One thing I will need to change: transaction IDs should be supplied when creating a prepared statement, not when executing them (since generally existing APIs associate the statement with a particular connection).
There's now implementations in C++ and Java.
TODOs:
- Integration tests in C++
- https://issues.apache.org/jira/browse/ARROW-17254
- Ensure integration tests work across languages
- Fix C++ API/Proto comments (methods should accept only Transaction values)
- Add wrapper types for new types in Java (stop using the Protobuf classes directly)
- Implement transactions in the Derby example (depends on #13710)
Hmm, Windows builds fail because of a similar issue to https://github.com/apache/arrow/pull/13434 - Protobuf and DLLs don't interact well, since you can't get protoc to insert the dllimport/dllexport declarations correctly.
The easiest thing might be to just punt on CancelQuery for now. Or else, it would have to be
message ActionCancelQueryRequest {
// XXX(ARROW-16902): A serialized FlightInfo
bytes info = 1;
}
and then rely on FlightInfo::Deserialize.
@lidavidm Since this is a draft, are you looking for a detailed review or more for general opinions?
The Protobuf definitions deserve more scrutiny; for the code, I'm just looking for general opinions.
I need to figure out the AppVeyor failure here, but this should generally be ready.
@jduo any further comments on the spec (or implementation) here?
I will fix ARROW-17254/ARROW-17420 and rebase this, but otherwise I think this is ready
ARROW-17420 was merged, rebased on top of PR for ARROW-17254 (#13898) and updated
@pitrou were you planning on taking a second look or were you just passing by? :slightly_smiling_face:
@zeroshade if you had any comments on the definitions before implementation, they would be appreciated!
@lidavidm I don't have time for a second look, unless you really need it :-)
No worries, just want to make sure!
Updated again, since @jvanstraten pointed out that the server may want to know the client's Substrait release version since otherwise it may be unclear how to interpret the plan (even if it parses properly).
Also, adds some validation for SqlInfo values to the integration test + adds SqlInfo values so the server can report Substrait version support.
(I guess these are mostly minor changes and we shouldn't need to restart the vote?)
Yes, definitely.
Also, sorry, the review contains C++ comments that I did some days/weeks ago but had forgotten to submit apparently :-S
Rebased + updated (minus the timeout since we'd have to change the result set schema for GetSqlInfo to add floating point)
https://issues.apache.org/jira/browse/ARROW-17688
Updated, thanks Antoine!
CI failures here are addressed/fixed elsewhere
Benchmark runs are scheduled for baseline = d571e93ad24d5111800540b42a3b8d56459edd9b and contender = 3ce40143f8a836df058ec5fe1b29d9da5ede169d. 3ce40143f8a836df058ec5fe1b29d9da5ede169d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2
[Failed :arrow_down:0.1% :arrow_up:0.0%] test-mac-arm
[Failed :arrow_down:0.28% :arrow_up:0.0%] ursa-i9-9960x
[Finished :arrow_down:0.21% :arrow_up:0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 3ce40143 ec2-t3-xlarge-us-east-2
[Failed] 3ce40143 test-mac-arm
[Failed] 3ce40143 ursa-i9-9960x
[Finished] 3ce40143 ursa-thinkcentre-m75q
[Finished] d571e93a ec2-t3-xlarge-us-east-2
[Failed] d571e93a test-mac-arm
[Failed] d571e93a ursa-i9-9960x
[Finished] d571e93a ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
['Python', 'R'] benchmarks have high level of regressions. test-mac-arm ursa-i9-9960x