GH-46734: [C++] Arrow Flight SQL ODBC layer in Windows
Rationale for this change
- Build Arrow Flight SQL ODBC DLL from
flightsqlandodbcabstractioncomponent. - The ODBC driver will be available in Windows only.
What changes are included in this PR?
- Add handle allocation for environment and connection.
- Add environment attribute setting
Are these changes tested?
Yes, on local Windows environment.
Are there any user-facing changes?
N/A
This PR addresses the following issues:
- #46734
- #46004
- #46096
- #46097
- #46574
- #46575
- GitHub Issue: #46098
- GitHub Issue: #30622
- GitHub Issue: #46734
:warning: GitHub issue #46098 has been automatically assigned in GitHub to PR creator.
@aiguofer I have opened a draft PR for the ODBC layer.
@zeroshade , @assignUser , @ianmcook Follow-up on the May-07 Arrow meeting, this is the draft PR that Rob and I have for creating the initial connection in the ODBC layer. It is based on top of @jbonofre 's PR#40939. Some changes in this PR are planned to be moved to PR#40939, such as the vcpkg.json update.
@alinaliBQ why not just using the original PR/branch ? Just to not loose changes/comments.
@alinaliBQ why not just using the original PR/branch ? Just to not loose changes/comments.
Yup you're right, for flightsql-odbc contribution, folks should use the apache#40939 PR, and that PR has higher priority for review than apache#46099. After apache#40939 PR is merged, we can go back to look at this apache#46099 PR.
Hello @lidavidm , this is the draft PR that is based on top of @jbonofre 's https://github.com/apache/arrow/pull/40939. It's still a draft PR but just wanted to give you a heads up
Hi @lidavidm @kou and other interested maintainers, please let me know if you prefer the Arrow Flight SQL ODBC contribution PRs like this one to be split into smaller PRs. It would be nice to know in advance so my team can prepare for the split, thank you!
Yes. I prefer smaller PRs to larger PRs because smaller PRs are easier to review than larger PRs.
(BTW, I'm not sure whether this PR is a smaller PR or not...)
Yes, smaller PRs would be preferable. (To be frank, I have yet to actually review the first PR in full, but I'd rather get the paperwork done and get it merged ASAP, then we can go over it in detail afterwards.)
@kou @lidavidm Ok sounds good! I will rebase Bit-Quill:apache-odbc on top of apache:main branch after https://github.com/apache/arrow/pull/40939 is merged, so the size of this PR will be smaller compared to what it shows right now. Then I plan to open new smaller PRs to split this PR.
:warning: GitHub issue #30622 has been automatically assigned in GitHub to PR creator.
Hi David @lidavidm and Kou @kou, there's been a change regarding my last comment, we (Rob @rscales & I) will still work on splitting the PR into smaller ones and look into the flightsql-odbc comments from David on https://github.com/apache/arrow/pull/40939 (and any other comments for Arrow ODBC Flight SQL), but that will be after we finish developing the ODBC layer to have the full functionality.
This PR#40939 will stay in draft state until we finish developing the ODBC driver, but any changes in this PR are ready for community to review.
TL;DR, we would like to finish the ODBC layer development first, and we will begin addressing the community comments after that.
Please let me know if you have any questions. :)
OK - sounds good, please ping me when that is ready for review
Could you fix a build error https://github.com/apache/arrow/issues/46576 before the ODBC layer development?
@kou Yes, our team can look into https://github.com/apache/arrow/issues/46576
:warning: GitHub issue #46734 has been automatically assigned in GitHub to PR creator.
Hi @lidavidm @kou we are close to start working on community code review comments to prepare for merging the PR. Since this PR is quite big, here is my proposal:
- create new PRs that extract smaller commits from this PR to merge to
main, then rebase this PR #46099 onto main after smaller new PR is merged, and repeat the process until this PR #46099 is small enough to be merged tomaindirectly
Does this approach look good to you?
Feel free to tag other maintainers who may be interested in reviewing the code.
Yes. In general, I prefer smaller PRs. :-)
Yes, this is nearly impossible to review as is :sweat_smile:
Sounds good, our team will be starting to create smaller PRs, please review these smaller PRs. Once they are created, I will post the PR links here for visibility.
https://github.com/apache/arrow/pull/47517 has been created
Hi @lidavidm @kou, I got a question for you guys. 😃 I am considering opening a series of draft PRs that contain ODBC API implementation to get initial feedback and code review comments earlier. Currently we have around 40 ODBC APIs and each PR generally will have 1-2 APIs where it makes sense, so it will be many PRs raised. These ODBC API PRs still need to be merged consecutively to get the tests passing so that's why we are making them drafts. This approach helps us to get the ODBC driver available to the community sooner.
Are you folks good with this approach and okay with reviewing the ODBC API draft PRs? Please let me know your thoughts.
Yes. Let's use the approach.
Sounds good. When the draft PRs are ready for a review, the team will tag folks as before
Thanks both for your feedback. I will be pausing the ODBC draft PR creation and our team will prioritize on the existing comments. I was excited from creating the PRs and got a little carried away. 😄 List of PRs created this week:
Hi @lidavidm @kou Thanks for reviewing the PRs! Once we address the ODBC test review comments that are universal to the driver, do you think it is okay for us to open more new ODBC API draft PRs? :slightly_smiling_face: If yes, we will be creating about 20 more new draft PRs in total, and we would likely be raising several draft PRs a day.
Each day before we raise new PRs, we will first try to check to see if the review comments apply to rest of the driver, and the team will prioritize the comments that are universal to the driver. But if the review comment is only applicable for one draft PR, we will plan to work on it later if that's alright.
I hope this makes it easier for reviewing: If folks see any general comments on our draft PRs, feel free to mention that it is a general comment, and our team will work on these on all applicable PRs. Please don't feel the need to post the general comments on all the applicable PRs.
For in-progress PRs that don't have all known general comments addressed, we will label them with WIP.
Thanks for helping to review/merge our PRs!
Hello, Here is a list of draft PRs that can be reviewed:
- https://github.com/apache/arrow/pull/48032
- https://github.com/apache/arrow/pull/48033
- https://github.com/apache/arrow/pull/48050
- https://github.com/apache/arrow/pull/48034
- https://github.com/apache/arrow/pull/48035
- https://github.com/apache/arrow/pull/48036
- https://github.com/apache/arrow/pull/48037
- https://github.com/apache/arrow/pull/48040
- https://github.com/apache/arrow/pull/48041
- https://github.com/apache/arrow/pull/48042
- https://github.com/apache/arrow/pull/48043
- https://github.com/apache/arrow/pull/48051
- https://github.com/apache/arrow/pull/48052
- https://github.com/apache/arrow/pull/48053
- https://github.com/apache/arrow/pull/48031
- https://github.com/apache/arrow/pull/48020
- https://github.com/apache/arrow/pull/48054
Hmm, it seems draft PRs don't ping me, sorry! If you think the PR is fully ready please un-draft it so I get pinged for review
Hmm, it seems draft PRs don't ping me, sorry! If you think the PR is fully ready please un-draft it so I get pinged for review
@lidavidm no worries! I didn't know that as well. I will ping you for the draft PRs that are ready to be reviewed so you can get notification, thank you for letting me know. My team will work on all of the comments as we see them.
The changes in the draft PRs can be reviewed, but they are not ready to be merged yet, and I have been undrafting PRs that are ready to be merged. The reason is that ODBC APIs need to be merged in a specific order. For example, handles need to be supported before implementing connectivity, and query execution needs to be implemented after connectivity. If the PRs are merged out of order, the tests are not guaranteed to pass and can be hard to debug later on, which is what I am trying to avoid. Let me know if you have any questions/suggestions 🙂