arrow icon indicating copy to clipboard operation
arrow copied to clipboard

GH-37938: [Swift] initial impl of C Data interface

Open abandy opened this issue 1 year ago • 8 comments

Continuation for PR: #39091

This add an initial implementation of the C Data interface for swift. During development it was found that null count was not being properly maintained on the arrow buffers and this change is included as well. Also some minor refactoring was done to existing sources to enable this feature.

This has been tested from Swift calling into C to import data but not from Swift to C exporting data. Test is currently ongoing.

  • GitHub Issue: #37938

abandy avatar Apr 23 '24 00:04 abandy

:warning: GitHub issue #37938 has been automatically assigned in GitHub to PR creator.

github-actions[bot] avatar Apr 23 '24 00:04 github-actions[bot]

The go test for the C data interface are currently not included in CI testing. I plan to update this in a future PR.

abandy avatar Apr 23 '24 20:04 abandy

It’s great to add those interfaces to add Swift support.

Cc @andygrove @viirya

Thanks!

dbtsai avatar Apr 23 '24 23:04 dbtsai

It’s great to add those interfaces to add Swift support.

Cc @andygrove @viirya

Thanks!

Np, thank you!

abandy avatar Apr 24 '24 00:04 abandy

@dbtsai Are you familiar with Swift? If so, could you also review this carefully?

kou avatar Apr 24 '24 00:04 kou

@kou and @pitrou Please review when you get a chance. This is very similar to the previous PR(https://github.com/apache/arrow/pull/39091) for this enhancement that was closed but with the updates from @pitrou previous comments. Thank you!

abandy avatar Apr 28 '24 23:04 abandy

@kou I have made the requested changes. Please review again.

abandy avatar May 15 '24 13:05 abandy

@abandy Could you please rebase from git main, if not recently done, so that we get cleaner CI results?

pitrou avatar May 16 '24 14:05 pitrou

@pitrou and @kou I have made the requested changes. Please review again. I have 2 PRs waiting for some of theses changes (one is for adding struct type) and it would be great if we could get this merged so I can submit them. Thank you!

abandy avatar May 23 '24 19:05 abandy

In general, smaller PRs are easy to review/merge than one large PR. If you want to proceed quickly as much as possible, could you use the smaller PRs approach next time like you will do for struct type? For example, we might be able to choose the following approach for this case:

  1. Implement ArrowSchema export for 1 simple type such as bool.
  2. Implement ArrowSchema import for 1.
  3. Implement ArrowSchema export/import for one more simple type such as int8.
  4. ...

FYI: MATLAB uses the small PRs approach: https://github.com/apache/arrow/issues?q=is%3Aissue+matlab+%22C+data+interface%22

Hi @kou, thank you example on the smaller project. I will definitely follow that format for the struct type change! Thank you very much!

abandy avatar May 24 '24 00:05 abandy

+1

I added some minor comments but I'll merge this. If we need to work on these comments, let's work on them as follow-up tasks.

Thank you @kou, @pitrou, @dbtsai, @viirya for the reviews!!

I will create a MINOR pr for the comments added by @kou.

abandy avatar May 29 '24 16:05 abandy

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 4d524eb40161579c96d1ac11d8e296ab3507f889.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.