ARROW-17377: [C++][Docs] Adds tutorial for basic Arrow, file access, compute, and datasets
I intend for this PR to add a few small tutorial articles to the Arrow documentation, for basic Arrow usage, file access, compute, and dataset functionality.
Right now, this is a draft PR, with just the code for the examples. Before I set it up with comments and prose in Sphinx, I wanted to get it reviewed. Do these examples seem suitable for the tutorials they target?
https://issues.apache.org/jira/browse/ARROW-17377
A main operation for Arrow compute is `CallFunction()`. You showcase it in an example, but it's probably worthwhile to have a small example demonstrating how to call a function with and without `FunctionOptions`.
I can definitely agree with that. I don't want to do too deep of a dive, but this can be such a jarring experience (from my own time working in Arrow) that I'd want to know sooner rather than later.
Force-pushed because I messed up a rebase and things would've been bad if I didn't just reset it.
I've drafted up some words to go with the arrow_example.cc code here: https://docs.google.com/document/d/14lIhxzqWbh6IBYKe_GXHPvNURBg90eD-W2nz7l4S-nQ/edit?usp=sharing
Before I start getting this into rst form, I want to give people a chance to review it -- it's in suggestion mode, so comments and suggested modifications can be made.
In return for the week without comment, here's the prose for the other three articles in this PR:
File I/O doc: https://docs.google.com/document/d/1Zcx_5kYgqnAyAkmtSTLBBxuzwbiRIy3DUh---Q3RbTk/edit?usp=sharing
Compute doc: https://docs.google.com/document/d/1zJUfBDvd0NRWW9NuGIMx-VqDC8pHD9YOyejQrTLsEdg/edit?usp=sharing
Datasets doc: https://docs.google.com/document/d/1S8qswJ-jpZmTuDJA_VCPSLatmWccU06ZN9cyE59M0Ns/edit?usp=sharing
Once more, these are all in Suggestion mode. Once everything looks good, I'll put their contents into RST, and we can see about finishing up this PR.
Thanks for these! I'll try to take a look when I get a chance (…and get you that Flight tutorial sketch as well).
@github-actions crossbow submit preview-docs
Unable to match any tasks for `preview-docs`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2959866334
@github-actions crossbow submit preview-docs
Revision: 80b68fee84f4bdbb7ba90d03f39f505999cdbfb9
Submitted crossbow builds: ursacomputing/crossbow @ actions-4ccb463279
https://crossbow.voltrondata.com/pr_docs/13859/
(for my own reference, not sure if there's an accessible link elsewhere)
@github-actions crossbow submit preview-docs
Revision: c0b0b568f978466d3a81cc5de0f03325d5af2d3b
Submitted crossbow builds: ursacomputing/crossbow @ actions-8251273d95
@github-actions crossbow submit preview-docs
Revision: 641bf26be0afd264259324dc63c118926edf6935
Submitted crossbow builds: ursacomputing/crossbow @ actions-66b2e5b7f0
@github-actions crossbow submit preview-docs
Revision: c1e482ee1aed51c7c3dd7a185d510317c2f0cfac
Submitted crossbow builds: ursacomputing/crossbow @ actions-fa3bdf6378
With the exception of a few links, the latest built preview is good for some review passes.
I've read through and left a few more suggestions.
I'm not sure if this issue is real, but I noticed that on some pages of the preview (the compute tutorial in particular), many of the class references are red (unresolved links). Perhaps there is something wrong there?
I actually forgot to ask when posting this, but I also noticed that the compute references simply refused to resolve. I don't know why, since they use the same namespaces that work fine in code. Does anyone have any ideas?
@github-actions crossbow submit preview-docs
Revision: c11657f7652892f5d6225d00900c6962b97eeddb
Submitted crossbow builds: ursacomputing/crossbow @ actions-66df26602c
@pitrou pinging by request.
@github-actions crossbow submit cpp-tutorial-example
Revision: e8eed47900c8a4b4a9fbce34c7d7742cb6b595fc
Submitted crossbow builds: ursacomputing/crossbow @ actions-934a32a7a0
As for the references, I think we don't actually add the Sphinx directive that pulls the docstrings into the docs for each of the individual functions, so Sphinx doesn't know about them.
CI failures look to be unrelated.
Benchmark runs are scheduled for baseline = 8ecb73015560498fc28b9fe498b3568296e3f4ab and contender = 8ad5e59803dda03ff5b829ae1635bbe301a1a4e4. 8ad5e59803dda03ff5b829ae1635bbe301a1a4e4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2
[Failed :arrow_down:0.07% :arrow_up:0.03%] test-mac-arm
[Failed :arrow_down:0.0% :arrow_up:0.0%] ursa-i9-9960x
[Finished :arrow_down:0.14% :arrow_up:0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 8ad5e598 ec2-t3-xlarge-us-east-2
[Failed] 8ad5e598 test-mac-arm
[Failed] 8ad5e598 ursa-i9-9960x
[Finished] 8ad5e598 ursa-thinkcentre-m75q
[Finished] 8ecb7301 ec2-t3-xlarge-us-east-2
[Failed] 8ecb7301 test-mac-arm
[Failed] 8ecb7301 ursa-i9-9960x
[Finished] 8ecb7301 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
['Python', 'R'] benchmarks have a high level of regressions: test-mac-arm