arrow icon indicating copy to clipboard operation
arrow copied to clipboard

ARROW-16690: [R][FlightRPC] Additional max_chunksize parameter in do_put method

Open thatstatsguy opened this issue 2 years ago • 5 comments

Summary An additional parameter in Flight do_put to specify chunk size in R.

Problem Currently, all data is sent through in a single message. It's a likely scenario that users will want the ability to control the batch sizes without building a custom do_put method.

Solution Additional (optional) parameter to specify chunk size.

thatstatsguy avatar May 30 '22 21:05 thatstatsguy

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

github-actions[bot] avatar May 30 '22 21:05 github-actions[bot]

https://issues.apache.org/jira/browse/ARROW-16690

github-actions[bot] avatar May 30 '22 21:05 github-actions[bot]

:warning: Ticket has not been started in JIRA, please click 'Start Progress'.

github-actions[bot] avatar May 30 '22 21:05 github-actions[bot]

@paleolimbot thanks for such a detailed review and prompts on how to resolve it - you were very kind :)

I've updated the linter, unit tests and rerun the devtools::document(). Assuming the unit test passes, the only issue remaining is the second comment on the implementation

thatstatsguy avatar Aug 10 '22 20:08 thatstatsguy

Awesome! The test looks great. I think ignoring max_chunksize with a warning is the way to go (you could error too...I think either is fine and I haven't actually used this function so I don't know which is more appropriate). Add a test for that and it's good to go!

paleolimbot avatar Aug 11 '22 00:08 paleolimbot

@paleolimbot thanks for your patience! All updated, assuming there are no build issues we should be good to go!

thatstatsguy avatar Aug 21 '22 18:08 thatstatsguy

Benchmark runs are scheduled for baseline = 5f84335fbb1e1467eed07563be00e9338a06ff03 and contender = 6d8624b02a47053190b93b649a48ac59da4dbf09. 6d8624b02a47053190b93b649a48ac59da4dbf09 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. Conbench compare runs links: [Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2 [Failed :arrow_down:0.27% :arrow_up:0.0%] test-mac-arm [Failed :arrow_down:0.0% :arrow_up:0.0%] ursa-i9-9960x [Finished :arrow_down:0.14% :arrow_up:0.0%] ursa-thinkcentre-m75q Buildkite builds: [Finished] 6d8624b0 ec2-t3-xlarge-us-east-2 [Failed] 6d8624b0 test-mac-arm [Failed] 6d8624b0 ursa-i9-9960x [Finished] 6d8624b0 ursa-thinkcentre-m75q [Finished] 5f84335f ec2-t3-xlarge-us-east-2 [Finished] 5f84335f test-mac-arm [Failed] 5f84335f ursa-i9-9960x [Finished] 5f84335f ursa-thinkcentre-m75q Supported benchmarks: ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True test-mac-arm: Supported benchmark langs: C++, Python, R ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

ursabot avatar Aug 23 '22 00:08 ursabot

The "R / AMD64 Windows R 4.2 RTools 42" CI job is failed by this commit:

https://github.com/apache/arrow/runs/7957718636?check_suite_focus=true#step:12:70

Error: Error: Not lint free
Warning: file=tests\testthat\test-python-flight.R,line=40,col=1,[trailing_whitespace_linter] Trailing whitespace is superfluous.
Warning: file=tests\testthat\test-python-flight.R,line=46,col=94,[trailing_whitespace_linter] Trailing whitespace is superfluous.

Could you remove trailing spaces from test-python-flight.R?

kou avatar Aug 23 '22 08:08 kou