ARROW-18012: [R] Make map_batches .lazy = TRUE by default
This makes the default `map_batches()` behaviour lazy (i.e., the function is called once per batch as each batch arrives):
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
source <- RecordBatchReader$create(
  record_batch(a = 1:10),
  record_batch(a = 11:20)
)
mapped <- map_batches(source, function(x) {
  message("Hi! I'm being evaluated!")
  x
}, .schema = source$schema)
as_arrow_table(mapped)
#> Hi! I'm being evaluated!
#> Hi! I'm being evaluated!
#> Table
#> 20 rows x 1 columns
#> $a <int32>
Created on 2022-10-26 with reprex v2.0.2
This was previously a confusing default since piping the resulting `RecordBatchReader` into an `ExecPlan` would fail for some ExecPlans before ARROW-17178 (#13706). This PR commits to the (more optimal/expected) lazy behaviour.
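For contrast, here is a minimal sketch of opting back into eager evaluation. It assumes that passing `.lazy = FALSE` (the argument named in this PR's title) makes `map_batches()` process every batch before it returns; the `calls` counter and the variable names are illustrative only and not part of the PR.

```r
library(arrow, warn.conflicts = FALSE)

source <- RecordBatchReader$create(
  record_batch(a = 1:10),
  record_batch(a = 11:20)
)

# Illustrative counter: incremented each time the mapped function runs
calls <- 0L

# Assumption: .lazy = FALSE evaluates the function on all batches up front
eager <- map_batches(source, function(x) {
  calls <<- calls + 1L
  x
}, .schema = source$schema, .lazy = FALSE)

# Record how many times the function ran before anything consumed the result
calls_before_collect <- calls

# Consuming the reader should not need to call the function again
result <- as_arrow_table(eager)
```

If the assumption holds, `calls_before_collect` already equals the number of batches, whereas with the new lazy default it would still be zero at that point.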
https://issues.apache.org/jira/browse/ARROW-18012
Benchmark runs are scheduled for baseline = 286c263492860bf6d62b3e39c80147b787848020 and contender = 97076308d07e447ad52fd4fa026f8d92513b98c9. 97076308d07e447ad52fd4fa026f8d92513b98c9 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] ec2-t3-xlarge-us-east-2
[Failed :arrow_down:0.0% :arrow_up:0.0%] test-mac-arm
[Finished :arrow_down:0.0% :arrow_up:0.0%] ursa-i9-9960x
[Finished :arrow_down:0.21% :arrow_up:0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 97076308 ec2-t3-xlarge-us-east-2
[Failed] 97076308 test-mac-arm
[Finished] 97076308 ursa-i9-9960x
[Finished] 97076308 ursa-thinkcentre-m75q
[Finished] 286c2634 ec2-t3-xlarge-us-east-2
[Failed] 286c2634 test-mac-arm
[Finished] 286c2634 ursa-i9-9960x
[Finished] 286c2634 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java