duckdb-r
Support arrow stream in duckdb
This issue arises from my question on Mastodon and the answer by @krlmlr.
For DBI to take full advantage of Arrow support in DuckDB, e.g. if the DuckDB table has been created with arrow::to_duckdb(), the driver behind the resulting lazy object would need to transfer the Arrow object without converting it to a data frame first. I understand from @krlmlr that it is currently converted to a data frame by duckdb::dbGetQueryArrow().
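To make the request concrete, here is a minimal sketch of an Arrow-in, Arrow-out round trip that uses duckdb's own helpers rather than the DBI Arrow generics. It assumes the installed duckdb version accepts `arrow = TRUE` in `dbSendQuery()` and exports `duckdb_register_arrow()` and `duckdb_fetch_record_batch()`. The table name `beaver_arrow` and the query are made up for illustration.

```r
library(DBI)

con <- dbConnect(duckdb::duckdb())

# Expose an Arrow table to DuckDB as a virtual table.
# "beaver_arrow" is a hypothetical name chosen for this sketch.
beaver_arrow <- arrow::as_arrow_table(datasets::beaver1)
duckdb::duckdb_register_arrow(con, "beaver_arrow", beaver_arrow)

# Request an Arrow result (assumption: this duckdb version supports
# `arrow = TRUE`), then read it as a stream of record batches.
res <- dbSendQuery(
  con,
  "SELECT day, time, temp FROM beaver_arrow WHERE activ = 0",
  arrow = TRUE
)
reader <- duckdb::duckdb_fetch_record_batch(res)
inactive <- reader$read_table()

# Clean up only after the stream has been consumed.
dbClearResult(res)
duckdb::duckdb_unregister_arrow(con, "beaver_arrow")
dbDisconnect(con, shutdown = TRUE)
```

The result `inactive` is an Arrow Table, so no R data frame is materialised on the way out. The ask in this issue is for the DBI Arrow interface to take a comparable path.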
library(arrow, warn.conflicts = FALSE)
tmpfile <- tempfile(fileext = ".parquet")
beaver1 |>
  write_parquet(tmpfile)
read_parquet(tmpfile, as_data_frame = FALSE) |>
  to_duckdb()
#> # Source:   table<arrow_001> [?? x 4]
#> # Database: DuckDB v0.10.2 [unknown@Linux 5.15.0-105-generic:R 4.4.0/:memory:]
#>      day  time  temp activ
#>    <dbl> <dbl> <dbl> <dbl>
#>  1   346   840  36.3     0
#>  2   346   850  36.3     0
#>  3   346   900  36.4     0
#>  4   346   910  36.4     0
#>  5   346   920  36.6     0
#>  6   346   930  36.7     0
#>  7   346   940  36.7     0
#>  8   346   950  36.8     0
#>  9   346  1000  36.8     0
#> 10   346  1010  36.9     0
#> # ℹ more rows
Created on 2024-05-08 with reprex v2.1.0
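For comparison, the arrow package already offers a way back from such a lazy DuckDB table to Arrow. The sketch below assumes arrow, dplyr, dbplyr and duckdb are installed, and that `arrow::to_arrow()` returns a record batch reader for a table created with `to_duckdb()`. The filter is only an illustrative query.

```r
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)

# Register an Arrow table with DuckDB and get a lazy dbplyr table back.
lazy_tbl <- as_arrow_table(datasets::beaver1) |>
  to_duckdb()

# Let DuckDB evaluate the filter, then stream the result back as Arrow
# record batches instead of collecting into a data frame.
reader <- lazy_tbl |>
  filter(activ == 1) |>
  to_arrow()

active <- reader$read_table()
```

This covers the dplyr route only. Plain DBI calls on the same connection do not go through `to_arrow()`, which is where the request above comes in.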
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.4.0 (2024-04-24)
#> os Linux Mint 21.3
#> system x86_64, linux-gnu
#> ui X11
#> language nl_BE:nl
#> collate nl_BE.UTF-8
#> ctype nl_BE.UTF-8
#> tz Europe/Brussels
#> date 2024-05-08
#> pandoc 3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> arrow * 15.0.1 2024-03-12 [3] RSPM (R 4.3.0)
#> assertthat 0.2.1 2019-03-21 [3] CRAN (R 4.0.1)
#> bit 4.0.5 2022-11-15 [3] RSPM (R 4.2.0)
#> bit64 4.0.5 2020-08-30 [3] RSPM (R 4.2.0)
#> cli 3.6.2 2023-12-11 [3] RSPM (R 4.3.0)
#> DBI 1.2.2 2024-02-16 [3] RSPM (R 4.3.0)
#> dbplyr 2.5.0 2024-03-19 [3] RSPM (R 4.3.0)
#> digest 0.6.35 2024-03-11 [3] RSPM (R 4.3.0)
#> dplyr 1.1.4 2023-11-17 [3] RSPM (R 4.3.0)
#> duckdb 0.10.2 2024-05-01 [3] RSPM (R 4.4.0)
#> evaluate 0.23 2023-11-01 [3] RSPM (R 4.3.0)
#> fansi 1.0.6 2023-12-08 [3] RSPM (R 4.3.0)
#> fastmap 1.1.1 2023-02-24 [3] RSPM (R 4.2.0)
#> fs 1.6.4 2024-04-25 [3] RSPM (R 4.3.0)
#> generics 0.1.3 2022-07-05 [3] RSPM (R 4.2.0)
#> glue 1.7.0 2024-01-09 [3] RSPM (R 4.3.0)
#> htmltools 0.5.8.1 2024-04-04 [3] RSPM (R 4.3.0)
#> knitr 1.46 2024-04-06 [3] RSPM (R 4.3.0)
#> lifecycle 1.0.4 2023-11-07 [3] RSPM (R 4.3.0)
#> magrittr 2.0.3 2022-03-30 [3] RSPM (R 4.2.0)
#> pillar 1.9.0 2023-03-22 [3] RSPM (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [3] CRAN (R 4.0.1)
#> purrr 1.0.2 2023-08-10 [3] RSPM (R 4.2.0)
#> R.cache 0.16.0 2022-07-21 [3] RSPM (R 4.2.0)
#> R.methodsS3 1.8.2 2022-06-13 [3] RSPM (R 4.2.0)
#> R.oo 1.26.0 2024-01-24 [3] RSPM (R 4.3.0)
#> R.utils 2.12.3 2023-11-18 [3] RSPM (R 4.3.0)
#> R6 2.5.1 2021-08-19 [3] RSPM (R 4.2.0)
#> reprex 2.1.0 2024-01-11 [3] RSPM (R 4.3.0)
#> rlang 1.1.3 2024-01-10 [3] RSPM (R 4.3.0)
#> rmarkdown 2.26 2024-03-05 [3] RSPM (R 4.3.0)
#> rstudioapi 0.16.0 2024-03-24 [3] RSPM (R 4.3.0)
#> sessioninfo 1.2.2 2021-12-06 [3] RSPM (R 4.2.0)
#> styler 1.10.3 2024-04-07 [3] RSPM (R 4.3.0)
#> tibble 3.2.1 2023-03-20 [3] RSPM (R 4.3.0)
#> tidyselect 1.2.1 2024-03-11 [3] RSPM (R 4.3.0)
#> utf8 1.2.4 2023-10-22 [3] RSPM (R 4.3.0)
#> vctrs 0.6.5 2023-12-01 [3] RSPM (R 4.3.0)
#> withr 3.0.0 2024-01-16 [3] RSPM (R 4.3.2)
#> xfun 0.43 2024-03-25 [3] RSPM (R 4.3.0)
#> yaml 2.3.8 2023-12-11 [3] RSPM (R 4.3.0)
#>
#> [1] /home/floris/lib/R/library
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
This might work out of the box with the very experimental https://github.com/krlmlr/duckdbneo/.