
Support arrow stream in duckdb

Open · florisvdh opened this issue 9 months ago · 1 comment

This issue arises from my question on Mastodon and the answer by @krlmlr.

For DBI to take full advantage of Arrow support in DuckDB, e.g. if the DuckDB table has been created with arrow::to_duckdb(), the driver behind the resulting lazy object would need to transfer the Arrow data without first converting it to a data frame. I understand from @krlmlr that duckdb::dbGetQueryArrow() currently converts the result to a data frame.

library(arrow, warn.conflicts = FALSE)
tmpfile <- tempfile(fileext = ".parquet")
beaver1 |> 
  write_parquet(tmpfile)
read_parquet(tmpfile, as_data_frame = FALSE) |> 
  to_duckdb()
#> # Source:   table<arrow_001> [?? x 4]
#> # Database: DuckDB v0.10.2 [unknown@Linux 5.15.0-105-generic:R 4.4.0/:memory:]
#>      day  time  temp activ
#>    <dbl> <dbl> <dbl> <dbl>
#>  1   346   840  36.3     0
#>  2   346   850  36.3     0
#>  3   346   900  36.4     0
#>  4   346   910  36.4     0
#>  5   346   920  36.6     0
#>  6   346   930  36.7     0
#>  7   346   940  36.7     0
#>  8   346   950  36.8     0
#>  9   346  1000  36.8     0
#> 10   346  1010  36.9     0
#> # ℹ more rows

Created on 2024-05-08 with reprex v2.1.0
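
For comparison, a minimal sketch of the round trip that already stays in Arrow when going through the arrow and dplyr verbs rather than DBI. It assumes the arrow, dplyr, dbplyr and duckdb packages are installed; the object names (beaver_arrow, beaver_duck, reader, result) are only illustrative. It does not show the DBI path requested above, only what a data-frame-free transfer looks like with arrow::to_arrow().

```r
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)

# Build an Arrow table from the example data and register it in DuckDB.
# to_duckdb() returns a lazy dbplyr table backed by the Arrow data.
beaver_arrow <- arrow_table(beaver1)
beaver_duck <- to_duckdb(beaver_arrow)

# Run a query in DuckDB and hand the result back as an Arrow stream.
# to_arrow() returns a RecordBatchReader, so no R data frame is
# materialised by this code.
reader <- beaver_duck |>
  filter(activ == 1) |>
  to_arrow()

# Consume the stream into an Arrow Table.
result <- reader$read_table()
```

The request in this issue is for the DBI route to hand back such a stream directly as well.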

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       Linux Mint 21.3
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language nl_BE:nl
#>  collate  nl_BE.UTF-8
#>  ctype    nl_BE.UTF-8
#>  tz       Europe/Brussels
#>  date     2024-05-08
#>  pandoc   3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  arrow       * 15.0.1  2024-03-12 [3] RSPM (R 4.3.0)
#>  assertthat    0.2.1   2019-03-21 [3] CRAN (R 4.0.1)
#>  bit           4.0.5   2022-11-15 [3] RSPM (R 4.2.0)
#>  bit64         4.0.5   2020-08-30 [3] RSPM (R 4.2.0)
#>  cli           3.6.2   2023-12-11 [3] RSPM (R 4.3.0)
#>  DBI           1.2.2   2024-02-16 [3] RSPM (R 4.3.0)
#>  dbplyr        2.5.0   2024-03-19 [3] RSPM (R 4.3.0)
#>  digest        0.6.35  2024-03-11 [3] RSPM (R 4.3.0)
#>  dplyr         1.1.4   2023-11-17 [3] RSPM (R 4.3.0)
#>  duckdb        0.10.2  2024-05-01 [3] RSPM (R 4.4.0)
#>  evaluate      0.23    2023-11-01 [3] RSPM (R 4.3.0)
#>  fansi         1.0.6   2023-12-08 [3] RSPM (R 4.3.0)
#>  fastmap       1.1.1   2023-02-24 [3] RSPM (R 4.2.0)
#>  fs            1.6.4   2024-04-25 [3] RSPM (R 4.3.0)
#>  generics      0.1.3   2022-07-05 [3] RSPM (R 4.2.0)
#>  glue          1.7.0   2024-01-09 [3] RSPM (R 4.3.0)
#>  htmltools     0.5.8.1 2024-04-04 [3] RSPM (R 4.3.0)
#>  knitr         1.46    2024-04-06 [3] RSPM (R 4.3.0)
#>  lifecycle     1.0.4   2023-11-07 [3] RSPM (R 4.3.0)
#>  magrittr      2.0.3   2022-03-30 [3] RSPM (R 4.2.0)
#>  pillar        1.9.0   2023-03-22 [3] RSPM (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [3] CRAN (R 4.0.1)
#>  purrr         1.0.2   2023-08-10 [3] RSPM (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [3] RSPM (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [3] RSPM (R 4.2.0)
#>  R.oo          1.26.0  2024-01-24 [3] RSPM (R 4.3.0)
#>  R.utils       2.12.3  2023-11-18 [3] RSPM (R 4.3.0)
#>  R6            2.5.1   2021-08-19 [3] RSPM (R 4.2.0)
#>  reprex        2.1.0   2024-01-11 [3] RSPM (R 4.3.0)
#>  rlang         1.1.3   2024-01-10 [3] RSPM (R 4.3.0)
#>  rmarkdown     2.26    2024-03-05 [3] RSPM (R 4.3.0)
#>  rstudioapi    0.16.0  2024-03-24 [3] RSPM (R 4.3.0)
#>  sessioninfo   1.2.2   2021-12-06 [3] RSPM (R 4.2.0)
#>  styler        1.10.3  2024-04-07 [3] RSPM (R 4.3.0)
#>  tibble        3.2.1   2023-03-20 [3] RSPM (R 4.3.0)
#>  tidyselect    1.2.1   2024-03-11 [3] RSPM (R 4.3.0)
#>  utf8          1.2.4   2023-10-22 [3] RSPM (R 4.3.0)
#>  vctrs         0.6.5   2023-12-01 [3] RSPM (R 4.3.0)
#>  withr         3.0.0   2024-01-16 [3] RSPM (R 4.3.2)
#>  xfun          0.43    2024-03-25 [3] RSPM (R 4.3.0)
#>  yaml          2.3.8   2023-12-11 [3] RSPM (R 4.3.0)
#> 
#>  [1] /home/floris/lib/R/library
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

florisvdh · May 08 '24 07:05

This might work out of the box with the very experimental https://github.com/krlmlr/duckdbneo/.

krlmlr · Jul 04 '24 04:07