TileDB-R

fragment index argument refers to C++ array (starts at 0)

Open cgiachalis opened this issue 2 years ago • 2 comments

To return a fragment info value for a given index, the fragment index argument fid must be zero-based. For example, if you have n fragments, you need to pass n-1 to get the information for the nth (last) fragment. This applies to functions like tiledb_fragment_info_get_timestamp_range(object, fid) and the other fragment info functions that take fid.

The following example shows how the issue came up initially.

library(tiledb)

uri <- tempfile()

# ingest #1
df1 <- data.frame(x = 1:3, tm = Sys.Date() + 1:3)
fromDataFrame(df1, uri, mode = "ingest")

Sys.sleep(5)

# ingest #2
df2 <- data.frame(x = 4:6, tm = Sys.Date() + 4:6)
fromDataFrame(df2, uri, mode = "append")

# Construct tiledb_fragment_info object
finfo <- tiledb_fragment_info(uri)

# Return the number of fragments
fid <- tiledb_fragment_info_get_num(finfo)
# 2

# Get the timestamp range for a given fragment index, here the last fragment
tiledb_fragment_info_get_timestamp_range(finfo, fid)
#> Error in libtiledb_fragment_info_timestamp_range(object@ptr, fid): [TileDB::FragmentInfo] Error: Cannot get fragment URI; Invalid fragment index

# Adjust to a zero-based fragment index
tiledb_fragment_info_get_timestamp_range(finfo, fid - 2) # index 0, first fragment
#> [1] "2023-02-04 18:34:01 +03" "2023-02-04 18:34:01 +03"
tiledb_fragment_info_get_timestamp_range(finfo, fid - 1) # index 1, last fragment
#> [1] "2023-02-04 18:34:06 +03" "2023-02-04 18:34:06 +03"
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.utf8
#>  ctype    English_United Kingdom.utf8
#>  tz       Europe/Istanbul
#>  date     2023-02-04
#>  pandoc   2.19.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  bit           4.0.5   2022-11-15 [1] CRAN (R 4.2.2)
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.2.0)
#>  cli           3.6.0   2023-01-09 [1] CRAN (R 4.2.2)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.2)
#>  evaluate      0.20    2023-01-17 [1] CRAN (R 4.2.2)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.6.0   2023-01-23 [1] CRAN (R 4.2.2)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.4   2022-12-07 [1] CRAN (R 4.2.2)
#>  knitr         1.42    2023-01-25 [1] CRAN (R 4.2.2)
#>  lattice       0.20-45 2021-09-22 [3] CRAN (R 4.2.2)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  nanotime      0.3.7   2022-10-24 [1] CRAN (R 4.2.1)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.2.2)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.1)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.1)
#>  Rcpp          1.0.10  2023-01-22 [1] CRAN (R 4.2.2)
#>  RcppCCTZ      0.2.12  2022-11-06 [1] CRAN (R 4.2.2)
#>  RcppSpdlog  * 0.0.12  2023-01-07 [1] CRAN (R 4.2.2)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.1)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.1)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.2)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  spdl          0.0.4   2023-01-08 [1] CRAN (R 4.2.2)
#>  styler        1.9.0   2023-01-15 [1] CRAN (R 4.2.2)
#>  tiledb      * 0.18.0  2023-01-19 [1] CRAN (R 4.2.2)
#>  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.2.2)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.37    2023-01-31 [1] CRAN (R 4.2.2)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.2.2)
#>  zoo           1.8-11  2022-09-17 [1] CRAN (R 4.2.1)
#> 
#>  [1] C:/Program Files/R/library
#>  [2] C:/Users/Constantine/AppData/Local/R/win-library/4.2
#>  [3] C:/Program Files/R/R-4.2.2/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

cgiachalis, Feb 04 '23 16:02

Thanks for taking the time to file an issue. What is described here is correct -- and intentional. The R package wraps the TileDB Core API, and adheres to its interface. As this API uses zero-based indexing, so does the R interface to it.
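
As an illustration of the zero-based convention, the sketch below walks over every fragment of a small two-fragment array, built the same way as in the report above. It uses only functions that already appear in this thread; the variable names `n`, `fids`, and `ranges` are chosen here for illustration.

```r
library(tiledb)

# Build a two-fragment array, mirroring the example in the report
uri <- tempfile()
df1 <- data.frame(x = 1:3, tm = Sys.Date() + 1:3)
fromDataFrame(df1, uri, mode = "ingest")
df2 <- data.frame(x = 4:6, tm = Sys.Date() + 4:6)
fromDataFrame(df2, uri, mode = "append")

finfo <- tiledb_fragment_info(uri)
n <- tiledb_fragment_info_get_num(finfo)

# Valid fragment indices run from 0 to n - 1, not from 1 to n
fids <- seq_len(n) - 1

# Collect the timestamp range of each fragment
ranges <- lapply(fids, function(fid) {
  tiledb_fragment_info_get_timestamp_range(finfo, fid)
})
```

Here `ranges[[1]]` holds the result for fragment index 0, so the R list position is always one greater than the fragment index that was passed in.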

eddelbuettel, Feb 04 '23 16:02

Thanks for your prompt reply! From an R perspective, a user doesn't expect zero-based indexing, but since it is intentional there is no problem, as long as it is documented somewhere; apologies if I missed it.
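
For anyone who prefers to think in R positions, a small user-side helper can do the translation. This is only a sketch: `as_fragment_index` is a hypothetical name, not part of the tiledb package, and it assumes `n` is the value returned by tiledb_fragment_info_get_num().

```r
# Hypothetical helper, not part of the tiledb package:
# convert a 1-based R position into the 0-based fragment index
as_fragment_index <- function(i, n) {
  stopifnot(
    length(i) == 1, length(n) == 1,  # scalar inputs only
    i == trunc(i),                   # whole numbers only
    i >= 1, i <= n                   # must lie in 1..n
  )
  i - 1
}

as_fragment_index(1, 2)  # position of the first fragment
as_fragment_index(2, 2)  # position of the last fragment
```

With the objects from the example above, the last fragment could then be queried as tiledb_fragment_info_get_timestamp_range(finfo, as_fragment_index(fid, fid)), and an out-of-range position fails in R before it reaches the library.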

cgiachalis, Feb 04 '23 16:02