
`get_data_extracts()` gets less data for `rows_distinct()` than for `col_vals_*()`

Open mayeulk opened this issue 2 years ago • 1 comment

Description

`get_data_extracts()` behaves differently for validation functions of the form `col_vals_*()`, `conjointly()`, and `rows_distinct()`. For `rows_distinct()`, the returned tibble contains only the tested columns, whereas for the other functions it contains all columns.

Reproducible example

library(pointblank)
library(dplyr)

tbl <- tibble(id1=1:5,
                     id2=c("A", "b", "C", "D", "E"),
                     a = c(8, 8, 8, 5, 9),
                     b = c(11,11:14),
                     date = as.Date(paste0("2023-01-0",1:5)))

# The columns or set of columns that need to be displayed
# (to help identify the row) along the column with invalid value
id_columns <- c("id1", "id2")

agent <-
  create_agent(
    tbl = tbl,
    tbl_name = "small_table",
    label = "An example."
  ) %>%
  col_vals_gt(columns = vars(a), value = 6) %>%
  col_vals_gt(columns = vars(b), value = 11) %>%
  col_vals_regex(columns = vars(id2), regex = "[A-Z]")   %>%
  rows_distinct(columns = vars(a))  %>%
  rows_distinct(columns = c("b"))  %>%
  rows_distinct(columns = c("a", "b"))  %>%
  conjointly(
    ~ col_vals_lt(., columns = vars(a), value = 7),
    ~ col_vals_gt(., columns = vars(a), value = vars(b)))   %>%
  col_is_date(columns =  "date") %>%
  interrogate()

agent

agent %>% get_agent_report(display_table = FALSE)

# Loop over each step and display a selection of columns from failing rows
for (c_step in 1:nrow(get_agent_report(agent, display_table = F))){
  print("====================")
  get_agent_x_list(agent, i = c_step)$briefs %>% print
  print(c("current step: ", c_step))
  get_agent_x_list(agent, i = c_step)$columns %>% print
  columns_to_display <- unique(c(id_columns, get_agent_x_list(agent, i = c_step)$columns))
  get_data_extracts(agent, i=c_step)  %>%
    select(columns_to_display)  %>%  # comment this line out to see the result
    print
}

Expected result

For the `col_vals_*()` and `conjointly()` functions, `get_data_extracts()` returns all columns, which allows further selection of the columns one wishes to keep for display. However, for `rows_distinct(columns = vars(a))`, only the `a` column remains. I do not know of a way to get the full rows for the failing rows with pointblank. The same issue affects the "CSV" buttons in the report produced by `agent %>% get_agent_report(display_table = TRUE)`.
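One possible workaround, sketched here with dplyr alone and assuming the original table is still available in the session, is to recompute the duplicated rows directly so that every column is kept. The helper name `full_duplicate_rows()` is hypothetical and is not part of pointblank; it only mirrors what `rows_distinct()` flags as failing.

```r
library(dplyr)

# Same data as in the reproducible example (date column omitted for brevity)
tbl <- tibble(
  id1 = 1:5,
  id2 = c("A", "b", "C", "D", "E"),
  a = c(8, 8, 8, 5, 9),
  b = c(11, 11:14)
)

# Hypothetical helper (not a pointblank function): return the complete rows
# whose combination of values in `cols` occurs more than once.
full_duplicate_rows <- function(data, cols) {
  data %>%
    group_by(across(all_of(cols))) %>%
    filter(n() > 1) %>%
    ungroup()
}

dup_a  <- full_duplicate_rows(tbl, "a")          # duplicates on column a
dup_ab <- full_duplicate_rows(tbl, c("a", "b"))  # duplicates on a and b together

print(dup_a)
print(dup_ab)
```

This does not change what `get_data_extracts()` returns; it only shows that the identifying columns (`id1`, `id2`) can be recovered from the source table when the extract lacks them.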

In the example, we want two columns, `id1` and `id2`, to be displayed (to help identify the failing row) along with the column with invalid values. Commenting out the following line in the code above makes the difference in behaviour visible: `select(columns_to_display) %>%`
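To let the loop run over every step without stopping on the missing columns, the selection could be made tolerant with tidyselect's `any_of()`, which silently skips names that are absent. The following is a minimal sketch; `extract_like` is a made-up stand-in for an extract that contains only the tested column, as described above for `rows_distinct()`.

```r
library(dplyr)

# Stand-in for a rows_distinct() extract that only carries the tested column
extract_like <- tibble(a = c(8, 8, 8))

columns_to_display <- c("id1", "id2", "a")

# any_of() keeps the columns that exist and ignores the ones that do not,
# whereas select(all_of(columns_to_display)) would raise an error here
shown <- extract_like %>% select(any_of(columns_to_display))

print(shown)
```

This only avoids the error; the identifying columns are still missing from the output, which is the behaviour reported in this issue.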

Session info

sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.11.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C LC_TIME=fr_FR.UTF-8
 [4] LC_COLLATE=fr_FR.UTF-8 LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
 [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_1.1.2 pointblank_0.11.4

loaded via a namespace (and not attached):
 [1] rstudioapi_0.14 xml2_1.3.3 magrittr_2.0.3 tidyselect_1.2.0 gt_0.9.0 R6_2.5.1
 [7] rlang_1.1.0 fastmap_1.1.1 fansi_1.0.4 tools_4.2.2 xfun_0.39 utf8_1.2.3
[13] blastula_0.3.3 cli_3.6.1 withr_2.5.0 commonmark_1.9.0 htmltools_0.5.5 digest_0.6.31
[19] tibble_3.2.1 lifecycle_1.0.3 crayon_1.5.2 sass_0.4.5 base64enc_0.1-3 vctrs_0.6.2
[25] glue_1.6.2 compiler_4.2.2 pillar_1.9.0 generics_0.1.3 markdown_1.6 pkgconfig_2.0.3

mayeulk, May 02 '23 23:05

I edited my example, which was missing `c(...)` in the following line: `columns_to_display <- unique(c(id_columns, get_agent_x_list(agent, i = c_step)$columns))`

mayeulk, May 03 '23 13:05