openxlsx2
openxlsx2 copied to clipboard
Improve dims
Due to some merged cell styles sometimes something like A1;B2:C2;A3:C5
is needed to import to a data frame. Sadly this is currently not possible.
Our selection of multiple cells always creates a square from the most left to the most right cell:
library(openxlsx2)
x <- matrix(1:9, 3, 3)
wb <- wb_workbook()$add_worksheet()$add_data(x = x, col_names = FALSE)
wb_to_df(wb, dims = "A1;B2;C3", col_names = FALSE)
#> A B C
#> 1 1 4 7
#> 2 2 5 8
#> 3 3 6 9
It would be beneficial if this would return 1, 5, 8
instead, or a data frame where all other cells are NA
. But if I remember correctly this requires at least modifications of dims_to_dataframe()
and the question remains if this "A1;B2;C3"
should return a vector or a data frame of a single row or ... something entirely different.
My 2 cents:
I think wb_dim()
could handle this?
I specifically disallowed this when I wrote it to avoid unexpected things.
I think that keeping a length 1 would be good, (and create less surprise), but maybe I am wrong.
Do you think all functions can handle dims as "A1;B3"?
library(openxlsx2)
# Last version
packageVersion("openxlsx2")
#> [1] '1.0.0.9000'
wb_dims(x = mtcars, cols = "vs")
#> [1] "H2:H33"
wb_dims(x = mtcars, cols = "am")
#> [1] "I2:I33"
wb_dims(x = mtcars, cols = "cyl")
#> [1] "B2:B33"
# H2:I33
wb_dims(x = mtcars, cols = c("vs", "am"))
#> Error in wb_dims(x = mtcars, cols = c("vs", "am")): Supplying multiple column names is not supported by the `wb_dims()` helper, use the `cols` with a range instead of `x` column names.
#> Use a single `cols` at a time with `wb_dims()`
# B1:B33;I1:I33
wb_dims(x = mtcars, cols = c("cyl", "am"), select = "x")
#> Error in wb_dims(x = mtcars, cols = c("cyl", "am"), select = "x"): Supplying multiple column names is not supported by the `wb_dims()` helper, use the `cols` with a range instead of `x` column names.
#> Use a single `cols` at a time with `wb_dims()`
Created on 2023-09-27 with reprex v2.0.2
Other suggestions in https://github.com/JanMarvin/openxlsx2/pull/990#issuecomment-2052130605
Yes I do. But to make it work consistently, some development and lots of testing has to be done. After all there is a reason why it doesn't work right now
I was thinking about a data validation book chapter and was wondering if it makes sense to add sheet names to wb_dims()
. For this wb_dims()
should become a workbook wrapper, since it needs to know the sheet names.
Draft:
wb$dims(sheet = "MySheet", x = mtcars)
#> MySheet!A1:K33
wb$dims(sheet = "My Sheet", x = mtcars)
#> 'My Sheet'!A1:K33
If we grep for !
in dims
we could string split sheet
and dims
from there. In the current data validation list example we use a spreadsheet reference: "'Sheet 2'!$A$1:$A$10"
. This could be provided by something like: wb$dims(sheet = 2, cols = 1, rows = seq_len(10))
. But I'm not entirely sure where else we would benefit from this instead of potentially being closer to openxml formulas.
The following functions would be identical:
wb$to_df(dims = wb$dims(sheet = "Sheet2", x = matrix(1, 2, 2), col_names = FALSE)
wb$to_df(dims = "Sheet2!A1:B2")
wb_to_df(wb, sheet = "Sheet2", dims = wb_dims(wb = wb, x = matrix(1, 2, 2), col_names = FALSE)
wb_to_df(wb, sheet = "Sheet2", dims = "A1:B2")
The data validation example would become something like this:
# data validation example
wb <- wb_workbook()$
add_worksheet("Sheet 1")$
add_worksheet("Sheet 2")$
add_data_table(sheet = 1, x = iris[1:30, ])$
add_data(sheet = 2, x = sample(iris$Sepal.Length, 10))
# with wb_dims draft
wb$add_data_validation(
dims = wb$dims(sheet = 1, x = iris[1:30, ], cols = 1),
type = "list",
# "'Sheet 2'!$A$1:$A$10"
value = wb$dims(sheet = 2, cols = 1, rows = seq_len(10))
)
Further dims
improvements:
It should be possible to select every row/col similar to spreadsheet software:
-
$A:$D
every row in the column range (we partially support this iswb_data()
iirc), at least it is possible in pivot table creation -
$1:$5
every column in the row range -
Support dollar signs in
dims
(it does not matter to us, but we currently fail to read fromwb_to_df(wb, dims = "$A1:$B2")
-
If sheet and dims are written as single string
wb_to_df(wb, dims = "Sheet1!A1:B2")
we should be able to split sheet and dims intosheet
anddims