xlconnect
xlconnect copied to clipboard
writeNamedRegion does not overwrite existing formula cells
writeNamedRegion
(and probably similarly writeWorksheet
) won't overwrite existing formula cells (cells with cell type FORMULA
). This seems to be a change in behavior after XLConnect 0.2-15 (possibly in underlying POI).
Consider the following reproducible example:
require(XLConnect)
# Create test data
wb <- loadWorkbook("repro.xlsx", create = TRUE)
createSheet(wb, "repro")
createName(wb, name = "repro", formula = "repro!$A$1")
data <- data.frame(A = 1, B = 0)
writeNamedRegion(wb, data = data, name = "repro")
setCellFormula(wb, sheet = "repro", row = 2, col = 2, formula = "A2 + 100")
saveWorkbook(wb)
# Quick check
wb <- loadWorkbook("repro.xlsx")
data <- readNamedRegion(wb, name = "repro")
stopifnot(identical(data, data.frame(A = 1, B = 101)))
# Reproducing the problem
newData <- data.frame(A = 888, B = 666)
wb <- loadWorkbook("repro.xlsx")
existingData <- readNamedRegion(wb, name = "repro")
writeNamedRegion(wb, data = newData, name = "repro")
saveWorkbook(wb, file = "new.xlsx")
wb <- loadWorkbook("new.xlsx")
writtenData <- readNamedRegion(wb, name = "repro")
stopifnot(identical(writtenData, data.frame(A = 888, B = 666)))
The above example (final check) passes with XLConnect 0.2-15 but it does not pass with XLConnect >= 1.0.0.
On the POI side, it is documented (at least since POI 3.17) that Cell.setCellValue
"will set the precalculated value" for formula cells (see https://poi.apache.org/apidocs/3.17/org/apache/poi/ss/usermodel/Cell.html#setCellValue(double) or https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/Cell.html#setCellValue-double- for current POI).
"precalculated" value seems to refer to cached values. And this seems indeed to be the case: if you use writtenData <- readNamedRegion(wb, name = "repro", useCachedValues = TRUE)
the above passes with XLConnect >= 1.0.0. Alternatively, introducing a clearNamedRegion(wb, name = "repro")
before overwriting the data makes the above pass since clearNamedRegion
effectively removes the cells.
I see several possibilities how we could address this:
- always clear cells before using
setCellValue
(Cell.setBlank
should probably do the job) - this would forcibly overwrite formula cells always - introduce a new argument
overwriteFormulaCells = TRUE
towriteNamedRegion
andwriteWorksheet
to control behavior as to whether formula cells should be cleared (setBlank
) or not - introduce a new argument
clearCells = TRUE
towriteNamedRegion
andwriteWorksheet
to control behavior as to whether cells should be cleared in general before writing - this is quite similar to the above with the difference that a clear would always be done if set toTRUE
(note, however, that this is not equivalent to issuing aclearNamedRegion
sinceclearNamedRegion
would always clear the full named region whilewriteNamedRegion(..., clearCells=TRUE)
would only clear the portion of the named region that is effectively being overwritten)
Option 2 seems to more directly express the purpose of the operation while option 3 seems to be closer to the existing clearNamedRegion
. I'm also open for other options.
Since this behavior has been in place for a while, I'm hesitant to make the default TRUE. I'm more in favor of option 2 as it is clearer why it would be needed - according to the doc, other types of cells would be set to a new value anyway, correct ?
@spoltier
other types of cells would be set to a new value anyway, correct ?
Yes that is my understanding. The question is what users expect in terms of default behavior. I think my default expectation is that the data I provide overwrites any cells independent of their type. I think it is useful to be able to keep formula cells though. That's why I would personally opt for option 2 with default TRUE
. Interested to hear what others think. @Chargothrond @RolandASc
My default expectation would also be that writeNamedRegion / writeWorkSheet would overwrite formula cells, especially since that used to be the behavior.
Option 2 makes most sense to me as well, with the option to set overwriteFormulaCells to FALSE (i.e. default TRUE).