openxlsx2
openxlsx2 copied to clipboard
`wbWorkbook` methods relocation
List of functions that may be able to be moved from the wbWorkbook
class to another class. Suggestions welcomed. Most of these seem to be better suited for the wbWorksheet()
class on very quick inspection.
Todos
-
wbWorksheet
- [ ]
setSheetName()
(add toinitialize()
- [ ]
setRowHeights()
(consideroptions()
setting? (yes!)- requires rework of
rowHeights
, currently public field and can probably be changed to private -
rowHeights
is accessed duringsetRowHeights()
,addWorksheet()
,cloneWorksheet()
(which should be simplified)removeWorksheet()
, andprint()
-- but how is it brought into the final xml?
- requires rework of
- [ ]
createCols()
- [ ]
groupCols()
(is there ungrouping?) - [ ]
groupRows()
- [ ]
dataValidation()
-- how might we deal with list validations that are stored in separate worksheets? Possibly require awbWorksheet()
object be referenced (created before) the newwbWorksheet()
that uses the list? - [ ]
dataValidation_list()
- [ ]
conditionalFormatting()
- [ ]
mergeCells()
(renamecellMergeAdd()
?) - [ ]
removeCellMerge()
(renamecellMergeRemove()
?) - [ ]
freezePanes()
- [ ]
insertImage()
(Even though this one looks like a candidate, it places images in media and needs access to Content_Type. Requires to split the wb-function into separate parts.) - [ ]
addStyle()
- [x] ~~
conditionalFormatCell()
~~ this is deprecated
- [ ]
-
None : No assignments made, can be safely extracted
- [ ]
validateSheet()
- [ ]
validate_table_name()
- [ ]
check_overwrite_tables()
- [ ]
- Unsure
- [ ]
buildtable()
-- need to look into this more
- [ ]
Probably related to #34
There are both ungrouping functions for cols and rows. reworked these lately
Edited above
Re: insertImage()
and Content_Types
/media
. This looks like another one of these instances where if we delay making xml, it should be easier to manage. These could be handled before the actual workbook file creation with some private functions that make the necessary adjustments. It doesn't look easy though, and may be something that has to have at least two simultaneous changes.
Most of the xml is in a limbo state until the final calls of saveWorkbook()
. Parts are already in xml and only a fraction are different (and I like this, all these xml parts do not have to be created on saving. Higher chance of Excel reading the file). With insertImage()
(I have reworked this as well) it is like this - I have appended an asterix to indicate parts that could be handled by a relationship manager as indicated in #44:
- add entry to Content_Types*
- get next free image relationship/media id*
- assign images to media folder
- add extension to Content_Types
- add entry to drawing_rels*
- assign path to self$media
- create drawing
- append drawing as child to self$drawings
Once saveWorkbook()
is called, the files are added to the zip and the xml files are written. It is like this for many functions that do not live only on worksheet. The entire group of chart/drawing/pivotTable/slicer. They all require entries in worksheet_rels and Content_Types. The bad thing with this is, that excel is so #!@*&% picky. All other programs have no problem reading files, but Excel goes "it is broken, I cannot understand this file" if any entry to Content_Types is missing. Therefore it is quite tricky to get everything in order for Excel to understand (I've spend a good amount of time debugging what is troubling this wonderful piece of software).
And I really would want to reduce much of the code that does nothing but
- check if some part of workbook already has entries
- add entries or combine entries
- get next free id
- create some entry with
sprintf
orpaste
and hope that there wasn't already something; or that the rId was not already taken; or that it will be replaced by something else when writing; or that we use the the correct rId; or that it will survive any modification of the worksheet (adding sheets, removing sheets, adding drawings/tables etc.) - add this entry to various other places inside the code without any consistency checking.
But it is so darn hard to get to the root of what a function actually does :smiling_face_with_tear:
And I really would want to reduce much of the code that does nothing but
- check if some part of workbook already has entries
- add entries or combine entries
- get next free id
- create some entry with
sprintf
orpaste
and hope that there wasn't already something; or that the rId was not already taken; or that it will be replaced by something else when writing; or that we use the the correct rId; or that it will survive any modification of the worksheet (adding sheets, removing sheets, adding drawings/tables etc.)- add this entry to various other places inside the code without any consistency checking.
But it is so darn hard to get to the root of what a function actually does 🥲
100% on this. I don't think we need to check over, and over, and append, and append on so many small actions (newest edition to this family of problems is #57). This makes everything else so much more complicated and likely slows the package. I can see an eventual private function wb$gather_contents()
that does it only once and virtually eliminates it from all the other functions.
For the wrappers we could consider workbook
and worksheet
wrappers for example:
wb_freeze_panes(wb, sheet, ...) {
# sheet number or name
sheet <- validate_sheet(wb, sheet) # I think this is right?
# clone workbook
wb <- wb$clone()
# clone worksheet, update
ws <- wb$get_sheet(sheet)$clone()$freeze_panes(...)
# update worksheet
wb$set_sheet(sheet, ws)
}
ws_freeze_panes(ws, ...) {
ws$clone()$freeze_panes(...)
}