dataframe icon indicating copy to clipboard operation
dataframe copied to clipboard

Excel add new sheet without overwriting the file

Open LeandroC89 opened this issue 2 years ago • 3 comments

Hello, I've made a couple changes in order to be able to add sheets to an already existing file (without completely replacing it).

Overview

By using a keepFile parameter the user can now decide to add a new sheet to an Excel file, this parameter is optional and defaults to false to keep the current behavior by default.

Implementation

keepFile

  • Added the new parameter as an optional setting allowing the user to add a sheet to an existing file (if set as true), leaving it blank or setting as false will keep current behavior, completely overwriting the file.

Factory

I had to update the logic by which the workbook was determined for a couple reasons:

  1. Current setting would always create a new file, so a new flow had to be added
  2. Keeping the factory function was generating 0b files when trying to reuse existing file, so it is now returning a WorkBook (keeping the wb type logic as is for xls and xlsx files).

Please let me know if something here needs to be updated as this was the less easy part of the change

Test

  • Added test to test the feasibility of creating/overwriting an Excel file with data for a sheet1, and then the same data is written on a sheet2.
    • Test checks if both sheet contents are the same

.gitignore

  • Added new entry so the Excel file generated by the new test is not kept in version control, as it would have changes everytime the tests were rerun.

Final comments

I hope this helps, and please let me know if there is something I should update or that could have been done in a better way. Have a good weekend!

LeandroC89 avatar Aug 28 '22 01:08 LeandroC89

Hi! Thank you for the PR! What do you think about creating Workbook in the user code? Like

val wb = ...
df1.writeExcel(wb, "Sheet1")
df2.writeExcel(wb, "Sheet2")

I like your idea better, because this way people won't have to deal with factories, streams. But still interesting to understand alternatives

koperagen avatar Aug 29 '22 12:08 koperagen

Hi, not a bad a idea, having null create and if a wb is provided it would add. I could even add that having writeExcel return the wb would help in this situation as I see most people would use this to create a new file and then add sheets to it, so this would be an easier way to get the wb.

However, this would require people to create the wb on their own and import the dependencies on their own (and apply the xls/xlsx logic that determines HSSF/XSSF). I feel this latest bit justifies leaving this to the function instead. Specially in Notebooks where the project can be imported with

%use dataframe

I'm not entirely sure as I haven't touched notebooks in a while but I believe in this situation the dependency to the workbook would also have to be added with the import statement which is not as straightforward.

Considering the pros and cons I would have this handled by the function writeExcel itself, although it can also be overloaded to allow both behaviors. (I can also submit this bit if needed)

LeandroC89 avatar Aug 29 '22 16:08 LeandroC89

I agree, and writing several sheets is a common enough task to be solved by writeExcel itself. So keepFile is good. Could you also add a paragraph with an example here https://github.com/Kotlin/dataframe/blob/master/docs/StardustDocs/topics/write.md? Create a new function in org.jetbrains.kotlinx.dataframe.samples.api.Write and follow the instruction https://github.com/Kotlin/dataframe/blob/master/docs/contributions.md.

koperagen avatar Sep 07 '22 19:09 koperagen

@LeandroC89 Could you update the pr to the newest master? Then we might be able to merge it for the 0.9.0 release

Jolanrensen avatar Nov 28 '22 13:11 Jolanrensen

Sorry it took so long. At the time there was an issue with Korro, and I haven't checked this one in a while after the import was removed! It touches a few more files now due to having taken all the changes since originally PR'ed.

LeandroC89 avatar Dec 17 '23 21:12 LeandroC89

Run formatKotlin task please, then i can merge it

koperagen avatar Dec 19 '23 11:12 koperagen