org-mode-ox-odt icon indicating copy to clipboard operation
org-mode-ox-odt copied to clipboard

Add a way to convert an XLSX/ODS file with formula in to a csv file such that it provides insights in to the formula being used

Open kjambunathan opened this issue 3 years ago • 7 comments

Add a way to convert an XLSX/ODS file with formula in to a csv file such that it provides insights in to the formula being used

This issue is filed on behalf of @ouboub

This is a sister bug of Add a command to convert an Org table to xlsx · Issue #10 · kjambunathan/org-mode-ox-odt.

See this thread export org table to other formats (gnumeric or scalc or xlsx)

And also this issue import of xlsx failed in https://hub.stenci.la · Issue #1030 · stencila/stencila

An example ods file attached to that file is also attached to this issue

Exper-sten.ods

kjambunathan avatar Jul 17 '21 06:07 kjambunathan

Add a way to convert an XLSX/ODS file with formula in to a csv file such that it provides insights in to the formula being used

(Exper-sten.ods will be used for the purpose of this example.)

First I outline a solution with LibreOffice UI and then outline how that solution may be carried out in command line.

Solution using LibreOffice UI

Enable Macro Recording in LibreOffice

Do LibreOffice Menubar -> Tools -> Options -> LibreOfficeDev -> Advanced -> Optional Features -> Enable macro recording (may be limited) -> Turn On this checkbox

Screenshot from 2021-07-17 12-02-55

Save XLSX/ODS fil as CSV file with the following options.

Calc's Menubar -> File -> Save as -> choose File type as TextCSV(.csv) & also turn on Edit Filter Settings -> Save -> Confirm File Format as Use TextCSV Format -> enable Save Cell formulas instead of calculated values -> DONE.

See screenshot for details

Screenshot from 2021-07-17 12-08-22

Screenshot from 2021-07-17 12-13-55

Here is a CSV file created with the above procedure:

Exper-sten.csv

Now repeat step (2) with macro recording enabled

Calc Menubar -> Tools -> Macros -> Record Macro.

Once the ODS file is saved as CSV, save the macro to your Basic Library at some location. Now inspect the Basic Code created by the macro recorder. Here is the code that LibreOffice created for me:


REM  *****  BASIC  *****


sub Main
rem ----------------------------------------------------------------------
rem define variables
dim document   as object
dim dispatcher as object
rem ----------------------------------------------------------------------
rem get access to the document
document   = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")

rem ----------------------------------------------------------------------
dim args1(2) as new com.sun.star.beans.PropertyValue
args1(0).Name = "URL"
args1(0).Value = "file:///home/kjambunathan/Downloads/junk/Exper-sten1.csv"
args1(1).Name = "FilterName"
args1(1).Value = "Text - txt - csv (StarCalc)"
args1(2).Name = "FilterOptions"
args1(2).Value = "44,34,76,1,,0,false,true,false,true,false"

dispatcher.executeDispatch(document, ".uno:SaveAs", "", 0, args1())


end sub

Make a note of the FilterOptions in the above Basic code which is

"44,34,76,1,,0,false,true,false,true,false"

We will be passing the above option to the command line.

Solution using Command Line

For comparing the results of conversion using UI and command line, rename Exper-sten.ods to test.ods. Use this test file with command line.

libreofficedev7.2 -convert-to csv:"Text - txt - csv (StarCalc)":44,34,76,1,,0,false,true,false,true,false test.ods

(I use LibreOffice-distributed nightly build. So, the command for me is libreofficedev7.2. In standard Debian-packaged LibreOffice, replace it with soffice. (I think) the binary is called soffice.bin on Windows. It is also called differently on Macs. Use whatever is appropriate for your installation.)

Here is the test.csv file produced by above command

test.csv

Now open a test.org buffer within Emacs, and do M-x org-table-import on the above file. Here is the test.org created.

test.org.txt

Now when in org-mode buffer do Menu bar -> Tbl -> Show Col/Row Numbers bound to C-c } or equivalently M-x org-table-toggle-coordinate-overlays

(custom-set-variables
 '(org-table-use-standard-references t))

This is a screenshot of what you will get. You will get a overview of the formulae used.

Screenshot from 2021-07-17 12-42-19

Screenshot from 2021-07-17 12-42-00

kjambunathan avatar Jul 17 '21 07:07 kjambunathan

libreofficedev7.2 -convert-to csv:"Text - txt - csv (StarCalc)":44,34,76,1,,0,false,true,false,true,false test.ods

The recipe shared here is a slightly expanded version of How do I export and import formulas to CSV? [closed] - Ask LibreOffice.

It looks like a reverse process of exporting formulae-embedded CSV -> regular formulae-embedded XLSX/ODS is also possible. Need to figure out whether it is possible via command line, and what the command line options are.

kjambunathan avatar Jul 17 '21 07:07 kjambunathan

"JK" == Jambunathan K @.***> writes:

Add a way to convert an XLSX/ODS file with formula in to a csv file such that it provides insights in to the formula being used

(Exper-sten.ods will be used for the purpose of this example.)

libreofficedev7.2 -convert-to csv:"Text - txt - csv (StarCalc)":44,34,76,1,,0,false,true,false,true,false test.ods

Here is the test.csv file produced by above command

test.csv

Thanks, I repeated the steps above (but only for the command line) and can reproduce your results. That is a very interesting approach, I went step further and imported the file into org mode. It requires some manual adjustment, since the concepts are quite different (org table is based on operations entirely on a column represented for example by F&) while scalc and other spreadsheets use a different notation. Moreover in org-table the if needs, both conditions specified so

The scalc formula

=IF(F4>$K$27,(F4-$K$27)$AE$3)+MIN($K$27,F4)$G$3+MIN($K$27,G4)$H$3+IF(H4>$K$27,(H4-$K$27)$AE$3)+MIN($K$27,H4)$I$3+I4$J$3

Has to be

=if(F&>K27,(F&-K27)*AE3,0)+MIN(K27,F&)*G3+MIN(K27,G4)*H3+IF(H&>K27,(H&-K27)*AE3,0)+MIN(K27,H&)*I3+I&*J3

In org mode

But that cannot be help at the moment I think

ouboub avatar Jul 17 '21 08:07 ouboub

tidyxl-https://nacnudus.github.io/tidyxl/articles/tidyxl.html -, a R package is an interesting package that is worth trying

It could help you deal with a "wall-of-formula-text", and help you get your head around it.

The package documentation claims that it should be able to identify what it calls as shared formulas. I am quoting from the documentation

Shared formulas

Cells A20 and A21 illustrate how formulas are normalised before being written to file, using the formula_ref and formula_group columns. When there is a group of cells whose formulas only differ by cell reference (e.g. “=A1+1”, “=A2+1”, “=A3+1”, etc.), only one formula in each group is written to the file, so tidyxl infers what the formulas in the other cells in the group must be, from their relative positions.

Going by that claim it should be able to "group" the formulae that occur in column J at cells J4, J5 etc. But it doesn't. It looks like a bug or an incomplete feature to be. You may find it worthwhile to file a bug against the tidyxl package. The benefit that you get is that you will know which formulae are merely a repetition of a templat, and which aren't. This way you can focus on the "unique" formulae. The benefit to the tidyxl package is that they get to see a usecase. (Btw, tidyxl seems to be associated with the UK government)

test.xlsx

rsession.txt

not-recognised-as-formula-group.txt

I thought I will share this with you ... even though it has nothing to do with the ODT exporter as such.

kjambunathan avatar Jul 18 '21 11:07 kjambunathan

thanks a lot, that looks quite interesting. If I understand it correctly, the package should convert from xlsx to R. (that could be included into orgmode, for course)

ouboub avatar Jul 18 '21 13:07 ouboub

No. It identifies formula groups.

On Sun, 18 Jul, 2021, 18:45 Uwe Brauer, @.***> wrote:

thanks a lot, that looks quite interesting. If I understand it correctly, the package should convert from xlsx to R. (that could be included into orgmode, for course)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kjambunathan/org-mode-ox-odt/issues/94#issuecomment-882055068, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEXTBKDV7VYXT6OLIUHI6VDTYLHY7ANCNFSM5AQYEVJA .

kjambunathan avatar Jul 18 '21 14:07 kjambunathan

The other way around, from CVS-->ods with formulas.

I started with

| First |   |     |
|-------+---+-----|
|     3 | 6 |   9 |
|     1 | 3 |   4 |
|     1 | 2 |   3 |
|     8 | 9 |  17 |
|   100 | 1 | 101 |
#+TBLFM: $3=vsum($1..$2);f-2::@2$3=vsum(@2$1..@2$2)

and that I add to change to


|   3 | 6 | =SUM($A1:$B1) |
|   1 | 3 | =SUM($A2:$B2) |
|   1 | 2 | =SUM($A3:$B3) |
| 100 | 1 | =SUM($A4:$B4) |

and then to export to csv, and then to important to LO, it worked. I wounder whether there could be any faster method, maybe a lisp function that converts

vsum($1..$2); to =SUM($A1:$B1)

I will also asks this on the org-mailing list. Not that I make myself too many illustions.

ouboub avatar Jul 12 '22 06:07 ouboub

I have never used Org table formula or calc. I need to catch up on the documentation. I will look in to your request.

Please upload a org file with table formulas, which comprehensively covers the formula features you use, or is "good enough" (for you), I will use that as a starting point for further investigation.

kjambunathan avatar Jul 12 '22 09:07 kjambunathan

Thanks, I attach a zip file with a README file that explains the purpose of each file HG-CVS-Formular.zip

ouboub avatar Jul 13 '22 07:07 ouboub

Which one is the most useful

  1. Start from org, and end up with ods
  2. Start from ods, and end up with org

Pick one of the above. I will attempt that first.

kjambunathan avatar Jul 13 '22 15:07 kjambunathan

"JK" == Jambunathan K @.***> writes:

Which one is the most useful

  1. Start from org, and end up with ods
  2. Start from ods, and end up with org

I would definitely consider 1 the more important problem.

-- I strongly condemn Putin's war of aggression against the Ukraine. I support to deliver weapons to Ukraine's military. I support the ban of Russia from SWIFT. I support the EU membership of the Ukraine.

ouboub avatar Jul 13 '22 17:07 ouboub