fastexcel icon indicating copy to clipboard operation
fastexcel copied to clipboard

Parsing cell with formula obscure the current value type

Open robmv opened this issue 1 month ago • 1 comments

I am importing data and don't really have an use for formulas, I need to validate the data type of a cell, and currently if the spreadsheet is using formulas, the calculated result type of it is lost. for example, the cell:

<c r="B3" s="1" t="n">
  <f aca="false">E3</f>
  <v>30</v>
</c>

If I am correct, the calculated type is NUMBER t="n" and the value is 30 <v>30</v>, and the formula is E3 (just an simple test). At:

https://github.com/dhatim/fastexcel/blob/b6df0f7a335c68870ac73a54f794ddeccf7424d2/fastexcel-reader/src/main/java/org/dhatim/fastexcel/reader/RowSpliterator.java#L194

and:

https://github.com/dhatim/fastexcel/blob/b6df0f7a335c68870ac73a54f794ddeccf7424d2/fastexcel-reader/src/main/java/org/dhatim/fastexcel/reader/RowSpliterator.java#L316

but not sure much about this one:

https://github.com/dhatim/fastexcel/blob/b6df0f7a335c68870ac73a54f794ddeccf7424d2/fastexcel-reader/src/main/java/org/dhatim/fastexcel/reader/RowSpliterator.java#L337-L338

If the cell has a formula, the parsed Cell will have FORMULA as its type, the knowledge that the calculated result stored on the cell from that formula is of type NUMBER is lost.

Reading the number value fails because it expects to not have a formula:

https://github.com/dhatim/fastexcel/blob/b6df0f7a335c68870ac73a54f794ddeccf7424d2/fastexcel-reader/src/main/java/org/dhatim/fastexcel/reader/Cell.java#L79-L82

I think CellType should not have an entry for FORMULA, it is not an exclusive thing, for example a cell can contain a value of type NUMBER but being calculated by a formula. Probably is better to have something like:

class Cell ... {
    ...
    public boolean isFormula() {
        return formula != null;
    }
    ...
}

This could be breaking change, Maybe if a non breaking change is needed, an option can be added to ignore formula definitions when parsing. Until a new major release fix this problem, removing FORMULA as an exclusive type of a cell.

I maybe wrong, but my usage of fastexcel-reader is to import data generated by users, and if I lose the data types of the data because of the formulas, I can't import it not being sure there are numbers on one column or strings that resemble numbers.

robmv avatar Nov 05 '25 18:11 robmv