XLSX.jl

Promote mixed int/float columns to float

Open nilshg opened this issue 2 years ago • 1 comment

In Excel files which have decimal and round numbers in them, the round numbers are often displayed as integers unless the user deliberately adds decimal places.

When such a column is read in, its element type currently ends up as Any, even when using infer_eltypes = true.

In my opinion, there are three ways of dealing with this:

  • Promoting to float - I think this is preferable, and likely what the user wants 99% of the time
  • Returning a Union{Int, Float64} column - preserves the type in case integers were intentionally stored as such in Excel
  • Returning a Real column - probably not ideal given the non-concrete type

Any of them is probably better than returning Any.
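To make the three options concrete, here is a minimal sketch using only Base Julia. The vector `raw` is a made-up stand-in for a column read from a sheet, not actual XLSX.jl output, and the variable names are just for illustration.

```julia
# Hypothetical stand-in for a mixed column as it might be returned today.
raw = Any[1, 2.5, 3]

# Option 1: promote every value to Float64 (the integers become 1.0 and 3.0).
as_float = convert(Vector{Float64}, raw)

# Option 2: keep each value's original type inside a small Union.
as_union = convert(Vector{Union{Int, Float64}}, raw)

# Option 3: fall back to the abstract supertype Real.
as_real = convert(Vector{Real}, raw)

# Only option 1 yields a concrete element type.
@show eltype(as_float) isconcretetype(eltype(as_float))
@show eltype(as_union) isconcretetype(eltype(as_union))
@show eltype(as_real) isconcretetype(eltype(as_real))
```

Options 2 and 3 keep the integers as Int, while option 1 trades that distinction for a concrete Float64 element type.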

nilshg avatar Mar 07 '22 17:03 nilshg

One further observation on this: Julia Base would promote to float here:

julia> [1.0, 2.0, 3]
3-element Vector{Float64}:
 1.0
 2.0
 3.0

which strongly points to option 1 being the best choice (indeed, if one calls identity.(df) on the resulting DataFrame, the Any columns turn into Float64).
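The same promotion rule that the array literal uses can be applied to an existing Any column. This is only a sketch of what option 1 could look like: `promote_column` is a hypothetical helper, not part of XLSX.jl, and it assumes a non-empty column with no missing values.

```julia
# Hypothetical helper (not an XLSX.jl function): narrow a column using
# Base's promotion rules, the same ones behind [1.0, 2.0, 3].
function promote_column(col::AbstractVector)
    # Fold promote_type over the runtime type of every value.
    # Assumes col is non-empty; mapreduce needs an init value otherwise.
    T = mapreduce(typeof, promote_type, col)
    return convert(Vector{T}, col)
end

col = Any[1.0, 2.0, 3]
promoted = promote_column(col)
@show eltype(promoted)
```

For types with no common promotion, such as a column mixing numbers and strings, `promote_type` returns Any and the column is left effectively unchanged.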

nilshg avatar Mar 29 '22 09:03 nilshg