duckdb_spatial icon indicating copy to clipboard operation
duckdb_spatial copied to clipboard

Excel import of time column incorrectly casts to double/gibberish if no leading 0 present

Open Alex-Monahan opened this issue 2 years ago • 1 comments

Howdy! When using the nice workaround of loading Excel files into DuckDB using the spatial extension, time data is having issues if a leading zero is not present. I've attached 2 files: one that works correctly (which has leading zeroes), and one that fails.

Works: (01:00:00 format)

SELECT * 
FROM st_read('time_data_with_blanks.xlsx', layer='Sheet1');
time_column
00:26:00
NULL
02:26:00
03:26:00
04:26:00
05:26:00
06:26:00
07:26:00
08:26:00
09:26:00
10:26:00
11:26:00
12:26:00
13:26:00
14:26:00
15:26:00
16:26:00
17:26:00
18:26:00
19:26:00
20:26:00
21:26:00
22:26:00
23:26:00
00:26:00

Fails: (1:00:00 format)

SELECT * 
FROM st_read('time_data_with_blanks_no_leading_zero.xlsx', layer='Sheet1');
time_column
0.018055555555555554
NULL
0.10138888888888889
0.14305555555555555
0.18472222222222223
0.2263888888888889
0.26805555555555555
0.30972222222222223
0.35138888888888886
0.39305555555555555
0.43472222222222223
0.47638888888888886
0.5180555555555556
0.5597222222222222
0.6013888888888889
0.6430555555555556
0.6847222222222222
0.7263888888888889
0.7680555555555556
0.8097222222222222
0.8513888888888889
0.8930555555555556
0.9347222222222222
0.9763888888888889
1.0180555555555555

time_data_with_blanks.xlsx time_data_with_blanks_no_leading_zero.xlsx

Alex-Monahan avatar Aug 04 '23 20:08 Alex-Monahan

Hey!

Since this is an issue with an upstream third-party dependency we probably won't invest too much energy into fixing this, but a workaround would be to pass the newly added (merged in #114) FIELD_TYPES=[AUTO/STRING] "open option" to read all the fields as VARCHAR, at which point you can use DuckDB's casting logic to convert from VARCHAR to whatever types you expect.

Maxxen avatar Aug 14 '23 08:08 Maxxen

Hello!

As of DuckDB 1.2.0 the duckdb excel extension provides support for reading and writing xlsx files with greater efficiency and many more options. Therefore we're going to deprecate this feature in the spatial extension in the future. I would recommend that you try out the excel extension, and if you run into any problems please open an issue over at the duckdb-excel repository.

Maxxen avatar Feb 05 '25 21:02 Maxxen