fastexcel icon indicating copy to clipboard operation
fastexcel copied to clipboard

Option to retain special symbols/ errors in excel as a string

Open durgeksh opened this issue 10 months ago • 2 comments

Hi team, in excel, we have special symbols like #VALUE!, #DIV/0! etc. As of now, these symbols are turned into null. Can we have an option to retain them as it is as a string?

Thank you for the wonderful and performant wrapper.

durgeksh avatar Apr 08 '24 12:04 durgeksh

https://github.com/ToucanToco/fastexcel/issues/220#issuecomment-2042627835 for reference

PrettyWood avatar Apr 10 '24 09:04 PrettyWood

Hello @durgeksh I pushed https://github.com/ToucanToco/fastexcel/pull/293 to showcase the current behavior, which seems ok to me. If it's not can you please share a file and explain the current behavior and the desired one?

PrettyWood avatar Oct 05 '24 10:10 PrettyWood

Hello @PrettyWood , I've gone over the tests pushed in #293 and I'm unable to replicate the desired behaviour. I can't find a way to read excel error codes as strings - they are always transcribed as null. My environment:

Linux kernel 6.8.0 on x86-64
Ubuntu 24.04.1 LGS

In the following tables, the column 'from_m' has 5 broken formulas, so the error code for each is #REF!. The column litho_unit is empty, so its null values are expected and desired. Herein lies the problem: if I'm unable to display and query for excel error codes, then I can't differentiate true null values from those nulls that are hiding broken excel functionality. Information is being lost. Note that all columns in the source excel sheet are nominally 'text' type; that is for workflow reasons beyond my control. Regardless, the source column type should not impact the display of Excel error codes: in this case, the columns should be brought in as strings, and the five cells with broken formulas should display #REF! in the dataframe. Apologies for the long format, but I want to show my 'tests' in full:

$ python --version                                                                                                                                                                                                                                            [127:NOTFND]|ARIS-Scripts|off|.aris|ref_error_handling
Python 3.12.6
$ uv pip show polars fastexcel                                                                                                                                                                                                                                         [0]|ARIS-Scripts|off|.aris|ref_error_handling
Using Python 3.12.6 environment at .aris
Name: fastexcel
Version: 0.12.0
Location: /path/to/venv
Requires: pyarrow
Required-by:
---
Name: polars
Version: 1.14.0
Location: /path/to/venv
Requires:
Required-by:
$ python                                                                                                                                                                                                                                                               [0]|ARIS-Scripts|off|.aris|ref_error_handling
Python 3.12.6 (main, Sep  9 2024, 22:11:19) [Clang 18.1.8 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import polars as pl
>>> partial_path = "/path/to/30061_test_broken_formula_partial_column_litho.xlsx"
>>> df = pl.read_excel(partial_path, read_options={'schema_sample_rows':30,})
>>> with pl.Config(set_tbl_rows=30): print(df)
shape: (29, 5)
┌───────────┬────────┬───────┬────────────┬─────────────────────────────────┐
│ hole_name ┆ from_m ┆ to_m  ┆ litho_unit ┆ comments                        │
│ ---       ┆ ---    ┆ ---   ┆ ---        ┆ ---                             │
│ str       ┆ str    ┆ str   ┆ null       ┆ str                             │
╞═══════════╪════════╪═══════╪════════════╪═════════════════════════════════╡
│ S07-01    ┆ 0.0    ┆ 2.2   ┆ null       ┆ overburden                      │
│ S07-01    ┆ 2.2    ┆ 150.0 ┆ null       ┆ Dacitic Ash Flow Tuff (TS)      │
│ S07-01    ┆ 2.2    ┆ 3.0   ┆ null       ┆ Csg. Fresh Dacitic Ash Flow Tu… │
│ S07-01    ┆ 3.0    ┆ 4.0   ┆ null       ┆ Fresh daft.  Bockv fracturino   │
│ S07-01    ┆ 4.0    ┆ 4.15  ┆ null       ┆ 15cm of coarse 3.0-5.0mm feld … │
│ S07-01    ┆ 5.3    ┆ 6.0   ┆ null       ┆ around core - no record         │
│ S07-01    ┆ 6.55   ┆ 6.95  ┆ null       ┆ Silty sand- type material reco… │
│ S07-01    ┆ 7.0    ┆ 7.5   ┆ null       ┆ Bleached, altered dacite, no s… │
│ S07-01    ┆ 8.0    ┆ 9.0   ┆ null       ┆ fresh blocky fracture TS        │
│ S07-01    ┆ 9.9    ┆ 10.0  ┆ null       ┆ Mod. Clav altered TS one thin … │
│ S07-01    ┆ null   ┆ 11.0  ┆ null       ┆ Mod clay altered TS  Blocky br… │
│ S07-01    ┆ null   ┆ 12.0  ┆ null       ┆ very weekly altered TS broken … │
│ S07-01    ┆ null   ┆ 13.8  ┆ null       ┆ 12.0-13.8mostly gr. And lost g… │
│ S07-01    ┆ null   ┆ 14.7  ┆ null       ┆ Mod clay altered zone with qua… │
│ S07-01    ┆ null   ┆ 16.0  ┆ null       ┆ wk-mod clay altered TS          │
│ S07-01    ┆ 16.0   ┆ 17.0  ┆ null       ┆ wk clay altered TS              │
│ S07-01    ┆ 18.2   ┆ 18.3  ┆ null       ┆ Cal healino bx TX at  20-30° C… │
│ S07-01    ┆ 19.1   ┆ 19.95 ┆ null       ┆ Very intensely clay altered TS… │
│ S07-01    ┆ 20.0   ┆ 21.0  ┆ null       ┆ fresh to very weakly clay alte… │
│ S07-01    ┆ 21.1   ┆ 21.95 ┆ null       ┆ Highly broken crumbly core, mo… │
│ S07-01    ┆ 22.0   ┆ 23.0  ┆ null       ┆ Fresh to  v.  weak altered TS   │
│ S07-01    ┆ 23.0   ┆ 24.0  ┆ null       ┆ verv uniform unalterd TS        │
│ S07-01    ┆ 24.0   ┆ 28.0  ┆ null       ┆ uniform unaltered TS            │
│ S07-01    ┆ 28.0   ┆ 29.0  ┆ null       ┆ rock is weaklv oxidized. oranq… │
│ S07-01    ┆ 29.1   ┆ 29.4  ┆ null       ┆ finer grained band of tuff in … │
│ S07-01    ┆ 31.05  ┆ 32.20 ┆ null       ┆ Several lim coated fractures a… │
│ S07-01    ┆ 34.5   ┆ 35.0  ┆ null       ┆ major fault - gg and soft dk g… │
│ S07-01    ┆ 35.0   ┆ 36.0  ┆ null       ┆ soft partly bx, fr and altd fo… │
│ S07-01    ┆ 36.0   ┆ 37.15 ┆ null       ┆ Perv K-spr altd zone with qtz-… │
└───────────┴────────┴───────┴────────────┴─────────────────────────────────┘
>>> df = pl.read_excel(partial_path, read_options={'schema_sample_rows':1,'dtypes':'string'})
>>> with pl.Config(set_tbl_rows=30): print(df)
shape: (29, 5)
┌───────────┬────────┬───────┬────────────┬─────────────────────────────────┐
│ hole_name ┆ from_m ┆ to_m  ┆ litho_unit ┆ comments                        │
│ ---       ┆ ---    ┆ ---   ┆ ---        ┆ ---                             │
│ str       ┆ str    ┆ str   ┆ str        ┆ str                             │
╞═══════════╪════════╪═══════╪════════════╪═════════════════════════════════╡
│ S07-01    ┆ 0.0    ┆ 2.2   ┆ null       ┆ overburden                      │
│ S07-01    ┆ 2.2    ┆ 150.0 ┆ null       ┆ Dacitic Ash Flow Tuff (TS)      │
│ S07-01    ┆ 2.2    ┆ 3.0   ┆ null       ┆ Csg. Fresh Dacitic Ash Flow Tu… │
│ S07-01    ┆ 3.0    ┆ 4.0   ┆ null       ┆ Fresh daft.  Bockv fracturino   │
│ S07-01    ┆ 4.0    ┆ 4.15  ┆ null       ┆ 15cm of coarse 3.0-5.0mm feld … │
│ S07-01    ┆ 5.3    ┆ 6.0   ┆ null       ┆ around core - no record         │
│ S07-01    ┆ 6.55   ┆ 6.95  ┆ null       ┆ Silty sand- type material reco… │
│ S07-01    ┆ 7.0    ┆ 7.5   ┆ null       ┆ Bleached, altered dacite, no s… │
│ S07-01    ┆ 8.0    ┆ 9.0   ┆ null       ┆ fresh blocky fracture TS        │
│ S07-01    ┆ 9.9    ┆ 10.0  ┆ null       ┆ Mod. Clav altered TS one thin … │
│ S07-01    ┆ null   ┆ 11.0  ┆ null       ┆ Mod clay altered TS  Blocky br… │
│ S07-01    ┆ null   ┆ 12.0  ┆ null       ┆ very weekly altered TS broken … │
│ S07-01    ┆ null   ┆ 13.8  ┆ null       ┆ 12.0-13.8mostly gr. And lost g… │
│ S07-01    ┆ null   ┆ 14.7  ┆ null       ┆ Mod clay altered zone with qua… │
│ S07-01    ┆ null   ┆ 16.0  ┆ null       ┆ wk-mod clay altered TS          │
│ S07-01    ┆ 16.0   ┆ 17.0  ┆ null       ┆ wk clay altered TS              │
│ S07-01    ┆ 18.2   ┆ 18.3  ┆ null       ┆ Cal healino bx TX at  20-30° C… │
│ S07-01    ┆ 19.1   ┆ 19.95 ┆ null       ┆ Very intensely clay altered TS… │
│ S07-01    ┆ 20.0   ┆ 21.0  ┆ null       ┆ fresh to very weakly clay alte… │
│ S07-01    ┆ 21.1   ┆ 21.95 ┆ null       ┆ Highly broken crumbly core, mo… │
│ S07-01    ┆ 22.0   ┆ 23.0  ┆ null       ┆ Fresh to  v.  weak altered TS   │
│ S07-01    ┆ 23.0   ┆ 24.0  ┆ null       ┆ verv uniform unalterd TS        │
│ S07-01    ┆ 24.0   ┆ 28.0  ┆ null       ┆ uniform unaltered TS            │
│ S07-01    ┆ 28.0   ┆ 29.0  ┆ null       ┆ rock is weaklv oxidized. oranq… │
│ S07-01    ┆ 29.1   ┆ 29.4  ┆ null       ┆ finer grained band of tuff in … │
│ S07-01    ┆ 31.05  ┆ 32.20 ┆ null       ┆ Several lim coated fractures a… │
│ S07-01    ┆ 34.5   ┆ 35.0  ┆ null       ┆ major fault - gg and soft dk g… │
│ S07-01    ┆ 35.0   ┆ 36.0  ┆ null       ┆ soft partly bx, fr and altd fo… │
│ S07-01    ┆ 36.0   ┆ 37.15 ┆ null       ┆ Perv K-spr altd zone with qtz-… │
└───────────┴────────┴───────┴────────────┴─────────────────────────────────┘
>>> df = pl.read_excel(partial_path, read_options={'schema_sample_rows':1,'dtypes':{'from_m':'string'}})
>>> with pl.Config(set_tbl_rows=30): print(df)
shape: (29, 5)
┌───────────┬────────┬───────┬────────────┬─────────────────────────────────┐
│ hole_name ┆ from_m ┆ to_m  ┆ litho_unit ┆ comments                        │
│ ---       ┆ ---    ┆ ---   ┆ ---        ┆ ---                             │
│ str       ┆ str    ┆ str   ┆ null       ┆ str                             │
╞═══════════╪════════╪═══════╪════════════╪═════════════════════════════════╡
│ S07-01    ┆ 0.0    ┆ 2.2   ┆ null       ┆ overburden                      │
│ S07-01    ┆ 2.2    ┆ 150.0 ┆ null       ┆ Dacitic Ash Flow Tuff (TS)      │
│ S07-01    ┆ 2.2    ┆ 3.0   ┆ null       ┆ Csg. Fresh Dacitic Ash Flow Tu… │
│ S07-01    ┆ 3.0    ┆ 4.0   ┆ null       ┆ Fresh daft.  Bockv fracturino   │
│ S07-01    ┆ 4.0    ┆ 4.15  ┆ null       ┆ 15cm of coarse 3.0-5.0mm feld … │
│ S07-01    ┆ 5.3    ┆ 6.0   ┆ null       ┆ around core - no record         │
│ S07-01    ┆ 6.55   ┆ 6.95  ┆ null       ┆ Silty sand- type material reco… │
│ S07-01    ┆ 7.0    ┆ 7.5   ┆ null       ┆ Bleached, altered dacite, no s… │
│ S07-01    ┆ 8.0    ┆ 9.0   ┆ null       ┆ fresh blocky fracture TS        │
│ S07-01    ┆ 9.9    ┆ 10.0  ┆ null       ┆ Mod. Clav altered TS one thin … │
│ S07-01    ┆ null   ┆ 11.0  ┆ null       ┆ Mod clay altered TS  Blocky br… │
│ S07-01    ┆ null   ┆ 12.0  ┆ null       ┆ very weekly altered TS broken … │
│ S07-01    ┆ null   ┆ 13.8  ┆ null       ┆ 12.0-13.8mostly gr. And lost g… │
│ S07-01    ┆ null   ┆ 14.7  ┆ null       ┆ Mod clay altered zone with qua… │
│ S07-01    ┆ null   ┆ 16.0  ┆ null       ┆ wk-mod clay altered TS          │
│ S07-01    ┆ 16.0   ┆ 17.0  ┆ null       ┆ wk clay altered TS              │
│ S07-01    ┆ 18.2   ┆ 18.3  ┆ null       ┆ Cal healino bx TX at  20-30° C… │
│ S07-01    ┆ 19.1   ┆ 19.95 ┆ null       ┆ Very intensely clay altered TS… │
│ S07-01    ┆ 20.0   ┆ 21.0  ┆ null       ┆ fresh to very weakly clay alte… │
│ S07-01    ┆ 21.1   ┆ 21.95 ┆ null       ┆ Highly broken crumbly core, mo… │
│ S07-01    ┆ 22.0   ┆ 23.0  ┆ null       ┆ Fresh to  v.  weak altered TS   │
│ S07-01    ┆ 23.0   ┆ 24.0  ┆ null       ┆ verv uniform unalterd TS        │
│ S07-01    ┆ 24.0   ┆ 28.0  ┆ null       ┆ uniform unaltered TS            │
│ S07-01    ┆ 28.0   ┆ 29.0  ┆ null       ┆ rock is weaklv oxidized. oranq… │
│ S07-01    ┆ 29.1   ┆ 29.4  ┆ null       ┆ finer grained band of tuff in … │
│ S07-01    ┆ 31.05  ┆ 32.20 ┆ null       ┆ Several lim coated fractures a… │
│ S07-01    ┆ 34.5   ┆ 35.0  ┆ null       ┆ major fault - gg and soft dk g… │
│ S07-01    ┆ 35.0   ┆ 36.0  ┆ null       ┆ soft partly bx, fr and altd fo… │
│ S07-01    ┆ 36.0   ┆ 37.15 ┆ null       ┆ Perv K-spr altd zone with qtz-… │
└───────────┴────────┴───────┴────────────┴─────────────────────────────────┘
>>> df = pl.read_excel(partial_path, read_options={'schema_sample_rows':30,'dtypes':{'from_m':'string'}})
>>> with pl.Config(set_tbl_rows=30): print(df)
shape: (29, 5)
┌───────────┬────────┬───────┬────────────┬─────────────────────────────────┐
│ hole_name ┆ from_m ┆ to_m  ┆ litho_unit ┆ comments                        │
│ ---       ┆ ---    ┆ ---   ┆ ---        ┆ ---                             │
│ str       ┆ str    ┆ str   ┆ null       ┆ str                             │
╞═══════════╪════════╪═══════╪════════════╪═════════════════════════════════╡
│ S07-01    ┆ 0.0    ┆ 2.2   ┆ null       ┆ overburden                      │
│ S07-01    ┆ 2.2    ┆ 150.0 ┆ null       ┆ Dacitic Ash Flow Tuff (TS)      │
│ S07-01    ┆ 2.2    ┆ 3.0   ┆ null       ┆ Csg. Fresh Dacitic Ash Flow Tu… │
│ S07-01    ┆ 3.0    ┆ 4.0   ┆ null       ┆ Fresh daft.  Bockv fracturino   │
│ S07-01    ┆ 4.0    ┆ 4.15  ┆ null       ┆ 15cm of coarse 3.0-5.0mm feld … │
│ S07-01    ┆ 5.3    ┆ 6.0   ┆ null       ┆ around core - no record         │
│ S07-01    ┆ 6.55   ┆ 6.95  ┆ null       ┆ Silty sand- type material reco… │
│ S07-01    ┆ 7.0    ┆ 7.5   ┆ null       ┆ Bleached, altered dacite, no s… │
│ S07-01    ┆ 8.0    ┆ 9.0   ┆ null       ┆ fresh blocky fracture TS        │
│ S07-01    ┆ 9.9    ┆ 10.0  ┆ null       ┆ Mod. Clav altered TS one thin … │
│ S07-01    ┆ null   ┆ 11.0  ┆ null       ┆ Mod clay altered TS  Blocky br… │
│ S07-01    ┆ null   ┆ 12.0  ┆ null       ┆ very weekly altered TS broken … │
│ S07-01    ┆ null   ┆ 13.8  ┆ null       ┆ 12.0-13.8mostly gr. And lost g… │
│ S07-01    ┆ null   ┆ 14.7  ┆ null       ┆ Mod clay altered zone with qua… │
│ S07-01    ┆ null   ┆ 16.0  ┆ null       ┆ wk-mod clay altered TS          │
│ S07-01    ┆ 16.0   ┆ 17.0  ┆ null       ┆ wk clay altered TS              │
│ S07-01    ┆ 18.2   ┆ 18.3  ┆ null       ┆ Cal healino bx TX at  20-30° C… │
│ S07-01    ┆ 19.1   ┆ 19.95 ┆ null       ┆ Very intensely clay altered TS… │
│ S07-01    ┆ 20.0   ┆ 21.0  ┆ null       ┆ fresh to very weakly clay alte… │
│ S07-01    ┆ 21.1   ┆ 21.95 ┆ null       ┆ Highly broken crumbly core, mo… │
│ S07-01    ┆ 22.0   ┆ 23.0  ┆ null       ┆ Fresh to  v.  weak altered TS   │
│ S07-01    ┆ 23.0   ┆ 24.0  ┆ null       ┆ verv uniform unalterd TS        │
│ S07-01    ┆ 24.0   ┆ 28.0  ┆ null       ┆ uniform unaltered TS            │
│ S07-01    ┆ 28.0   ┆ 29.0  ┆ null       ┆ rock is weaklv oxidized. oranq… │
│ S07-01    ┆ 29.1   ┆ 29.4  ┆ null       ┆ finer grained band of tuff in … │
│ S07-01    ┆ 31.05  ┆ 32.20 ┆ null       ┆ Several lim coated fractures a… │
│ S07-01    ┆ 34.5   ┆ 35.0  ┆ null       ┆ major fault - gg and soft dk g… │
│ S07-01    ┆ 35.0   ┆ 36.0  ┆ null       ┆ soft partly bx, fr and altd fo… │
│ S07-01    ┆ 36.0   ┆ 37.15 ┆ null       ┆ Perv K-spr altd zone with qtz-… │
└───────────┴────────┴───────┴────────────┴─────────────────────────────────┘

skytwosea avatar Nov 29 '24 17:11 skytwosea