fastexcel icon indicating copy to clipboard operation
fastexcel copied to clipboard

`use_columns` silently skips empty columns at the beginning

Open jsarbach opened this issue 1 year ago • 6 comments

Given the following spreadsheet with an empty first column: image

Loading the sheet with e.g. use_columns='B:C' (or use_columns=[1, 2]) actually selects columns C and D:

read_excel('sheet.xlsx').load_sheet(0, use_columns='B:C').to_polars()
shape: (3, 2)
┌────────┬───────┐
│ Second ┆ Third │
│ ---    ┆ ---   │
│ f64    ┆ f64   │
╞════════╪═══════╡
│ 4.0    ┆ 7.0   │
│ 5.0    ┆ 8.0   │
│ 6.0    ┆ 9.0   │
└────────┴───────┘

And raises ColumnNotFoundError with use_columns='B:D' or use_columns=[1, 2, 3]:

read_excel('sheet.xlsx').load_sheet(0, use_columns='B:D').to_polars()
_fastexcel.ColumnNotFoundError: column at index 3 not found
Context:
    0: available columns are: [ColumnInfo { name: "First", index: 0, dtype: Float, column_name_from: LookedUp, dtype_from: Guessed }, ColumnInfo { name: "Second", index: 1, dtype: Float, column_name_from: LookedUp, dtype_from: Guessed }, ColumnInfo { name: "Third", index: 2, dtype: Float, column_name_from: LookedUp, dtype_from: Guessed }]

jsarbach avatar Oct 15 '24 12:10 jsarbach

Yes the engine behind fastexcel calamine skips empty columns by default. I plan to add an option on calamine directly like I did for skipping empty rows

PrettyWood avatar Oct 15 '24 13:10 PrettyWood

Yes the engine behind fastexcel calamine skips empty columns by default. I plan to add an option on calamine directly like I did for skipping empty rows

@PrettyWood: If you add it anytime soon I'll hook into it from Polars, as we have a similar request on our side now, and I'm adding a drop_empty_cols param to influence this behaviour (where the underlying engine has support) :)

alexander-beedie avatar Dec 23 '24 18:12 alexander-beedie

+1

dev-artisan avatar Mar 18 '25 07:03 dev-artisan

+1

DasPoet avatar Apr 28 '25 15:04 DasPoet

Would this change be a lot of work?

noctuid avatar May 07 '25 18:05 noctuid

I think this qualifies as a real bug when asking for columns B and C yields columns C and D... Please allow load_sheet to set the underlying skip_empty_area parameter to false.

Quoting the python-calamine README :

By default, calamine skips empty rows/cols before data. For suppress this behaviour, set skip_empty_area to False.

ruomad avatar Oct 12 '25 11:10 ruomad