xlrd, used by Pandas to read Excel files, no longer supports .xlsx Excel workbook files

Open wittregr opened this issue 4 years ago • 4 comments

Version of pMuTT pmutt 1.2.21

Describe the bug Version 2.0+ of xlrd no longer supports reading Excel .xlsx files, which is the default workbook format for current versions of Excel. Pandas uses xlrd to read Excel files, so reading .xlsx spreadsheets with pMuTT's I/O functions fails.

To Reproduce

  1. conda install xlrd (this installs v2.0.1, which does not support .xlsx files)
  2. Use pmutt to read data from a spreadsheet

Additional context Short-term workarounds:

  1. Save Excel spreadsheets using the Excel 97-2003 Workbook format. This saves them in .xls format, which xlrd can still read.
  2. Install an older version of xlrd: conda install xlrd=1.2.0. There is a warning that this could introduce a security issue, but it will continue to read .xlsx files.
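
Before choosing a workaround, it can help to confirm which xlrd version is actually installed. The following is a minimal sketch, not part of pMuTT: the helper names `installed_xlrd_version` and `xlrd_reads_xlsx` are hypothetical, and the check assumes only what is stated above, namely that xlrd 2.0+ dropped .xlsx support.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_xlrd_version():
    """Return the installed xlrd version string, or None if xlrd is absent."""
    try:
        return metadata.version("xlrd")
    except metadata.PackageNotFoundError:
        return None


def xlrd_reads_xlsx(version):
    """Return True if this xlrd version predates 2.0 (hypothetical helper)."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    version = installed_xlrd_version()
    if version is None:
        print("xlrd is not installed")
    elif xlrd_reads_xlsx(version):
        print(f"xlrd {version} can still read .xlsx files")
    else:
        print(f"xlrd {version} reads .xls only; use a workaround for .xlsx")
```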

wittregr avatar Dec 16 '20 21:12 wittregr

Looks like the Pandas developers suggest downgrading xlrd.

We can update the setup file to use the last working version.

jonlym avatar Dec 16 '20 21:12 jonlym

It might also be useful to lock the xlrd version to 1.2.0 to avoid accidentally updating it to a newer version. Add a file named "pinned" to your conda-meta folder (usually in your Anaconda3 folder) with the line:

xlrd ==1.2.0

This will prevent conda from updating xlrd to a newer version.

wittregr avatar Dec 30 '20 16:12 wittregr

Another option that worked for me is to specify engine='openpyxl' in the pd.read_excel call for .xlsx and later spreadsheet formats. This shouldn't be necessary, and it adds the complexity of figuring out in advance whether the spreadsheet you are trying to open is .xls or .xlsx. But if you're expecting a consistent file type, this is another possible workaround until someone fixes pd.read_excel to pick the correct engine based on file extension.
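
One way to avoid guessing the file type in advance is to choose the engine from the file extension before calling pd.read_excel. This is only a rough sketch under stated assumptions: `pick_excel_engine` and `read_excel_any` are hypothetical names, not pMuTT or pandas functions, and the extension-to-engine mapping is an illustrative choice that requires openpyxl to be installed for .xlsx files.

```python
from pathlib import Path

# Hypothetical mapping for illustration; extend it as needed.
_ENGINE_BY_SUFFIX = {
    ".xls": "xlrd",       # legacy Excel 97-2003 workbooks
    ".xlsx": "openpyxl",  # current default Excel workbooks
    ".xlsm": "openpyxl",  # macro-enabled workbooks
}


def pick_excel_engine(path):
    """Return the pandas engine name matching the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in _ENGINE_BY_SUFFIX:
        raise ValueError(f"Unsupported spreadsheet extension: {suffix!r}")
    return _ENGINE_BY_SUFFIX[suffix]


def read_excel_any(path, **kwargs):
    """Read a spreadsheet with pandas, passing an explicit engine."""
    import pandas as pd  # imported here so pick_excel_engine works without pandas

    return pd.read_excel(path, engine=pick_excel_engine(path), **kwargs)
```

With this in place, a call such as read_excel_any('data.xlsx') would pass engine='openpyxl', while a .xls file would still go through xlrd.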

hansgilead avatar Dec 31 '20 17:12 hansgilead

That's a great suggestion, @hansgilead! Our users will probably only use 'xlsx' so this is a much more elegant solution than forcing users to use a certain version of xlrd.

@wittregr, I'll test this with a couple of our examples and make a new pull request.

jonlym avatar Jan 02 '21 00:01 jonlym