gramex icon indicating copy to clipboard operation
gramex copied to clipboard

ENH: Added openpyxl engine in read excel as xlrd stopped support for xlsx

Open naveenreddymanukonda opened this issue 5 years ago • 7 comments

Reading xlsx files through openpyxl engine. As xlrd module stopped supporting xlsx files

naveenreddymanukonda avatar Dec 19 '20 01:12 naveenreddymanukonda

@naveenreddymanukonda -- two inputs:

  • [ ] In case the user specifies the engine explicitly, this can cause an error. For example, if I call gramex.cache.open('data.xlsx', engine='openpyxl'). You may want to use kwargs.setdefault('engine', 'openpyxl') instead.
  • [ ] Is there any file openpyxl can open that xlrd cannot? We can add that as a test case at https://github.com/gramener/gramex/blob/master/testlib/test_cache_module.py

sanand0 avatar Dec 19 '20 03:12 sanand0

@sanand0 openpyxl doesn't support xls format based on the extension of the file will set the default engine. (link)

Please let me know your thoughts.

naveenreddymanukonda avatar Dec 21 '20 01:12 naveenreddymanukonda

Let’s use xlrd as the default for XLS files, if it supports it. Is that OK?

On Mon, 21 Dec 2020 at 7:13 AM, naveenreddymanukonda < [email protected]> wrote:

@sanand0 https://github.com/sanand0 openpyxl doesn't support xls format based on the extension of the file will set the default engine. Please let me know your thoughts.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/gramener/gramex/pull/351#issuecomment-748712114, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAPR5SIXKI566RT445UBK3SV2R37ANCNFSM4VB47ZDQ .

-- Thanks Anand

sanand0 avatar Dec 21 '20 12:12 sanand0

@sanand0

excel_format = io.split('.')[-1]
if excel_format == 'xls':
    kwargs.setdefault('engine', 'xlrd')
elif excel_format == 'xlsx':
    kwargs.setdefault('engine', 'openpyxl')

Can we go with the above code

naveenreddymanukonda avatar Dec 21 '20 12:12 naveenreddymanukonda

Yes, @naveenreddymanukonda. You could consider .endswith() as well. Please remember that the extension can be in lowercase or uppercase or both.

I'll make these changes and merge. No further edits required.

Thank you!

sanand0 avatar Dec 23 '20 11:12 sanand0

@sanand0 thank you

naveenreddymanukonda avatar Dec 23 '20 11:12 naveenreddymanukonda

The update to xlrd requires test cases to be changed as well. Else build fails. I'll work on this.

sanand0 avatar Dec 31 '20 05:12 sanand0

Closing. This has been resolved by gramex.cache.read_excel()

sanand0 avatar Dec 21 '22 11:12 sanand0