pycel
Large excel files are slow to open
Hello!
First, I wanted to clarify that we'd be open to paying contributors to help optimize this for us. Please let me know if this is of interest. If not, we are looking for direction or suggestions on how to fix it ourselves. Thank you! 🙏
What actually happened
When we open a large spreadsheet (20+ MB) with pycel, it can take up to 8 minutes to start up. Many of the large sheets are static lookup tables.
What was expected to happen
As with loading a static CSV into memory and doing a lookup, I would expect a file this large with static data to open much faster: ideally in under 10 seconds, so that we can iterate faster during development.
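For comparison, here is a minimal sketch of the in-memory CSV lookup I have in mind as the baseline. The column names (`code`, `rate`) and the sample data are made up for illustration and are not taken from our real sheets.

```python
import csv
import io


def load_lookup(csv_file, key_column, value_column):
    """Read a CSV file object into a plain dict for constant-time lookups."""
    reader = csv.DictReader(csv_file)
    return {row[key_column]: row[value_column] for row in reader}


# Hypothetical static lookup table; a real one would be opened from disk.
sample = io.StringIO("code,rate\nA,0.10\nB,0.25\n")
table = load_lookup(sample, "code", "rate")
print(table["B"])
```

Values come back as strings here; a real table would likely convert them to numbers while loading.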
Problem description
When a file like this takes so long to open, iterating, making changes, and debugging become extremely painful for a developer.
Code Sample
https://github.com/bauerjon/slow-pycel-example
Environment
pycel==1.0b30, Python 3.9.5, macOS
I did find that at least half of the slowness comes from the load_workbook calls:
https://github.com/dgorissen/pycel/blob/f4fd7e5e9feb77e5affe9fd3b1881ef47861102c/src/pycel/excelwrapper.py#L243-L245
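For anyone who wants to check where the time goes on their own workbook, a rough sketch using the standard library profiler is below. `profile_call` and `slow_loader` are hypothetical names made up for this example; `slow_loader` is only a stand-in, and in practice you would pass whatever function opens your spreadsheet.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so expensive top-level calls show up first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_loader(path):
    # Hypothetical stand-in for the real workbook-loading call.
    return sum(i * i for i in range(200_000)), path


if __name__ == "__main__":
    _, report = profile_call(slow_loader, "big.xlsx")
    print(report)
```

Sorting by cumulative time makes it easy to see whether a single call, such as load_workbook, accounts for most of the startup.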
We added a fork/solution that seems to be working for our use case here