django-excel

Unable to support XLS and XLSX uploads in Python 3.9

Open craiga opened this issue 4 years ago • 3 comments

On Python 3.9 with pyexcel-xls and pyexcel-xlsx installed, I'm not able to upload .xlsx files.

AttributeError: 'ElementTree' object has no attribute 'getiterator'
Saving workbook from spreadsheet.xlsx
Internal Server Error: /portfolios/my-portfolio/upload
Traceback (most recent call last):
    File "/app/.heroku/python/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
        response = get_response(request)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/core/handlers/base.py", line 179, in _get_response
        response = wrapped_callback(request, *callback_args, **callback_kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/sentry_sdk/integrations/django/views.py", line 67, in sentry_wrapped_callback
        return callback(request, *args, **kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/views/generic/base.py", line 70, in view
        return self.dispatch(request, *args, **kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/contrib/auth/mixins.py", line 85, in dispatch
        return super().dispatch(request, *args, **kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/contrib/auth/mixins.py", line 52, in dispatch
        return super().dispatch(request, *args, **kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/views/generic/base.py", line 98, in dispatch
        return handler(request, *args, **kwargs)
    File "/app/.heroku/python/lib/python3.9/site-packages/django/views/generic/edit.py", line 142, in post
        return self.form_valid(form)
    File "/app/portfolios/views/upload_spreadsheets.py", line 40, in form_valid
        result = dict(self.save_files(self.request.FILES.getlist("file")))
    File "/app/portfolios/views/upload_spreadsheets.py", line 55, in save_files
        yield (file.name, dict(self.save_book(file.get_book())))
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_webio/__init__.py", line 203, in get_book
        return pe.get_book(**params)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel/core.py", line 47, in get_book
        book_stream = sources.get_book_stream(**keywords)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel/internal/core.py", line 39, in get_book_stream
        sheets = a_source.get_data()
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel/plugins/sources/memory_input.py", line 40, in get_data
        sheets = self.__parser.parse_file_content(
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel/plugins/parsers/excel.py", line 27, in parse_file_content
        return self._parse_any(
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel/plugins/parsers/excel.py", line 40, in _parse_any
        sheets = get_data(anything, file_type=file_type, **keywords)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_io/io.py", line 86, in get_data
        data, _ = _get_data(
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_io/io.py", line 105, in _get_data
        return load_data(**keywords)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_io/io.py", line 193, in load_data
        reader.open_content(file_content, **keywords)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_io/reader.py", line 58, in open_content
        self.reader = self.reader_class(
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_xls/xlsr.py", line 186, in __init__
        super().__init__(file_type, file_contents=file_content, **keywords)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_xls/xlsr.py", line 146, in __init__
        self.xls_book = self.get_xls_book(**xlrd_params)
    File "/app/.heroku/python/lib/python3.9/site-packages/pyexcel_xls/xlsr.py", line 167, in get_xls_book
        xls_book = xlrd.open_workbook(**xlrd_params)
    File "/app/.heroku/python/lib/python3.9/site-packages/xlrd/__init__.py", line 130, in open_workbook
        bk = xlsx.open_workbook_2007_xml(
    File "/app/.heroku/python/lib/python3.9/site-packages/xlrd/xlsx.py", line 812, in open_workbook_2007_xml
        x12book.process_stream(zflo, 'Workbook')
    File "/app/.heroku/python/lib/python3.9/site-packages/xlrd/xlsx.py", line 266, in process_stream
        for elem in self.tree.iter() if Element_has_iter else self.tree.getiterator():
AttributeError: 'ElementTree' object has no attribute 'getiterator'

pyexcel appears to be preferring pyexcel-xls over pyexcel-xlsx for parsing xlsx files.

pyexcel-xls works fine for reading xls files, but the XML parsing in the underlying (and unmaintained) xlrd library seems to rely on a method which has been removed from ElementTree. I haven't looked too far into this, but I did see this in what's new in Python 3.9:

Methods getchildren() and getiterator() of classes ElementTree and Element in the ElementTree module have been removed. They were deprecated in Python 3.2. Use iter(x) or list(x) instead of x.getchildren() and x.iter() or list(x.iter()) instead of x.getiterator(). (Contributed by Serhiy Storchaka in bpo-36543.)
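
To illustrate the change quoted above, here is a minimal sketch using only the standard library. The tiny XML document and the `iter_elements` helper are made up for illustration and are not taken from xlrd; the sketch only shows that `iter()` is the supported replacement and that `getiterator()` is absent on Python 3.9 and later.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a workbook part; not real xlsx content.
root = ET.fromstring("<workbook><sheets><sheet name='Sheet1'/></sheets></workbook>")
tree = ET.ElementTree(root)


def iter_elements(tree):
    """Walk every element, preferring iter() and falling back only if it is missing."""
    if hasattr(tree, "iter"):
        return tree.iter()
    # Only reachable on very old Pythons that predate ElementTree.iter().
    return tree.getiterator()


tags = [elem.tag for elem in iter_elements(tree)]
print(tags)

# On Python 3.9+ this prints False, which is why a direct call raises AttributeError.
print(sys.version_info[:2], hasattr(tree, "getiterator"))
```

Checking `hasattr` on the object actually being iterated, rather than relying on a module-level flag, is one way to avoid falling through to the removed method.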

I tried solving this issue at https://github.com/pyexcel/pyexcel-io/pull/99 with no luck.

craiga, Nov 16 '20 09:11

Yep, please update to latest pyexcel-xls

chfw, Dec 29 '20 15:12

Apologies for taking so long to get back to this.

Updating to the latest pyexcel-xls doesn't solve this problem. The error message above only appears when we're on the latest version of pyexcel-xls. If I roll back to the previous version, I instead get xlrd.biffh.XLRDError: Excel xlsx file; not supported, as xlrd is no longer pinned.

As far as I can tell, there are two possible solutions to this issue:

  • remove XLSX support from pyexcel-xls (this is what xlrd has done, so it seems like the sensible approach to me)
  • somehow get lml to prefer pyexcel-xlsx for XLSX files (I looked into this but couldn't figure out how to make this happen)
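
As a possible application-side workaround for the second option, the upload's leading bytes could be inspected to decide which plugin should handle it. This is only a sketch under assumptions: the `guess_excel_plugin` helper is hypothetical and not part of pyexcel or lml, and it only distinguishes the legacy OLE2 container used by .xls from the zip container used by .xlsx (any other zip archive would also be reported as xlsx).

```python
import io
import zipfile

# Signature of the OLE2 compound file format used by legacy .xls workbooks.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def guess_excel_plugin(content: bytes) -> str:
    """Return the plugin name that would suit the uploaded bytes (hypothetical helper)."""
    if content.startswith(OLE2_MAGIC):
        return "pyexcel-xls"
    # .xlsx files are zip archives; this does not verify the archive is a workbook.
    if zipfile.is_zipfile(io.BytesIO(content)):
        return "pyexcel-xlsx"
    raise ValueError("content does not look like an xls or xlsx file")
```

The pyexcel documentation describes a `library` keyword for choosing a specific plugin when reading. Whether passing the returned name that way through `file.get_book()` avoids the error in this setup is an assumption that has not been tested here.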

craiga, Apr 07 '21 18:04