PhpSpreadsheet
PhpSpreadsheet copied to clipboard
Xlsx Reader Optionally Ignore Rows With No Cells
Fix #3982. A number of issues submitted about Xlsx read performance have a common theme, namely that row 1,048,576 and a few rows before it are defined in the worksheet Xml with no cells attached to them. These might be the work of a third party product. While these extraneous rows do not cause any problems for the cells that are actually used on the worksheet, they can lead to excessive memory use. This PR provides an option for the application to ignore rows with no cells when loading.
Recent changes to the load logic had already made a significant difference to memory consumption and load time. For the spreadsheet attached to issue 3982, which had caused out-of-memory errors on the user's system, peak memory usage was already reduced to 40-odd MB. With the new option, this is drastically reduced again, to just over 9MB. Specifying the new option is very easy:
$reader->setIgnoreRowsWithNoCells(true);
Note that there are cases where you might not want this (non-default) behavior. For example, if you set a row height on a row with no cells, the height would be lost with this option. Unfortunately, the extraneous row definitions in the problematic spreadsheets claim to have a custom height, so I can't just use "no custom row styles" as an additional filter.
This is:
- [ ] a bugfix
- [x] a new feature
- [ ] refactoring
- [ ] additional unit tests
Checklist:
- [ ] Changes are covered by unit tests
- [x] Changes are covered by existing unit tests
- [x] New unit tests have been added
- [x] Code style is respected
- [x] Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
- [ ] CHANGELOG.md contains a short summary of the change and a link to the pull request if applicable
- [ ] Documentation is updated as necessary
Why this change is needed?
Provide an explanation of why this change is needed, with links to any Issues (if appropriate). If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.