
Hitting dimension arrays of non-standard size, 16 instead of 14

Open sftse opened this issue 1 month ago • 11 comments

[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 99, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 221, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 99, 1, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 221, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 99, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 125, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 221, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[2025-10-23T09:42:18Z DEBUG calamine::xls] parse_dimensions: [0, 0, 0, 0, 123, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

Similar to #572, wrong worksheet dimensions should not be much of an obstacle to reading the rest of the information from the sheet.

Parsing the dimensions could be made on-demand, just like reading the VBA.
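
One way to read "on-demand" here is to keep the raw record bytes and validate them only when the dimensions are actually requested. A minimal sketch, assuming a hypothetical `LazyDimensions` wrapper and `parse_strict` helper (neither is part of calamine's API) and Rust 1.70+ for `std::cell::OnceCell`:

```rust
use std::cell::OnceCell;

/// Hypothetical wrapper, not calamine's API: stores the raw DIMENSIONS
/// payload and defers validation until the value is requested.
struct LazyDimensions {
    raw: Vec<u8>,
    parsed: OnceCell<Result<(u32, u32), String>>,
}

impl LazyDimensions {
    /// Never fails, so a malformed payload cannot block opening the sheet.
    fn new(raw: Vec<u8>) -> Self {
        Self { raw, parsed: OnceCell::new() }
    }

    /// Returns (rows, columns) as exclusive upper bounds, parsing at most once.
    fn get(&self) -> Result<(u32, u32), String> {
        self.parsed.get_or_init(|| parse_strict(&self.raw)).clone()
    }
}

/// Strict BIFF8-style check: exactly 14 bytes, little-endian fields.
fn parse_strict(r: &[u8]) -> Result<(u32, u32), String> {
    if r.len() != 14 {
        return Err(format!("dimensions: expected 14 bytes, found {}", r.len()));
    }
    let rows = u32::from_le_bytes([r[4], r[5], r[6], r[7]]);
    let cols = u32::from(u16::from_le_bytes([r[10], r[11]]));
    Ok((rows, cols))
}
```

With this shape, a 16-byte payload only produces an error for callers that ask for the dimensions; reading cell data would be unaffected.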

sftse avatar Oct 30 '25 10:10 sftse

Thanks. Do you have a test file that demonstrates this issue?

jmcnamara avatar Oct 30 '25 11:10 jmcnamara

Unfortunately not, client data. Comparing to other dimension arrays from files that do not have issues, the numbers seem to line up and have sensible values, just the two extra 0s are out of place.

sftse avatar Oct 30 '25 13:10 sftse

> Unfortunately not, client data.

If you tell me which of the field(s) are incorrect I can create a test file. Or you could maybe dbg!() the errant DIMENSIONS record without printing any other sensitive data. That would be enough to create a test file.

jmcnamara avatar Oct 30 '25 13:10 jmcnamara

If we strip the last two 0s from the arrays, we get 14 bytes that match spec-conformant Dimensions.

Parsed that way, the first three examples become:

Dimensions { start: (0, 0), end: (354, 0) }
Dimensions { start: (0, 0), end: (476, 0) }
Dimensions { start: (0, 0), end: (354, 4) }

These seem like reasonable values. Here are some from other files in the test set:

Dimensions { start: (0, 0), end: (54, 6) }
Dimensions { start: (0, 0), end: (0, 0) }
Dimensions { start: (0, 1), end: (61, 27) }
Dimensions { start: (0, 0), end: (191, 74) }
Dimensions { start: (2, 0), end: (264, 9) }
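
The decoding above can be sketched as follows. This is a minimal illustration, assuming the BIFF8 layout (first row and last row + 1 as little-endian `u32`, first column and last column + 1 as little-endian `u16`, then reserved bytes). The `Dimensions` struct and `parse_dimensions_lenient` function are hypothetical stand-ins, not calamine's actual code; the only change from a strict parser is that trailing bytes beyond the first 14 are ignored.

```rust
/// Worksheet extent as zero-based, inclusive (row, column) pairs.
#[derive(Debug, PartialEq)]
struct Dimensions {
    start: (u32, u32),
    end: (u32, u32),
}

/// Hypothetical lenient parser: reads the 14-byte BIFF8 layout and ignores
/// any trailing bytes instead of rejecting the record.
fn parse_dimensions_lenient(r: &[u8]) -> Option<Dimensions> {
    if r.len() < 14 {
        return None;
    }
    // Little-endian field readers; u16 values are widened to u32.
    let u32_at = |i: usize| u32::from_le_bytes([r[i], r[i + 1], r[i + 2], r[i + 3]]);
    let u16_at = |i: usize| u32::from(u16::from_le_bytes([r[i], r[i + 1]]));

    let (row_first, row_last_plus_1) = (u32_at(0), u32_at(4));
    let (col_first, col_last_plus_1) = (u16_at(8), u16_at(10));

    Some(Dimensions {
        start: (row_first, col_first),
        // The stored upper bounds are exclusive; an empty sheet stores 0.
        end: (
            row_last_plus_1.saturating_sub(1),
            col_last_plus_1.saturating_sub(1),
        ),
    })
}
```

For the first logged array, `[0, 0, 0, 0, 99, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]`, the row upper bound is 99 + 1 × 256 = 355 and the column upper bound is 1, giving `end: (354, 0)`.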

sftse avatar Oct 30 '25 16:10 sftse

> If we strip the last two 0s from the arrays, we get 14 bytes that match spec-conformant Dimensions.

Ok. Got it. This is definitely outside the spec for any BIFF version of Excel.

And is the length field for these DIMENSIONS records 0x0010 = 16? Also, is the file readable by Excel?

jmcnamara avatar Oct 30 '25 17:10 jmcnamara

Here is a test file:

gh577.xls

@sftse Could you check if this is the same as your issue?

This isn't readable by Excel.

And an example:

use calamine::{open_workbook, Error, Xls};

fn main() -> Result<(), Error> {
    let path = "tests/gh577.xls";

    let _excel: Xls<_> = open_workbook(path).expect(path);

    Ok(())
}

And the error message:

$ cargo run --example gh577
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
     Running `target/debug/examples/gh577`

thread 'main' panicked at examples/gh577.rs:6:46:
tests/gh577.xls: Len { expected: 14, found: 16, typ: "dimensions" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

jmcnamara avatar Oct 30 '25 17:10 jmcnamara

This hits the same error, but if Excel cannot read it, I will investigate further; maybe the root cause is something else.

sftse avatar Oct 31 '25 10:10 sftse

> but if Excel cannot read it, I will investigate further

Just to note that Excel may not be able to read my test file because of how I made it: I used a program that generates xls files and modified its code to write a DIMENSIONS record with 4 bytes of reserved data instead of the specified 2. That may have thrown off the worksheet offset calculations within the file, which could be why Excel cannot read it. Is Excel able to read your 16 byte DIMENSIONS file?
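
For illustration, the modified record described above can be sketched like this. It is a rough sketch, not the generator's actual code: `dimensions_record` and its parameters are hypothetical names, and it assumes the usual BIFF framing of a 2-byte record id (0x0200 for DIMENSIONS in BIFF3 through BIFF8) followed by a 2-byte payload length, both little-endian.

```rust
/// BIFF record identifier for DIMENSIONS (BIFF3 through BIFF8).
const DIMENSIONS_ID: u16 = 0x0200;

/// Builds a DIMENSIONS record: 4-byte header (id, length) plus payload.
/// `reserved_len` is 2 in a spec-conformant BIFF8 record; 4 reproduces
/// the 16-byte payload discussed in this issue.
fn dimensions_record(
    row_first: u32,
    row_last_plus_1: u32,
    col_first: u16,
    col_last_plus_1: u16,
    reserved_len: usize,
) -> Vec<u8> {
    let mut payload = Vec::new();
    payload.extend_from_slice(&row_first.to_le_bytes());
    payload.extend_from_slice(&row_last_plus_1.to_le_bytes());
    payload.extend_from_slice(&col_first.to_le_bytes());
    payload.extend_from_slice(&col_last_plus_1.to_le_bytes());
    // Reserved bytes are zero-filled.
    payload.resize(payload.len() + reserved_len, 0);

    let mut record = Vec::with_capacity(4 + payload.len());
    record.extend_from_slice(&DIMENSIONS_ID.to_le_bytes());
    record.extend_from_slice(&(payload.len() as u16).to_le_bytes());
    record.extend_from_slice(&payload);
    record
}
```

Calling `dimensions_record(0, 355, 0, 1, 4)` yields a length field of 0x0010 and the same 16 payload bytes as the first logged array. The record is 2 bytes longer than the conformant version, so any stream offsets stored elsewhere in the file that point past it would need adjusting too, which this sketch does not attempt.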

jmcnamara avatar Oct 31 '25 11:10 jmcnamara

> Is Excel able to read your 16 byte DIMENSIONS file?

Have not checked but that was what I was going to look into.

sftse avatar Oct 31 '25 11:10 sftse

Excel is able to read the file that triggered this issue without warnings. If I open the test file gh577.xls in the online version of Excel, it opens with a warning that it had to be repaired.

Also of note, LibreOffice opens gh577.xls without any warning.

sftse avatar Nov 03 '25 13:11 sftse

So, overall gh577.xls should be acceptable as a test input for this issue, is that correct?

jmcnamara avatar Nov 03 '25 13:11 jmcnamara