Panic while reading xls

Open prokie opened this issue 1 year ago • 10 comments

Hi. When I try to open one of my xls files as a workbook, I get the following panic.

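use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Opening the workbook is enough to trigger the panic.
    let mut workbook: Xls<_> = open_workbook("bla.xls")?;
    Ok(())
}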
thread 'main' panicked at calamine-0.26.1/src/cfb.rs:362:9:
assertion `left == right` failed: i=3062, len=4334
  left: 0
 right: 3
stack backtrace:
   0:     0x7f7c0cfbab35 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h358afad87e02ca76
   1:     0x7f7c0cfef77b - core::fmt::write::hb19b5b269a2fe458
   2:     0x7f7c0cfb8c1f - std::io::Write::write_fmt::he5a92676a45ef09d
   3:     0x7f7c0cfbbc81 - std::panicking::default_hook::{{closure}}::h3bff550b24d93725
   4:     0x7f7c0cfbb95c - std::panicking::default_hook::hd53b1b06d2b99687
   5:     0x7f7c0cfbc251 - std::panicking::rust_panic_with_hook::h9fdd87cddb2763da
   6:     0x7f7c0cfbc147 - std::panicking::begin_panic_handler::{{closure}}::h089783ab6b5cba45
   7:     0x7f7c0cfbaff9 - std::sys::backtrace::__rust_end_short_backtrace::hed34776d77ef7922
   8:     0x7f7c0cfbbdd4 - rust_begin_unwind
   9:     0x7f7c0ced69d3 - core::panicking::panic_fmt::h300583f35f37447a
  10:     0x7f7c0ced6dcf - core::panicking::assert_failed_inner::hafb0e3d63cb01ba6
  11:     0x7f7c0ced3dcf - core::panicking::assert_failed::h5549a7e67ae6daf2
  12:     0x7f7c0cf7f447 - calamine::cfb::decompress_stream::he271a08c0ffe52cb
  13:     0x7f7c0cf0afe7 - <alloc::vec::into_iter::IntoIter<T,A> as core::iter::traits::iterator::Iterator>::try_fold::hdf21cd55dade2c87
  14:     0x7f7c0ceedac1 - alloc::vec::in_place_collect::from_iter_in_place::h7d46edc2f8f2af80
  15:     0x7f7c0ceea3b9 - <alloc::collections::btree::map::BTreeMap<K,V> as core::iter::traits::collect::FromIterator<(K,V)>>::from_iter::hc1b6dd616abfaa91
  16:     0x7f7c0cf03bf2 - calamine::vba::VbaProject::from_cfb::h275adceebcd448df
  17:     0x7f7c0cee6b0d - calamine::xls::Xls<RS>::new_with_options::hed3b834fcadb6634
  18:     0x7f7c0cee63ec - calamine::open_workbook::hfc42529ea9511aa1
  19:     0x7f7c0cf0461f - pontus::main::hb4eed906d3c6d221
  20:     0x7f7c0cf10023 - std::sys::backtrace::__rust_begin_short_backtrace::h998b547d7489787b
  21:     0x7f7c0cefbc8d - std::rt::lang_start::{{closure}}::h2e2caad5b5a6f960
  22:     0x7f7c0cfb3af7 - std::rt::lang_start_internal::h93b3b742566fb30c
  23:     0x7f7c0cf06255 - main

Does this mean that the Excel file is broken, or is something else causing the crash?

prokie avatar Nov 29 '24 15:11 prokie

It seems that the VBA script was the issue. I removed it, and now the xls file parses without problems.
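
When removing the VBA script is not an option, a caller can at least keep the panic from taking down the whole program. The following is a minimal sketch using only the standard library; `catch_parse_panic` is a hypothetical helper name, not part of calamine, and it only works when the binary is built with the default `panic = "unwind"` setting. The default panic hook still prints the message to stderr.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `parse` and turns a panic inside it into an `Err` holding the
/// panic message, instead of letting the panic unwind into the caller.
/// Hypothetical helper for illustration only.
fn catch_parse_panic<T>(parse: impl FnOnce() -> T + UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(parse).map_err(|payload| {
        // Panic payloads are usually a `String` (formatted message)
        // or a `&'static str` (literal message).
        if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else {
            "panic with non-string payload".to_string()
        }
    })
}
```

The idea would be to pass a closure that performs the `open_workbook` call, so that a failed assertion inside the parser surfaces as an `Err` the caller can log or skip.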

prokie avatar Nov 30 '24 11:11 prokie

Can you provide a test case to investigate the issue?

sftse avatar Dec 02 '24 10:12 sftse

Yes sure, I will see if I can remove everything sensitive in the xls and upload it here.

prokie avatar Dec 02 '24 12:12 prokie

I am trying to remove all the confidential information, but it is not easy. The issue does seem to be related to whitespace in the VBA script, though.

If I add a single newline to the VBA script, parsing succeeds every time, but if I add too many newlines I consistently get the error at line 362 in cfb.rs. There are two distinct failure modes: either it fails and left is always 2, or it fails and left is always 0.
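
For context on the numbers in the panic message: the backtrace ends in calamine::cfb::decompress_stream, and the MS-OVBA compression format used for VBA streams starts every chunk with a 2-byte little-endian header whose 3-bit signature must be 0b011 (decimal 3). Assuming the assertion at cfb.rs:362 is that signature check (an assumption, not confirmed in this thread), left would be the signature actually read, so a value of 0 or 2 would suggest the reader landed on bytes that are not a chunk header. The sketch below decodes such a header with the standard library only; `ChunkHeader` and `parse_chunk_header` are hypothetical names, not calamine APIs.

```rust
/// Decoded fields of a 2-byte MS-OVBA compressed chunk header.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct ChunkHeader {
    /// Total chunk size in bytes, including the 2-byte header.
    size: usize,
    /// True if the chunk data is compressed, false if stored raw.
    compressed: bool,
}

/// Parses a chunk header and rejects it when the signature is not 0b011.
fn parse_chunk_header(bytes: [u8; 2]) -> Result<ChunkHeader, String> {
    let raw = u16::from_le_bytes(bytes);
    // Bits 12..=14 hold the signature, which the format fixes at 0b011.
    let signature = (raw >> 12) & 0b111;
    if signature != 0b011 {
        return Err(format!("bad chunk signature {}, expected 3", signature));
    }
    Ok(ChunkHeader {
        // Bits 0..=11 store the chunk size minus 3.
        size: usize::from(raw & 0x0FFF) + 3,
        // Bit 15 is the compressed flag.
        compressed: (raw >> 15) == 1,
    })
}
```

For example, `parse_chunk_header([0x01, 0xB0])` yields a compressed chunk of 4 bytes, while `parse_chunk_header([0x00, 0x00])` is rejected with signature 0, which has the same shape as the reported left: 0, right: 3.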

prokie avatar Dec 09 '24 09:12 prokie

If the issue is solely in the VBA script, it should be feasible to extract it using the rust-cfb crate into a separate CFB file. If after doing that calamine fails at the same line, this might help narrow down the cause.

sftse avatar Dec 09 '24 10:12 sftse

Okay, thanks, I will try that. I will also keep working on an example xls without confidential information.

prokie avatar Dec 09 '24 13:12 prokie

@sftse Can you give an example of how to do that using rust-cfb?

prokie avatar Dec 10 '24 06:12 prokie

use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let in_path = "foo.xls";
    let out_path = "bar.xls";
    let mut original = cfb::open(in_path)?;
    // Create the output with the same CFB version as the input.
    let version = original.version();
    let out_file = File::create(out_path)?;
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;
    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    // First pass: recreate every storage whose path mentions "VBA" and
    // remember the matching streams so they can be copied afterwards.
    for entry in original.walk() {
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }
    // Second pass: copy the contents of each collected stream.
    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }
    Ok(())
}

sftse avatar Dec 12 '24 11:12 sftse

Thanks for the source code. I ran essentially the same program, with a calamine read of the output file added at the end, and it still fails with the same assertion:

use std::fs::File;

use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    env_logger::init();
    let in_path = "foo.xls";
    let out_path = "bar.xls";
    let mut original = cfb::open(in_path)?;
    let version = original.version();
    let out_file = File::create(out_path).unwrap();
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;
    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    for entry in original.walk() {
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }
    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }
    let _: Xls<_> = open_workbook("bar.xls")?;

    Ok(())
}
assertion `left == right` failed: i=3062, len=4338
  left: 0
 right: 3

prokie avatar Dec 18 '24 14:12 prokie

This reduced cfb file should have most of the sensitive parts removed. If you look through the remaining information and find it acceptable to publish, can you upload it as a test case?

sftse avatar Dec 18 '24 15:12 sftse

Without a reproducible test file the issue will have to be closed as "can't fix".

jmcnamara avatar Jun 30 '25 23:06 jmcnamara

Closing due to a lack of a reproducible file. Will reopen/reinvestigate if a file is provided.

jmcnamara avatar Jul 04 '25 14:07 jmcnamara