Panic while reading xls
Hi. When I try to open a workbook from one of my xls files, I get the following error.
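use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Opening the workbook is enough to trigger the panic.
    let mut workbook: Xls<_> = open_workbook("bla.xls")?;
    Ok(())
}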
thread 'main' panicked at calamine-0.26.1/src/cfb.rs:362:9:
assertion `left == right` failed: i=3062, len=4334
left: 0
right: 3
stack backtrace:
0: 0x7f7c0cfbab35 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h358afad87e02ca76
1: 0x7f7c0cfef77b - core::fmt::write::hb19b5b269a2fe458
2: 0x7f7c0cfb8c1f - std::io::Write::write_fmt::he5a92676a45ef09d
3: 0x7f7c0cfbbc81 - std::panicking::default_hook::{{closure}}::h3bff550b24d93725
4: 0x7f7c0cfbb95c - std::panicking::default_hook::hd53b1b06d2b99687
5: 0x7f7c0cfbc251 - std::panicking::rust_panic_with_hook::h9fdd87cddb2763da
6: 0x7f7c0cfbc147 - std::panicking::begin_panic_handler::{{closure}}::h089783ab6b5cba45
7: 0x7f7c0cfbaff9 - std::sys::backtrace::__rust_end_short_backtrace::hed34776d77ef7922
8: 0x7f7c0cfbbdd4 - rust_begin_unwind
9: 0x7f7c0ced69d3 - core::panicking::panic_fmt::h300583f35f37447a
10: 0x7f7c0ced6dcf - core::panicking::assert_failed_inner::hafb0e3d63cb01ba6
11: 0x7f7c0ced3dcf - core::panicking::assert_failed::h5549a7e67ae6daf2
12: 0x7f7c0cf7f447 - calamine::cfb::decompress_stream::he271a08c0ffe52cb
13: 0x7f7c0cf0afe7 - <alloc::vec::into_iter::IntoIter<T,A> as core::iter::traits::iterator::Iterator>::try_fold::hdf21cd55dade2c87
14: 0x7f7c0ceedac1 - alloc::vec::in_place_collect::from_iter_in_place::h7d46edc2f8f2af80
15: 0x7f7c0ceea3b9 - <alloc::collections::btree::map::BTreeMap<K,V> as core::iter::traits::collect::FromIterator<(K,V)>>::from_iter::hc1b6dd616abfaa91
16: 0x7f7c0cf03bf2 - calamine::vba::VbaProject::from_cfb::h275adceebcd448df
17: 0x7f7c0cee6b0d - calamine::xls::Xls<RS>::new_with_options::hed3b834fcadb6634
18: 0x7f7c0cee63ec - calamine::open_workbook::hfc42529ea9511aa1
19: 0x7f7c0cf0461f - pontus::main::hb4eed906d3c6d221
20: 0x7f7c0cf10023 - std::sys::backtrace::__rust_begin_short_backtrace::h998b547d7489787b
21: 0x7f7c0cefbc8d - std::rt::lang_start::{{closure}}::h2e2caad5b5a6f960
22: 0x7f7c0cfb3af7 - std::rt::lang_start_internal::h93b3b742566fb30c
23: 0x7f7c0cf06255 - main
Does this mean that the Excel file is broken, or is there another reason for the crash?
It seems that the VBA script was the issue: after I removed it, the xls file parses without issue.
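Until the root cause is found, one way to keep an application alive when a file triggers this assertion is to turn the panic into an ordinary error. The following is a minimal sketch using only the standard library; `catch_panic` is a hypothetical helper name, and the failing closure is a stand-in for the `open_workbook` call, not calamine code.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `f` and converts a panic into an `Err` carrying the panic message.
/// `catch_panic` is a hypothetical helper, not part of calamine.
fn catch_panic<T>(f: impl FnOnce() -> T + UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a `String` (formatted message)
        // or a `&'static str` (literal message).
        if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    // Stand-in for `open_workbook("bla.xls")`: a closure that fails an assertion.
    let result = catch_panic(|| assert_eq!(0, 3, "i=3062, len=4334"));
    match result {
        Ok(()) => println!("opened without panicking"),
        Err(msg) => println!("recovered from panic: {msg}"),
    }
}
```

This only works when the binary is built with the default `panic = "unwind"` strategy, and the default panic hook still prints the message to stderr. It is a workaround for the caller, not a fix for the parser.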
Can you provide a test case to investigate the issue?
Yes sure, I will see if I can remove everything sensitive in the xls and upload it here.
I am trying to remove all the confidential information, but it is not easy. However, the issue seems to be related to whitespace in the VBA script.
If I add a single newline to the VBA script, parsing succeeds every time, but if I add too many newlines, I consistently get the error at line 362 in cfb.rs. There are two distinct failure modes: in one, left is always 2; in the other, left is always 0.
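One possible reading of the `left`/`right` values, not confirmed against the calamine source: in the MS-OVBA compressed container format, every chunk starts with a 16-bit header whose 3-bit signature field must be 3. If the assertion in `decompress_stream` checks that field, then `left: 0` or `left: 2` would mean the parser is reading bytes that are not a chunk header. The sketch below decodes such a header and walks a container to find the first offset with a bad signature. The names `ChunkHeader`, `parse_chunk_header` and `first_bad_chunk` are made up for illustration.

```rust
/// Fields of a 16-bit CompressedChunkHeader as laid out in MS-OVBA.
#[derive(Debug, PartialEq)]
struct ChunkHeader {
    /// Total chunk size in bytes, header included (12-bit field + 3).
    size: usize,
    /// Must be 0b011 (3) in a valid container.
    signature: u8,
    /// true: chunk holds compressed tokens; false: chunk holds raw bytes.
    compressed: bool,
}

fn parse_chunk_header(raw: u16) -> ChunkHeader {
    ChunkHeader {
        size: (raw & 0x0FFF) as usize + 3,
        signature: ((raw & 0x7000) >> 12) as u8,
        compressed: raw & 0x8000 != 0,
    }
}

/// Walks the chunk headers of a compressed container.
/// Returns the number of chunks, or `(offset, signature)` of the first
/// header whose signature is not 3.
fn first_bad_chunk(container: &[u8]) -> Result<usize, (usize, u8)> {
    let mut offset = 1; // byte 0 is the container signature byte (0x01)
    let mut chunks = 0;
    while offset + 2 <= container.len() {
        // Chunk headers are stored little-endian.
        let raw = u16::from_le_bytes([container[offset], container[offset + 1]]);
        let header = parse_chunk_header(raw);
        if header.signature != 3 {
            return Err((offset, header.signature));
        }
        chunks += 1;
        offset += header.size;
    }
    Ok(chunks)
}

fn main() {
    // One valid 3-byte chunk followed by two bytes that are not a header.
    let data = [0x01, 0x00, 0xB0, 0xAA, 0x00, 0x00];
    println!("{:?}", first_bad_chunk(&data));
}
```

Running a walk like this over the module stream, starting at the offset where the compressed container begins, could show whether the stream really contains trailing non-chunk bytes or whether the parser starts at the wrong position.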
If the issue is solely in the VBA script, it should be feasible to extract it using the rust-cfb crate into a separate CFB file. If after doing that calamine fails at the same line, this might help narrow down the cause.
Okay, thanks, I will try that. I will also keep working on getting an example xls without confidential info.
@sftse Can you give an example of how to do that using rust-cfb?
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let in_path = "foo.xls";
    let out_path = "bar.xls";

    let mut original = cfb::open(in_path)?;
    let version = original.version();

    let out_file = File::create(out_path)?;
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;

    // First pass: recreate every storage whose path contains "VBA" and
    // remember the stream paths. Streams are copied afterwards because
    // `walk()` borrows `original` while `open_stream` needs `&mut`.
    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    for entry in original.walk() {
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }

    // Second pass: copy the contents of each VBA stream.
    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }
    Ok(())
}
Thanks for that source code. I ran essentially the same program, but added the calamine xls read afterwards, and it still gives the same error:
use std::fs::File;
use calamine::{open_workbook, Xls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    env_logger::init();
    let in_path = "foo.xls";
    let out_path = "bar.xls";

    let mut original = cfb::open(in_path)?;
    let version = original.version();

    let out_file = File::create(out_path).unwrap();
    let mut duplicate = cfb::CompoundFile::create_with_version(version, out_file)?;

    let mut stream_paths = Vec::<std::path::PathBuf>::new();
    for entry in original.walk() {
        if entry.path().to_str().unwrap().contains("VBA") {
            if entry.is_storage() {
                if !entry.is_root() {
                    duplicate.create_storage(entry.path())?;
                }
                duplicate.set_storage_clsid(entry.path(), entry.clsid().clone())?;
            } else {
                stream_paths.push(entry.path().to_path_buf());
            }
        }
    }

    for path in stream_paths.iter() {
        std::io::copy(
            &mut original.open_stream(path)?,
            &mut duplicate.create_new_stream(path)?,
        )?;
    }

    // Read the reduced file back with calamine; this is where the panic occurs.
    let _: Xls<_> = open_workbook("bar.xls")?;
    Ok(())
}
assertion `left == right` failed: i=3062, len=4338
left: 0
right: 3
This reduced cfb file should have most of the sensitive parts removed. If you look through the remaining information and find it acceptable to publish, can you upload it as a test case?
Without a reproducible test file the issue will have to be closed as "can't fix".
Closing due to a lack of a reproducible file. Will reopen/reinvestigate if a file is provided.