fix: Prevent panic in TLS parser when bounds are invalid
The TLS parser would panic when encountering invalid bounds in the parse_with_opts function, particularly when analyzing malformed binaries like 02179f3ba93663074740b5c0d283bae2.
This commit adds safety checks to prevent panics when:
- The start_address_of_raw_data and end_address_of_raw_data pointers result in offsets beyond the byte slice length
- Reading callbacks encounters invalid memory regions
Instead of panicking, the code now logs warnings and continues with partial TLS data when possible, making the parser more resilient against malformed files while still providing diagnostic information.
This fix ensures the analyzer can process suspicious binaries without crashing.
sorry, you'll have to rebase; i suggest:
git pull --rebase
and then rebase onto my master. then will have to unfortunately fix the conflicts :(
Thanks for the feedback.
You're absolutely right — we shouldn’t be using println! or eprintln! in library code, which is why they were replaced with debug! and error! respectively.
As for the panic: you're correct that most of the code paths already use ? to propagate errors, but the issue occurs in specific edge cases within the parse_with_opts function in the TLS parser when analyzing malformed or suspicious binaries (02179f3ba93663074740b5c0d283bae2)
What's causing the panic:
start_address_of_raw_data and end_address_of_raw_data can point beyond the actual binary buffer.
When slicing with these values (e.g. bytes[offset..offset + size]), Rust panics due to an out-of-bounds access — this isn't captured by Result or ?, hence the crash.
This PR adds explicit bounds checks, and instead of panicking, it:
Emits an error! log message when invalid data is detected.
Skips over invalid TLS structures while continuing parsing.
Tries to return partial TLS info where safe.
This makes the parser resilient against malformed inputs (e.g. fuzzed or malicious binaries) and prevents total crashes during analysis, without compromising normal parsing.
Also please rebase the CI is not running and it’s fixed in latest master
Nvm CI is running, but is failing because rustfmt was not run
Sorry for interruption, the TLS code contributor here.
I can confirm your sample 02179f3ba93663074740b5c0d283bae2.exe_ is triggering the panic in question. And sorry for causing issues; it's on me. This can be fixed in a couple lines of changes I pasted below.
Since I was aware of this case where start_address_of_raw_data may only be present in mapped PE, it natively supports absense of the field when it's the case. The binary in question is not really a "malformed," from TLS perspective, ~~but rather a great linker behavior to decrease the size of binary in disk.~~ Nvm, it is packed binary, not a linker work.
I hope it helps 🙏
diff --git a/src/pe/tls.rs b/src/pe/tls.rs
index c19e5d3..bda7f3e 100644
--- a/src/pe/tls.rs
+++ b/src/pe/tls.rs
@@ -236,12 +236,10 @@ impl<'a> TlsData<'a> {
})?;
if offset > bytes.len() || offset_end > bytes.len() {
- return Err(error::Error::Malformed(format!(
- "tls raw data offset ({:#x}) and size ({:#x}) greater than byte slice len ({:#x})",
- offset, size, bytes.len()
- )));
+ raw_data = None;
+ } else {
+ raw_data = Some(&bytes[offset..offset + size as usize]);
}
- raw_data = Some(&bytes[offset..offset + size as usize]);
}
// Parse the index if any
`diff --git a/src/pe/relocation.rs b/src/pe/relocation.rs
index c418fb0..9b2f593 100644 --- a/src/pe/relocation.rs +++ b/src/pe/relocation.rs @@ -339,28 +339,35 @@ impl<'a> RelocationData<'a> { file_alignment: u32, opts: &options::ParseOptions, ) -> error::Result<Self> {
-
let offset = -
utils::find_offset(dd.virtual_address as usize, sections, file_alignment, opts) -
.ok_or_else(|| { -
"Cannot map base reloc rva {:#x} into offset", -
dd.virtual_address -
)) -
})?;
-
let offset = utils::find_offset(dd.virtual_address as usize, sections, file_alignment, opts) -
.ok_or_else(|| { -
error::Error::Malformed(format!( -
"Cannot map base reloc rva {:#x} into offset", -
dd.virtual_address -
)) -
})?; -
let max_size = bytes.len().saturating_sub(offset); -
let size_to_use = std::cmp::min(dd.size as usize, max_size); -
if dd.size as usize > max_size { -
log::warn!("Truncated base reloc: offset {:x} + size {:x} > file size {:x}", offset, dd.size, bytes.len()); -
return Ok(Self { bytes: &[] }); -
} -
let bytes = bytes[offset..]
-
.pread::<&[u8]>(dd.size as usize)
-
.pread::<&[u8]>(size_to_use) .map_err(|_| { error::Error::Malformed(format!(
-
"base reloc offset {:#x} and size {:#x} exceeds the bounds of the bytes size {:#x}", -
offset, -
dd.size, -
bytes.len()
-
"Could not read {} bytes at offset {:#x}", -
size_to_use, offset )) })?;
:
-
"Cannot map base reloc rva {:#x} into offset", -
dd.virtual_address -
)) -
})?;
-
let offset = utils::find_offset(dd.virtual_address as usize, sections, file_alignment, opts) -
.ok_or_else(|| { -
error::Error::Malformed(format!( -
"Cannot map base reloc rva {:#x} into offset", -
dd.virtual_address -
)) -
})?; -
let max_size = bytes.len().saturating_sub(offset); -
let size_to_use = std::cmp::min(dd.size as usize, max_size); -
if dd.size as usize > max_size { -
log::warn!("Truncated base reloc: offset {:x} + size {:x} > file size {:x}", offset, dd.size, bytes.len()); -
return Ok(Self { bytes: &[] }); -
} -
let bytes = bytes[offset..]
-
.pread::<&[u8]>(dd.size as usize)
-
.pread::<&[u8]>(size_to_use) .map_err(|_| { error::Error::Malformed(format!(
-
"base reloc offset {:#x} and size {:#x} exceeds the bounds of the bytes size {:#x}", -
offset, -
dd.size, -
bytes.len()
-
"Could not read {} bytes at offset {:#x}", -
} ` Thanks for the detailed explanation and context 🙏size_to_use, offset )) })?; Ok(Self { bytes })
If you think there's a better way to handle this, I'm totally open to suggestions — this is just the solution I came up with that allowed me to run capa on the sample without any runtime errors.
The idea is to preserve forward progress in the analysis, even if the base reloc section can't be read due to it being outside the file bounds (as is the case with this packed sample). Returning an empty slice felt like the least intrusive option.
If you come up with a better approach that still allows execution without returning an error, just let me know!
@jorgeaduran Hi there.
FYI; The basereloc parser bug will be fixed in #465 very soon. :)
Hi @kkent030315, thanks for the information about the fix coming in PR #465. I'll keep my PR open until that version is merged and released, as this functionality is necessary for analyzing files with capa-rs. Appreciate the heads-up!
does this still need to be merged and reviewed? I see another PR credited this PR with a simple fix for at least one of the things this fixes? (Just triaging open PRs)
going to close this now as I believe the fix was merged in #465 ; feel free to reopen if this is not true