PE: Add delay import parser
Added delay-load import directory parser for PE32/64.
Using Vec as well as existing normal import parser. The design decision behind that Vec is that each import entry requires RVA resolution. I'd say it is not a good design, as both normal and delay import parser runs fully upon goblin::pe::PE::parse that can raise malformed errors -- even if the consumer doesn't need that import info.
I guess almost entire parsing logic can be moved to TryFromCtx if we could build some specific scroll ctx with a hell of lifetimes, so that utils::find_offset became callable inside the TryFromCtx. That's too much hustle but it should theoretically be possible.
/// A binary parsing context for PE parser, including the container size, underlying byte endianness and params for resolving RVAs
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct PeCtx<'a> {
pub container: Container,
pub le: scroll::Endian,
pub sections: &[section_table::SectionTable],
pub file_alignment: u32,
pub opts: &options::ParseOptions,
// full binary view that arbitrary RVA could point to
pub bytes: &'a [u8],
}
impl<'a> Iterator for DelayImportEntryIterator<'a>
impl<'a> Iterator for DelayImportDescriptorIterator<'a>
Ref:
- https://www.hexacorn.com/blog/2019/06/03/playing-with-delay-loaded-dlls/
I made a brief example implementation of my idea to have PeCtx in 2fd0a44. Consider the code an experimental code that aren't ready to ship.
I've added custom Ctx:
/// A binary parsing context for PE parser
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct PeCtx<'a> {
pub container: Container,
pub le: scroll::Endian,
pub sections: &'a [section_table::SectionTable],
pub file_alignment: u32,
pub opts: options::ParseOptions,
pub bytes: &'a [u8], // full binary view
}
The issue is that we cannot acrually borrow the &'a sections in DelayImportData<'a>. We cannot make sections: Vec<section_table::SectionTable> because Ctx must implement Copy that Vec doesn't.
The solution to that problem is to require consumers provide sections when they needed to get delay import info.
impl<'a> DelayImportDll<'a> {
pub fn functions(
&self,
sections: &'a [section_table::SectionTable],
) -> DelayImportFunctionIterator<'a> {
// Replace sections with an actual sections ref
let pectx = PeCtx::new(
self.ctx.container,
self.ctx.le,
§ions,
self.ctx.file_alignment,
self.ctx.opts,
self.ctx.bytes,
);
DelayImportFunctionIterator {
ctx: pectx,
bytes: self.bytes,
offset: 0,
descriptor: self.descriptor,
}
}
}
impl<'a> DelayImportData<'a> {
pub fn dlls(&self, sections: &'a [section_table::SectionTable]) -> DelayImportDllIterator<'a> {
// Replace sections with an actual sections ref
let pectx = PeCtx::new(
self.ctx.container,
self.ctx.le,
§ions,
self.ctx.file_alignment,
self.ctx.opts,
self.ctx.bytes,
);
DelayImportDllIterator {
ctx: pectx,
bytes: self.bytes,
offset: 0,
}
}
}
I think, the best is to have an offset of section table and parsing it when needed. So that we eliminate the odd sections parameter in public APIs.
/// A binary parsing context for PE parser
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct PeCtx<'a> {
pub container: Container,
pub le: scroll::Endian,
- pub sections: &'a [section_table::SectionTable],
+ // let sections = ctx.bytes.pread_with(ctx.sections_offset, scroll::LE)?;
+ pub sections_offset: usize,
pub file_alignment: u32,
pub opts: options::ParseOptions,
pub bytes: &'a [u8], // full binary view
}
Otherwise it looks very clean design for me. It makes parsing of delay imports deferred to when it's needed.
let res = goblin::pe::PE::parse(&buffer)?;
let di = res.delay_import_data.unwrap();
let sections = &res.sections;
let delayimports = di
.dlls(§ions)
.filter_map(Result::ok)
.flat_map(|x| x.functions(§ions))
.filter_map(Result::ok)
.collect::<Vec<_>>();
can you rebase, CI had an issue on nightly
Have not fully read over the PeCtx idea, looks interesting though; few initial notes:
- why is
The issue is that we cannot acrually borrow the &'a sections in DelayImportData<'a>
the case? is it some kind of cycle issue or? 2. for
let pectx = PeCtx::new(
self.ctx.container,
self.ctx.le,
§ions,
self.ctx.file_alignment,
self.ctx.opts,
self.ctx.bytes,
);
you can rewrite this as:
let pectx = PeCtx { sections: §ions, ..self.ctx) }
as long as all fields of PeCtx are accessible in the use context (e.g., public, or within the file if private, module if pub(crate)). In other languages this is called a "functional record update" or something to that effect, ocaml has a similar syntactic mechanism, for example.
the case? is it some kind of cycle issue or?
Nah it's the issue where sections is Vec in PE::parse that cannot outlives. &'a bytes has longer lifetime because it is being passed from consumer.
https://github.com/m4b/goblin/blob/f1b6cae5d5f474ecefc60298e15b90f82b979e39/src/pe/mod.rs#L51-L52 https://github.com/m4b/goblin/blob/f1b6cae5d5f474ecefc60298e15b90f82b979e39/src/pe/mod.rs#L107
sorry this didn't make it in for 0.10.1, but it does need a rebase; i did have some concerns about this, as it looked a little complicated, but i'll just have to re-review again when it's ready. Also I can't remember if it had breaking changes, so if it did it wouldn't have made it into 0.10.1; with this, I think i've caught up to your absolutely massive amount of PRs :D
@m4b This PR needs a direction on which way this PR should go with: A) traditional Vec or B) new PeCtx<'a> and "deferred" iterators. I personally prefer challenging on PeCtx<'a> that might also be applied to import/export parser rework in near future. WDYT?
edit: current patch includes both case.
Once #239 gets merged we could add section_table: &'a [u8] in PeCtx that will get parsed inside FromCtx and such.
/// A binary parsing context for PE parser, including the container size, underlying byte endianness and params for resolving RVAs
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct PeCtx<'a> {
pub container: Container,
pub le: scroll::Endian,
pub section_table: &'a [u8],
pub file_alignment: u32,
pub opts: &options::ParseOptions,
pub bytes: &'a [u8], // full binary view
}
impl<'a> ctx::TryFromCtx<'a, (DelayImportDescriptor, PeCtx<'a>)> for DelayImportFunction<'a> {
type Error = crate::error::Error;
fn try_from_ctx(
bytes: &'a [u8],
ctx: (DelayImportDescriptor, PeCtx<'a>),
) -> error::Result<(Self, usize)> {
let sections_it = SectionTableIterator { // Fused, ExactSize
data: ctx.section_table, // &'a [u8]
};
// Actually doesnt need to alloc here, passing iterator to utils::find_offset_ex directly or smth
let sections: Vec<SectionHeader> = sections_it.collect<Result<Vec<_>>>()?;
// Or even `ctx.section_table.pread::<&[SectionHeader]>(0)` maybe?
let name = if is_ordinal {
None
} else {
let dll_name_rva = descriptor.name_rva;
let dll_name_offset = utils::find_offset(
dll_name_rva as usize,
sections,
ctx.file_alignment,
&ctx.opts,
)
.ok_or_else(|| {
error::Error::Malformed(format!(
"cannot map delay import dll name rva {dll_name_rva:#x}"
))
})?;
let dll_name = ctx.bytes.pread::<&'a str>(dll_name_offset)?;
Some(dll_name)
};
// ...
}
}