goblin
MIPS64 parse error: "type is too big (1236271128) for 137416"
target_mips64: ELF 64-bit LSB pie executable, MIPS, MIPS64 rel2 version 1 (SYSV), dynamically linked, interpreter /lib64/ld.so.1, BuildID[sha1]=bacc6abf4a4687c897f7b49b78a9fcbb710a107c, for GNU/Linux 3.2.0, with debug_info, not stripped
This is my code:
use goblin::{error, Object};
use std::fs;
use std::path::Path;
// use std::io;

fn run() -> error::Result<()> {
    let path = Path::new("./mips64");
    let buffer = fs::read(path)?;
    match Object::parse(&buffer)? {
        Object::Elf(elf) => {
            println!("elf: {:#?}", &elf);
        },
        Object::PE(pe) => {
            println!("pe: {:#?}", &pe);
        },
        Object::Mach(mach) => {
            println!("mach: {:#?}", &mach);
        },
        Object::Archive(archive) => {
            println!("archive: {:#?}", &archive);
        },
        Object::Unknown(magic) => { println!("unknown magic: {:#x}", magic) }
    }
    Ok(())
}

fn main() {
    let r = run();
    let _ = match r {
        Ok(a) => a,
        Err(error) => {
            panic!("{:?}", error.to_string());
        }
    };
}
Confirmed, thanks for uploading the file! Most likely something in the section headers or program headers claims a size (about 1.2 GB) that is larger than the binary itself. It could be the zero-space issue, though I thought we'd fixed that, or something else is going on. @Anniywell would you like to investigate? A simple way to test would be to add some prints to the ELF backend and then run this:
cargo run --example rdr mips64.txt
So I suspect something is wrong with this calculation:
let max_reloc_sym = dynrelas.iter()
    .chain(dynrels.iter())
    .chain(pltrelocs.iter())
    .fold(0, |num, reloc| cmp::max(num, reloc.r_sym));
if max_reloc_sym != 0 {
    num_syms = cmp::max(num_syms, max_reloc_sym + 1);
}
which reports 51511296 (about 51 million) symbols, which doesn't seem right :)
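To make the numbers concrete, here is a minimal sketch (not goblin's actual code) that feeds the raw info values printed later in this thread through the standard ELF64 r_sym extraction and the same fold. The helper name estimate_num_syms is hypothetical, and the 24-byte figure assumes standard Elf64_Sym entries. Under those assumptions the resulting byte count happens to equal the size in the error message.

```rust
use std::cmp;

/// Size in bytes of one Elf64_Sym entry (assumed standard layout).
const SIZEOF_SYM64: usize = 24;

/// Standard ELF64 extraction of the symbol index from r_info.
fn r_sym(info: u64) -> u32 {
    (info >> 32) as u32
}

/// Hypothetical helper mirroring the fold above: the symbol count is
/// at least the largest symbol index referenced by a relocation, plus one.
fn estimate_num_syms(num_syms: usize, infos: &[u64]) -> usize {
    let max_reloc_sym = infos
        .iter()
        .fold(0usize, |num, &info| cmp::max(num, r_sym(info) as usize));
    if max_reloc_sym != 0 {
        cmp::max(num_syms, max_reloc_sym + 1)
    } else {
        num_syms
    }
}

fn main() {
    // Raw info values as read by a plain little-endian u64 load.
    let infos: [u64; 3] = [0x0, 0x0312_0000_0000_0000, 0x0312_0000_0000_0027];
    let num_syms = estimate_num_syms(0, &infos);
    println!("num_syms = {}", num_syms); // 51511297
    println!("symtab bytes = {}", num_syms * SIZEOF_SYM64); // 1236271128
}
```

So the bogus r_sym of 51511296 alone is enough to ask for a 1.2 GB symbol table out of a 137416-byte file.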
This is a MIPS quirk, you need something like https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139 to read reloc.r_sym correctly.
Edit: or https://github.com/llvm/llvm-project/blob/119bf57ab6de49a3e61b9200c917a6d30ac6f0ad/llvm/include/llvm/Object/ELFTypes.h#L435-L444
I added some printing; r_info is not getting the right value. This is the readelf output:
Relocation section '.rel.dyn' at offset 0x1018 contains 5 entries:
Offset Info Type Symbol's Value Symbol's Name
0000000000000000 0000000000000000 R_MIPS_NONE
Type2: R_MIPS_NONE
Type3: R_MIPS_NONE
00000000000150f0 0000000000001203 R_MIPS_REL32
Type2: R_MIPS_64
Type3: R_MIPS_NONE
00000000000150f8 0000000000001203 R_MIPS_REL32
Type2: R_MIPS_64
Type3: R_MIPS_NONE
00000000000152a8 0000000000001203 R_MIPS_REL32
Type2: R_MIPS_64
Type3: R_MIPS_NONE
0000000000015110 0000002700001203 R_MIPS_REL32 0000000000000000 __gxx_personality_v0@CXXABI_1.3
Type2: R_MIPS_64
Type3: R_MIPS_NONE
This is example code output:
dynrels: Reloc { r_offset: 0, r_addend: 0, r_sym: 0, r_type: 0 }
dynrels: Reloc { r_offset: 150f0, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 150f8, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 152a8, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 15110, r_addend: 0, r_sym: 51511296, r_type: 39 }
offset: 0x0, info 0x0
offset: 0x150f0, info 0x312000000000000
offset: 0x150f8, info 0x312000000000000
offset: 0x152a8, info 0x312000000000000
offset: 0x15110, info 0x312000000000027
as philipc said, someone will have to make a PR adding this functionality for mips quirks apparently; @Anniywell would you like to submit one ? :)
I tried reading the .rel.dyn data in Python and found that the byte order of offset and info is not consistent. The ELF is little endian, but the info part looks big endian in what I read. I think this is the reason for the incorrect info value, but I know how to fix it.
In [115]: f.seek(4120)
Out[115]: 4120
In [116]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[116]: '0x0000000000000000: 0x0000000000000000'
In [117]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[117]: '0xf050010000000000: 0x0000000000001203'
In [118]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[118]: '0xf850010000000000: 0x0000000000001203'
In [119]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[119]: '0xa852010000000000: 0x0000000000001203'
In [120]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[120]: '0x1051010000000000: 0x2700000000001203'
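For reference, the bytes above are consistent with the 64-bit little-endian MIPS layout, where the info word is not a single u64 but a 32-bit symbol index followed by four one-byte fields. The sketch below decodes the last entry that way. The struct and function names are made up for illustration and are not part of goblin.

```rust
/// Fields of the info word as laid out on disk for 64-bit little-endian MIPS.
/// (Illustrative type, not a goblin API.)
#[derive(Debug, PartialEq)]
struct Mips64elInfo {
    r_sym: u32,  // symbol index, little-endian u32
    r_ssym: u8,  // special symbol
    r_type3: u8, // third relocation type
    r_type2: u8, // second relocation type
    r_type: u8,  // first relocation type
}

fn decode_mips64el_info(bytes: [u8; 8]) -> Mips64elInfo {
    Mips64elInfo {
        r_sym: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        r_ssym: bytes[4],
        r_type3: bytes[5],
        r_type2: bytes[6],
        r_type: bytes[7],
    }
}

fn main() {
    // The info bytes of the last .rel.dyn entry, in file order.
    let bytes = [0x27, 0x00, 0x00, 0x00, 0x00, 0x00, 0x12, 0x03];
    // A plain little-endian u64 load gives the value the example code printed.
    println!("{:#x}", u64::from_le_bytes(bytes)); // 0x312000000000027
    // Field-by-field decoding gives symbol 39 with types 3, 18 and 0,
    // matching readelf's R_MIPS_REL32 / R_MIPS_64 / R_MIPS_NONE.
    println!("{:?}", decode_mips64el_info(bytes));
}
```

The type bytes only look big endian because r_type is stored last, so the info word is not byte-swapped as a whole.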
This is a MIPS quirk, you need something like https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139 to read reloc.r_sym correctly.
Edit: or https://github.com/llvm/llvm-project/blob/119bf57ab6de49a3e61b9200c917a6d30ac6f0ad/llvm/include/llvm/Object/ELFTypes.h#L435-L444
Thank you for providing the relevant code of LLVM, which answers my doubts.
as philipc said, someone will have to make a PR adding this functionality for mips quirks apparently;
I'd love to see this issue fixed, @m4b can you elaborate on how to fix this? I'm not sure where to get started.
yes let's get it fixed! so reading over comments from this thread, I think we have enough information for some kind soul to fix this, @messense would you be interested in making the PR?
I believe someone just has to implement this mips quirk when parsing, iiuc, as linked already: https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139
or inlined:
pub(crate) fn get_r_info(&self, endian: E, is_mips64el: bool) -> u64 {
    let mut t = self.r_info.get(endian);
    if is_mips64el {
        t = (t << 32)
            | ((t >> 8) & 0xff000000)
            | ((t >> 24) & 0x00ff0000)
            | ((t >> 40) & 0x0000ff00)
            | ((t >> 56) & 0x000000ff);
    }
    t
}
I'd have to look closely at what the bit shifting logic is doing, I haven't thought deeply about it, maybe there is no logic, but a quirk as philipc said :shrug:
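As a quick check of what the shifts do, here is a standalone sketch that applies the same expression to the raw value printed earlier in the thread (0x312000000000027). The function name mips64el_fix_info is invented for this example. As far as I can tell, the expression moves the low 32 bits (the symbol index) into the high half and byte-reverses the old high half into the low half, which turns the raw value into the info readelf shows.

```rust
/// Same bit manipulation as the inlined snippet, as a free function.
/// (The name is made up for this sketch.)
fn mips64el_fix_info(t: u64) -> u64 {
    (t << 32)                        // low 32 bits (r_sym) move to the high half
        | ((t >> 8) & 0xff00_0000)   // byte 4 (r_ssym)  -> bits 24..32
        | ((t >> 24) & 0x00ff_0000)  // byte 5 (r_type3) -> bits 16..24
        | ((t >> 40) & 0x0000_ff00)  // byte 6 (r_type2) -> bits 8..16
        | ((t >> 56) & 0x0000_00ff)  // byte 7 (r_type)  -> bits 0..8
}

fn main() {
    let raw = 0x0312_0000_0000_0027u64;
    let fixed = mips64el_fix_info(raw);
    println!("{:#018x}", fixed); // 0x0000002700001203, same as readelf's Info column
    println!("r_sym = {}, r_type = {:#x}", fixed >> 32, fixed & 0xffff_ffff);

    // Equivalent formulation: swap the halves and byte-reverse the old high half.
    let alt = (raw << 32) | u64::from(((raw >> 32) as u32).swap_bytes());
    println!("same result: {}", alt == fixed);
}
```

With the fixed value, the usual `info >> 32` and `info & 0xffff_ffff` extraction gives symbol 39 and type word 0x1203, where the low byte 0x03 is R_MIPS_REL32 and the next byte 0x12 is R_MIPS_64.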
I believe someone just has to implement this mips quirk when parsing
Yes, I've read the thread and I understand that. But I'm not sure where to put the code and what's your preferred way to pass down the mips64el architecture information.
diff --git a/src/elf/reloc.rs b/src/elf/reloc.rs
index d8a1df9..58ed236 100644
--- a/src/elf/reloc.rs
+++ b/src/elf/reloc.rs
@@ -253,14 +253,26 @@ pub mod reloc64 {
pub const SIZEOF_RELA: usize = 8 + 8 + 8;
pub const SIZEOF_REL: usize = 8 + 8;
+ pub fn get_info(info: u64) -> u64 {
+ let mut t = info;
+ t = (t << 32)
+ | ((t >> 8) & 0xff000000)
+ | ((t >> 24) & 0x00ff0000)
+ | ((t >> 40) & 0x0000ff00)
+ | ((t >> 56) & 0x000000ff);
+ t
+ }
+
#[inline(always)]
pub fn r_sym(info: u64) -> u32 {
- (info >> 32) as u32
+ let trans_info = get_info(info);
+ (trans_info >> 32) as u32
}
#[inline(always)]
pub fn r_type(info: u64) -> u32 {
- (info & 0xffff_ffff) as u32
+ let trans_info = get_info(info);
+ (trans_info & 0xffff_ffff) as u32
}
#[inline(always)]
@m4b maybe we can do this for mips64
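One caveat with the diff as written is that it rearranges info for every 64-bit ELF, not only MIPS. A rough sketch of gating it on the target could look like the following. The signatures with an extra is_mips64el argument are hypothetical and are not goblin's current API. How the flag gets threaded through the parser is left open. EM_MIPS = 8 is the standard e_machine value.

```rust
/// e_machine value for MIPS.
const EM_MIPS: u16 = 8;

/// Hypothetical helper: true for 64-bit little-endian MIPS objects.
fn is_mips64el(e_machine: u16, is_64: bool, little_endian: bool) -> bool {
    e_machine == EM_MIPS && is_64 && little_endian
}

/// Rearrange r_info only when the quirk applies; otherwise return it unchanged.
fn fix_info(info: u64, is_mips64el: bool) -> u64 {
    if is_mips64el {
        // Symbol index to the high half, old high half byte-reversed into the low half.
        (info << 32) | u64::from(((info >> 32) as u32).swap_bytes())
    } else {
        info
    }
}

/// Hypothetical variants of r_sym / r_type taking the flag explicitly.
fn r_sym(info: u64, is_mips64el: bool) -> u32 {
    (fix_info(info, is_mips64el) >> 32) as u32
}

fn r_type(info: u64, is_mips64el: bool) -> u32 {
    (fix_info(info, is_mips64el) & 0xffff_ffff) as u32
}

fn main() {
    let mips = is_mips64el(EM_MIPS, true, true);
    // Raw value from the last .rel.dyn entry in this thread.
    println!("mips64el r_sym = {}", r_sym(0x0312_0000_0000_0027, mips)); // 39
    // A made-up ELF64 info from another architecture stays untouched.
    println!("other r_sym = {}", r_sym(0x0000_0005_0000_0007, false)); // 5
}
```

Whether the flag is an extra parameter, a field on the parsing context, or something else is a design choice for the maintainers.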