MIPS64 parse error: "type is too big (1236271128) for 137416"

Open Anniywell opened this issue 4 years ago • 10 comments

Error file

target_mips64: ELF 64-bit LSB pie executable, MIPS, MIPS64 rel2 version 1 (SYSV), dynamically linked, interpreter /lib64/ld.so.1, BuildID[sha1]=bacc6abf4a4687c897f7b49b78a9fcbb710a107c, for GNU/Linux 3.2.0, with debug_info, not stripped

This is my code:

use goblin::{error, Object};
use std::path::Path;
use std::fs;
// use std::io;

fn run () -> error::Result<()> {
    let path = Path::new("./mips64");
    let buffer = fs::read(path)?;
    match Object::parse(&buffer)? {
        Object::Elf(elf) => {
            println!("elf: {:#?}", &elf);
        },
        Object::PE(pe) => {
            println!("pe: {:#?}", &pe);
        },
        Object::Mach(mach) => {
            println!("mach: {:#?}", &mach);
        },
        Object::Archive(archive) => {
            println!("archive: {:#?}", &archive);
        },
        Object::Unknown(magic) => { println!("unknown magic: {:#x}", magic) }
    }
    Ok(())
}

fn main() {
    let r = run();
    let _ = match r {
        Ok(a) => a,
        Err(error) => {
            panic!("{:?}", error.to_string());
        }
    };
}

Anniywell avatar May 28 '21 09:05 Anniywell

confirmed, thanks for uploading the file! likely something in the section headers or program headers claims a larger size (1.2GB) than the binary itself. It could be a zero space issue, though I thought we'd fixed that, or something else is going on. @Anniywell would you like to investigate? A simple way to test would be, e.g., to add some prints into the elf backend and then run this:

cargo run --example rdr mips64.txt

So I suspect something is wrong with this calculation:

                let max_reloc_sym = dynrelas.iter()
                    .chain(dynrels.iter())
                    .chain(pltrelocs.iter())
                    .fold(0, |num, reloc| cmp::max(num, reloc.r_sym));
                if max_reloc_sym != 0 {
                    num_syms = cmp::max(num_syms, max_reloc_sym + 1);
                }

which reports there to be 51511296 (about 51 million) symbols, which doesn't seem right :)
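The figure in the error message can be reproduced from that symbol count. This is a minimal sketch, assuming the parser sizes the dynamic symbol table as `num_syms` times the 24-byte size of an `Elf64_Sym` entry; the helper name `estimated_symtab_bytes` is hypothetical and not part of goblin's API.

```rust
/// Size in bytes of one `Elf64_Sym` entry (4 + 1 + 1 + 2 + 8 + 8).
const SIZEOF_SYM64: u64 = 24;

/// Hypothetical helper: bytes needed for a symbol table that must contain
/// every index up to and including `max_reloc_sym`.
fn estimated_symtab_bytes(max_reloc_sym: u64) -> u64 {
    let num_syms = max_reloc_sym + 1;
    num_syms * SIZEOF_SYM64
}

fn main() {
    let max_reloc_sym: u64 = 51_511_296;
    // The bogus index is 0x0312_0000, which already hints at misplaced bytes.
    println!("{:#x}", max_reloc_sym); // prints 0x3120000
    println!("{}", estimated_symtab_bytes(max_reloc_sym)); // prints 1236271128
}
```

The result equals the 1236271128 in the issue title, which suggests the bogus `r_sym` value is what pushes the computed size far past the 137416 bytes the error mentions.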

m4b avatar May 30 '21 18:05 m4b

This is a MIPS quirk, you need something like https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139 to read reloc.r_sym correctly.

Edit: or https://github.com/llvm/llvm-project/blob/119bf57ab6de49a3e61b9200c917a6d30ac6f0ad/llvm/include/llvm/Object/ELFTypes.h#L435-L444

philipc avatar May 31 '21 05:05 philipc

I added some printing; r_info is not getting the right value. This is the readelf output:

Relocation section '.rel.dyn' at offset 0x1018 contains 5 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000000  0000000000000000 R_MIPS_NONE
                    Type2: R_MIPS_NONE
                    Type3: R_MIPS_NONE
00000000000150f0  0000000000001203 R_MIPS_REL32
                    Type2: R_MIPS_64
                    Type3: R_MIPS_NONE
00000000000150f8  0000000000001203 R_MIPS_REL32
                    Type2: R_MIPS_64
                    Type3: R_MIPS_NONE
00000000000152a8  0000000000001203 R_MIPS_REL32
                    Type2: R_MIPS_64
                    Type3: R_MIPS_NONE
0000000000015110  0000002700001203 R_MIPS_REL32           0000000000000000 __gxx_personality_v0@CXXABI_1.3
                    Type2: R_MIPS_64
                    Type3: R_MIPS_NONE

This is example code output:

dynrels: Reloc { r_offset: 0, r_addend: 0, r_sym: 0, r_type: 0 }
dynrels: Reloc { r_offset: 150f0, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 150f8, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 152a8, r_addend: 0, r_sym: 51511296, r_type: 0 }
dynrels: Reloc { r_offset: 15110, r_addend: 0, r_sym: 51511296, r_type: 39 }
offset: 0x0, info 0x0
offset: 0x150f0, info 0x312000000000000
offset: 0x150f8, info 0x312000000000000
offset: 0x152a8, info 0x312000000000000
offset: 0x15110, info 0x312000000000027
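The `r_sym` and `r_type` values printed above follow directly from applying the generic ELF64 split (upper 32 bits for the symbol, lower 32 bits for the type) to the `info` values in this output. A minimal sketch of that decoding, with free functions standing in for the library code:

```rust
/// Generic ELF64 decoding: symbol index in the upper 32 bits.
fn r_sym(info: u64) -> u32 {
    (info >> 32) as u32
}

/// Generic ELF64 decoding: relocation type in the lower 32 bits.
fn r_type(info: u64) -> u32 {
    (info & 0xffff_ffff) as u32
}

fn main() {
    // The two distinct non-zero `info` values from the output above.
    for info in [0x0312_0000_0000_0000u64, 0x0312_0000_0000_0027u64] {
        println!(
            "info {:#018x}: r_sym = {}, r_type = {}",
            info,
            r_sym(info),
            r_type(info)
        );
    }
}
```

Both values yield `r_sym = 51511296`, and the last one yields `r_type = 39`, whereas readelf reports symbol index 0x27 (39) and type R_MIPS_REL32 for that entry. The fields are therefore landing in the wrong positions.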

Anniywell avatar Jun 01 '21 02:06 Anniywell

as philipc said, someone will have to make a PR adding this functionality for mips quirks apparently; @Anniywell would you like to submit one ? :)

m4b avatar Jun 01 '21 04:06 m4b

as philipc said, someone will have to make a PR adding this functionality for mips quirks apparently; @Anniywell would you like to submit one ? :)

I tried to read the data of .rel.dyn in Python and found that the byte order of offset and info is not consistent. The ELF is little endian, but judging from the read result the info part is big endian. I think this is the reason for the incorrect info value, but I know how to fix it.

In [115]: f.seek(4120)
Out[115]: 4120

In [116]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[116]: '0x0000000000000000: 0x0000000000000000'

In [117]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[117]: '0xf050010000000000: 0x0000000000001203'

In [118]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[118]: '0xf850010000000000: 0x0000000000001203'

In [119]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[119]: '0xa852010000000000: 0x0000000000001203'

In [120]: '0x{}: 0x{}'.format(f.read(8).hex(),f.read(8).hex())
Out[120]: '0x1051010000000000: 0x2700000000001203'
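The raw bytes read above are consistent with the MIPS64 layout described in the linked LLVM header: rather than one 64-bit integer, `r_info` is a 32-bit symbol index followed by four single-byte fields. The sketch below parses the last entry under that assumption; the struct and function names are illustrative only.

```rust
/// Fields of a MIPS64 `r_info`, in the order they appear in the file.
#[derive(Debug, PartialEq)]
struct Mips64Info {
    r_sym: u32,  // symbol index, little endian on mips64el
    r_ssym: u8,  // special symbol
    r_type3: u8, // third relocation type
    r_type2: u8, // second relocation type
    r_type: u8,  // first relocation type
}

/// Illustrative parser for the 8 raw `r_info` bytes of a mips64el file.
fn parse_mips64el_info(bytes: [u8; 8]) -> Mips64Info {
    Mips64Info {
        r_sym: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        r_ssym: bytes[4],
        r_type3: bytes[5],
        r_type2: bytes[6],
        r_type: bytes[7],
    }
}

fn main() {
    // Raw bytes of the last entry: 27 00 00 00 00 00 12 03
    let raw = [0x27, 0x00, 0x00, 0x00, 0x00, 0x00, 0x12, 0x03];
    println!("{:?}", parse_mips64el_info(raw));
}
```

This gives symbol index 39 (0x27), type 3 (R_MIPS_REL32), type2 18 (R_MIPS_64) and type3 0 (R_MIPS_NONE), which agrees with the readelf listing for the `__gxx_personality_v0` entry.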

Anniywell avatar Jun 01 '21 06:06 Anniywell

This is a MIPS quirk, you need something like https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139 to read reloc.r_sym correctly.

Edit: or https://github.com/llvm/llvm-project/blob/119bf57ab6de49a3e61b9200c917a6d30ac6f0ad/llvm/include/llvm/Object/ELFTypes.h#L435-L444

Thank you for providing the relevant code of LLVM, which answers my doubts.

Anniywell avatar Jun 02 '21 01:06 Anniywell

as philipc said, someone will have to make a PR adding this functionality for mips quirks apparently;

I'd love to see this issue fixed, @m4b can you elaborate on how to fix this? I'm not sure where to get started.

messense avatar Jul 27 '22 13:07 messense

yes let's get it fixed! so reading over comments from this thread, I think we have enough information for some kind soul to fix this, @messense would you be interested in making the PR?

I believe someone just has to implement this mips quirk when parsing, iiuc, as linked already: https://github.com/gimli-rs/object/blob/832f5277a0774de18f84e7c27401f974e4b6a064/src/elf.rs#L1129-L1139

or inlined:

    pub(crate) fn get_r_info(&self, endian: E, is_mips64el: bool) -> u64 {
        let mut t = self.r_info.get(endian);
        if is_mips64el {
            t = (t << 32)
                | ((t >> 8) & 0xff000000)
                | ((t >> 24) & 0x00ff0000)
                | ((t >> 40) & 0x0000ff00)
                | ((t >> 56) & 0x000000ff);
        }
        t
    }

I'd have to look closely at what the bit shifting logic is doing, I haven't thought deeply about it, maybe there is no logic, but a quirk as philipc said :shrug:
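Working the shifts through by hand, the expression appears to do two things: it moves the low 32 bits (where the little-endian read put `r_sym`) into the high half, and it byte-swaps the high 32 bits into the low half so the four type bytes end up in the usual order. The sketch below checks that reading against one value from this thread; `fix_info_shifts` copies the quoted expression, while `fix_info_swap` is my own equivalent formulation and not code from either library.

```rust
/// The bit manipulation quoted above, as a free function.
fn fix_info_shifts(t: u64) -> u64 {
    (t << 32)
        | ((t >> 8) & 0xff000000)
        | ((t >> 24) & 0x00ff0000)
        | ((t >> 40) & 0x0000ff00)
        | ((t >> 56) & 0x000000ff)
}

/// Equivalent formulation: low word goes to the top, high word is
/// byte-swapped and goes to the bottom.
fn fix_info_swap(t: u64) -> u64 {
    let low = t as u32;
    let high = (t >> 32) as u32;
    (u64::from(low) << 32) | u64::from(high.swap_bytes())
}

fn main() {
    // `info` as read little-endian for the last .rel.dyn entry.
    let raw: u64 = 0x0312_0000_0000_0027;
    let fixed = fix_info_shifts(raw);
    println!("{:#018x}", fixed); // prints 0x0000002700001203
    assert_eq!(fixed, fix_info_swap(raw));
}
```

The fixed value matches the Info column that readelf prints for that entry (0000002700001203), after which the ordinary `>> 32` and `& 0xffff_ffff` accessors give symbol 0x27 and type bytes 00 00 12 03.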

m4b avatar Jul 30 '22 04:07 m4b

I believe someone just has to implement this mips quirk when parsing

Yes, I've read the thread and I understand that. But I'm not sure where to put the code and what's your preferred way to pass down the mips64el architecture information.

messense avatar Aug 02 '22 05:08 messense

diff --git a/src/elf/reloc.rs b/src/elf/reloc.rs
index d8a1df9..58ed236 100644
--- a/src/elf/reloc.rs
+++ b/src/elf/reloc.rs
@@ -253,14 +253,26 @@ pub mod reloc64 {
     pub const SIZEOF_RELA: usize = 8 + 8 + 8;
     pub const SIZEOF_REL: usize = 8 + 8;

+    pub fn get_info(info: u64) -> u64 {
+        let mut t = info;
+        t = (t << 32)
+            | ((t >> 8) & 0xff000000)
+            | ((t >> 24) & 0x00ff0000)
+            | ((t >> 40) & 0x0000ff00)
+            | ((t >> 56) & 0x000000ff);
+        t
+    }
+
     #[inline(always)]
     pub fn r_sym(info: u64) -> u32 {
-        (info >> 32) as u32
+        let trans_info = get_info(info);
+        (trans_info >> 32) as u32
     }

     #[inline(always)]
     pub fn r_type(info: u64) -> u32 {
-        (info & 0xffff_ffff) as u32
+        let trans_info = get_info(info);
+        (trans_info & 0xffff_ffff) as u32
     }

     #[inline(always)]

@m4b maybe we can do this for mips64

wynemo avatar Oct 14 '23 14:10 wynemo
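One caveat with the patch as posted: it applies the transformation to every 64-bit relocation, which would presumably scramble `r_info` for other architectures. A possible variant is sketched below, assuming the caller can supply the header's machine, class and data-encoding values. The function names and the extra `is_mips64el` parameter are hypothetical; how goblin would actually pass this information down is the open question from earlier in the thread.

```rust
// Standard ELF header constants.
const EM_MIPS: u16 = 8;
const ELFCLASS64: u8 = 2;
const ELFDATA2LSB: u8 = 1;

/// True only for 64-bit little-endian MIPS, the one case with the quirk.
fn is_mips64el(e_machine: u16, ei_class: u8, ei_data: u8) -> bool {
    e_machine == EM_MIPS && ei_class == ELFCLASS64 && ei_data == ELFDATA2LSB
}

/// Hypothetical: rearrange `info` into the generic layout when needed.
fn normalize_info(info: u64, is_mips64el: bool) -> u64 {
    if is_mips64el {
        // Low word becomes the symbol; high word is byte-swapped into the types.
        (info << 32) | u64::from(((info >> 32) as u32).swap_bytes())
    } else {
        info
    }
}

fn r_sym(info: u64, is_mips64el: bool) -> u32 {
    (normalize_info(info, is_mips64el) >> 32) as u32
}

fn r_type(info: u64, is_mips64el: bool) -> u32 {
    (normalize_info(info, is_mips64el) & 0xffff_ffff) as u32
}

fn main() {
    let mips = is_mips64el(EM_MIPS, ELFCLASS64, ELFDATA2LSB);
    let raw: u64 = 0x0312_0000_0000_0027;
    println!("sym {} type {:#x}", r_sym(raw, mips), r_type(raw, mips));
    // Without the flag the value is decoded with the generic layout.
    println!("sym {} type {:#x}", r_sym(raw, false), r_type(raw, false));
}
```

With the flag set, the entry from this thread decodes to symbol 39 and type 0x1203; without it, decoding falls back to the generic ELF64 split.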