unicorn icon indicating copy to clipboard operation
unicorn copied to clipboard

Rust bindings are not exaustive and do not exploit rust typing

Open amaelFr opened this issue 9 months ago • 10 comments

Hello

I'm a regular user of your fantastic project.

A few month ago I start used it more often and mainly in rust. Where I find out the binding is not exploiting the full capabilities of it (mainly typing). Ex: Registers enum are pass to read/ write function under Into<i32> Moreover the reg_read wrapper don't let us read CP_REG (for exemple under arm64)

if I may suggest you a little work around using a trait

Trait Register {
    fn id(&self) -> i32;

    fn get_val_ptr(&self) -> *mut core::ffi::c_void {
        let val:alloc::boxed::Box<u64> = alloc::boxed::Box::new(0u64);
        alloc::boxed::Box::into_raw(val) as *mut core::ffi::c_void
    }

    fn get_val(&self, val_ptr: *const core::ffi::c_void) -> u64 {
        unsafe{ *(val_ptr as *const u64) }
    }
}

#[macro_export]
macro_rules! register_impl {
    ($name:ident) => {
        impl crate::Register for $name {
            fn id(&self) -> i32 {
                (*self).into()
            }
        }
    }
}

the trait will be used in reg_read/ write

and will be implement for a CPRegisterARM64 struct

Such impl using return a ref to the heap is not a proper way

An other more beautifull path i think is the used of a traint combined with any trait

use std::any::Any;
trait Register {
    fn id(&self) -> i32;
}

trait AnyRegister: Any+Register {}
impl<T: Any + Register> AnyRegister for T {}

#[macro_export]
macro_rules! register_impl {
    ($name:ident) => {
        impl crate::Register for $name {
            fn id(&self) -> i32 {
                (*self).into()
            }
        }
    }
}

In such way which will let us develop fucntion in a more regular way

    pub fn reg_read(&self, reg: &(dyn AnyRegister)) -> Result<u64, uc_error> // reg_no_ref
    {
        Ok(match reg.type_id() {
            #[cfg(feature="aarch64")]
            TypeId::of::<CPRegisterARM64> => { 
                let cp_reg:_CPRegisterARM64 = reg.downcast_ref_unchecked::<CPRegisterARM64>().clone().into();
                unsafe { ffi::uc_reg_read(self.get_handle(), RegisterARM64::CP_REG as i32, &mut cp_reg_id as *mut _ as *mut core::ffi::c_void) };
                    cp_reg_id
            }

            _ => {
                let mut value: u64 = 0;
                unsafe { ffi::uc_reg_read(self.get_handle(), reg.id(), &mut value as *mut u64 as _) };
                    value
            }
        })
    }

Let me know if you are interested in a way or another in such code

Have a good day

amaelFr avatar May 27 '25 18:05 amaelFr

Thanks for the suggestions and this is indeed a big limitation.

However, I dislike to have dyn AnyRegister and we should have reg_read<T: ...>(&self, reg: T) or so.

wtdcode avatar May 28 '25 02:05 wtdcode

Indeed it seems a prettier way

Is this way good for you ?

use std::any::Any;
trait Register: Any {
    fn id(&self) -> i32;
}

#[macro_export]
macro_rules! register_impl {
    ($name:ident) => {
        impl crate::Register for $name {
            fn id(&self) -> i32 {
                (*self).into()
            }

            // fn is_cp_reg(&self) -> bool {
            //     false
            // }
        }
    }
}

    pub fn reg_read<T: Register + ?Sized>(&self, reg: &T) -> Result<u64, uc_error> // reg_no_ref
    {
        Ok(match reg.type_id() {
            #[cfg(feature="aarch64")]
            TypeId::of::<CPRegisterARM64> => { 
                let cp_reg:_CPRegisterARM64 = reg.downcast_ref_unchecked::<CPRegisterARM64>().clone().into();
                unsafe { ffi::uc_reg_read(self.get_handle(), RegisterARM64::CP_REG as i32, &mut cp_reg_id as *mut _ as *mut core::ffi::c_void) };
                    cp_reg_id
            }

            _ => {
                let mut value: u64 = 0;
                unsafe { ffi::uc_reg_read(self.get_handle(), reg.id(), &mut value as *mut u64 as _) };
                    value
            }
        })
    }

    pub fn reg_write<T: Register + ?Sized>(&mut self, reg: &T, value: u64) -> Result<(), uc_error> {
        match reg.type_id() {
            #[cfg(feature="aarch64")]
            TypeId::of::<CPRegisterARM64> => { 
                let mut cp_reg:_CPRegisterARM64 = reg.downcast_ref_unchecked::<CPRegisterARM64>().clone().into();
                cp_reg.val = value;
                unsafe { ffi::uc_reg_write(self.get_handle(), RegisterARM64::CP_REG.id(), &mut cp_reg_id as *mut _ as *mut core::ffi::c_void) }
                    .into()
            }

            _ => {
                unsafe { ffi::uc_reg_write(self.get_handle(), reg.id(), &value as *const _ as _) }
                    .into()
            }
        }
    }

amaelFr avatar May 28 '25 05:05 amaelFr

:Any seems not really necessary

wtdcode avatar May 28 '25 06:05 wtdcode

Also it is possible to return an enum, right?

wtdcode avatar May 28 '25 06:05 wtdcode

Yes exactly, the enum this is probably the better way But this implied to add all register under an enum and pass the register through the enum every time.

Which will lead to .into()or Register::ARM64(RegisterARM::PC) almost everywhere

Nb: Enum will have the advantage to be returning directly and not from a box For example like in the function which returns the pc register based on the arch.

amaelFr avatar May 28 '25 06:05 amaelFr

Concerning the any trait I put it in order to downcast the ref for specific cp reg which required struct to be read / write

But yes this is not the good way to do it, but based on how it was, it was an implementation which limit the changes

amaelFr avatar May 28 '25 06:05 amaelFr

I think restricting the type of input to reg read/write functions to anything that is a "Register" makes sense, but I disagree with the enum return value and the idea of combining read/write to deal with the arm coprocessor registers. A reg_read_arm_coproc (and write, and arm64 variants) were implemented in the dev branch, and there's no other "special" registers like this one to my knowledge.

amaanq avatar May 28 '25 07:05 amaanq

I think restricting the type of input to reg read/write functions to anything that is a "Register" makes sense, but I disagree with the enum return value and the idea of combining read/write to deal with the arm coprocessor registers. A reg_read_arm_coproc (and write, and arm64 variants) were implemented in the dev branch, and there's no other "special" registers like this one to my knowledge.

Oh cool, I even overlooked that =).

Thanks for your work and that makes sense indeed.

wtdcode avatar May 28 '25 07:05 wtdcode

Read them in specific functions imply that read_batch functionality is lost for such Register?

amaelFr avatar May 28 '25 07:05 amaelFr

Another question, about the dev branch, the instruction hook why to make them different from x86 and arm64 for example?

Couldn't it be harmonised under a common Type (Like for the registers)

amaelFr avatar May 28 '25 08:05 amaelFr