Expose yr dump via language bindings

Open schrodyn opened this issue 1 year ago • 7 comments

Would like to see the same information provided by yr dump exposed via language bindings so I can work with the output in Golang.

schrodyn avatar Jan 02 '25 19:01 schrodyn

I'm actually taking a look at this one now and realized you can do this (I've only checked the Python bindings) by using ScanResults.module_outputs. It does require that you actually compile a rule that uses the module and perform a scan with it, which doesn't really line up with the dump command, but I think it makes sense to have a Module class that can expose this.

The rough python for it would be something like:

class Module:
    def __init__(self, module: str) -> None:
        self.module = module

    def parse(self, data: bytes) -> dict:
        r"""Parse data with module, returning results in a dictionary."""

You could use it with something like this:

yara_x.Module('pe').parse(data)
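
Until something like that exists, the same interface can be approximated on top of the workaround described above. The following is only a minimal sketch: the Module wrapper and the build_dump_rule helper are hypothetical names and not part of yara_x, and it assumes that importing a module in a dummy rule is enough for its output to appear in ScanResults.module_outputs, keyed by module name.

```python
from typing import Any, Dict


def build_dump_rule(module: str) -> str:
    """Return source for a dummy rule that only imports `module`."""
    return f'import "{module}"\nrule dump {{ condition: true }}\n'


class Module:
    """Hypothetical wrapper; not part of the yara_x package."""

    def __init__(self, module: str) -> None:
        self.module = module

    def parse(self, data: bytes) -> Dict[str, Any]:
        """Scan `data` with a dummy rule and return the module's output."""
        import yara_x  # imported lazily so build_dump_rule works without it

        rules = yara_x.compile(build_dump_rule(self.module))
        results = rules.scan(data)
        # Assumption: module_outputs is a dict keyed by module name.
        return results.module_outputs.get(self.module, {})
```

With this in place, Module('pe').parse(data) would compile the dummy rule, scan data once, and hand back whatever the pe module produced, or an empty dictionary if it produced nothing.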

I haven't looked into what the other language bindings offer or might look like if there is a decision to expose the functionality via this or a similar method.

wxsBSD avatar Apr 06 '25 17:04 wxsBSD

I'll give that a try. I was trying to access it via Go so I'll see if it's available there too. Compiling a rule will just match my original yara-og workflow where I use this rule:

import "pe"
import "elf"

rule dump {
  condition: 1
}

schrodyn avatar Apr 06 '25 21:04 schrodyn

I think having a clear, well defined interface instead of requiring this unintuitive way is ideal. It isn't hard to get a prototype together and see what Victor thinks. I'll get something up for discussion soon.

wxsBSD avatar Apr 06 '25 22:04 wxsBSD

I think we should introduce some Python and Golang API that returns the output of a module without having to use a dummy rule, similar to https://docs.rs/yara-x/0.14.0/yara_x/mods/fn.invoke.html and https://docs.rs/yara-x/0.14.0/yara_x/mods/fn.invoke_all.html. In the case of Python the output of this API should probably be a dictionary (the dictionary resulting from converting the module's output protobuf to JSON).

In the case of Golang it's a bit more complicated. First, we need to implement the C API, which should return the module's output as a JSON-formatted string. And then implement the Golang API on top of that.
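
To illustrate the layering, here is a rough sketch of the step a binding would perform once a C API hands back a JSON-formatted string. It is written in Python to match the earlier snippet; a Go binding would do the equivalent with its own JSON decoder. The decode_module_output helper is a hypothetical name, and the sample payload is made up for illustration, not real module output.

```python
import json
from typing import Any, Dict


def decode_module_output(raw: str) -> Dict[str, Any]:
    """Decode a JSON-formatted module output into a dictionary."""
    decoded = json.loads(raw)
    if not isinstance(decoded, dict):
        raise ValueError("module output must be a JSON object")
    return decoded


# Illustrative payload only; real module output has many more fields.
sample = '{"is_pe": true, "number_of_sections": 3}'
output = decode_module_output(sample)
print(output["number_of_sections"])
```

Returning a plain JSON string from the C layer keeps that API small, and each language binding can then decide how to represent the result (a dict in Python, a map or struct in Go).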

plusvic avatar Apr 08 '25 09:04 plusvic

I know you mentioned go bindings specifically but that's going to require a bit more work to do. From a quick glance the go bindings are wrappers around the C bindings which do not expose the module invoke API, so I'd have to build it into the C bindings and then the go bindings.

Can the python bindings that we now have be good enough or do you really need this in go?

wxsBSD avatar May 20 '25 19:05 wxsBSD

Not really, having this in the Golang API is desirable for the sake of API feature parity, but it is not really a requirement.

plusvic avatar May 20 '25 19:05 plusvic

I can accept just python for now. Maybe keep Go on the todo scroll for the future. Since the Go API is considered a first class citizen for yara-x I'd love to see this included. Thanks.

schrodyn avatar May 20 '25 20:05 schrodyn