`compiler.add_source()` panics with "index out of bounds" in cranelift-entity

Open gowtham-cyber-max opened this issue 6 months ago • 4 comments

I'm experiencing a panic when calling compiler.add_source() in the YARA-X Rust bindings. The error occurs in the Cranelift compilation backend, not in my application code.

The panic message is:

thread '' panicked at C:\Users\[username]\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\cranelift-entity-0.121.2\src\list.rs:577:26:
index out of bounds: the len is 0 but the index is 4
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

This happens when I try to compile a 200 MB .yar file. It may be related to a memory issue, as the process consumes around 7 GB of RAM.

gowtham-cyber-max, Sep 09 '25 13:09

Can you provide more information about that .yar file? For instance, how many rules does the file contain? How large is the largest condition?

plusvic, Sep 09 '25 19:09

gowtham_yara_x_test.txt

File Statistics:

Largest single rule: 1,498,156 characters

Total rules: 87,283 rules
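For anyone who wants to gather comparable figures for their own ruleset, here is a minimal sketch using only the Rust standard library. It is not part of YARA-X: `RuleStats`, `is_rule_start`, and `rule_stats` are hypothetical names, and the detection of rule boundaries is a rough line-based heuristic (a line whose first token, after optional `private`/`global` modifiers, is `rule`). It does not parse YARA syntax, so comments or multi-line strings that happen to start with `rule` would be miscounted.

```rust
/// Rough statistics for a YARA source text (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct RuleStats {
    total_rules: usize,
    largest_rule_chars: usize,
}

/// Heuristic: a rule starts on a line whose first token, after skipping
/// the `private` and `global` modifiers, is exactly `rule`.
fn is_rule_start(line: &str) -> bool {
    line.split_whitespace()
        .find(|tok| *tok != "private" && *tok != "global")
        == Some("rule")
}

/// Counts rules and measures the largest one in characters.
/// A rule is assumed to run until the next rule start or the end of input.
fn rule_stats(source: &str) -> RuleStats {
    let mut stats = RuleStats { total_rules: 0, largest_rule_chars: 0 };
    let mut current = 0usize; // characters seen in the rule being scanned
    let mut in_rule = false;

    for line in source.lines() {
        if is_rule_start(line) {
            if in_rule {
                stats.largest_rule_chars = stats.largest_rule_chars.max(current);
            }
            stats.total_rules += 1;
            current = 0;
            in_rule = true;
        }
        if in_rule {
            current += line.chars().count() + 1; // +1 for the line break
        }
    }
    if in_rule {
        stats.largest_rule_chars = stats.largest_rule_chars.max(current);
    }
    stats
}

fn main() {
    let source = "rule a { condition: true }\nprivate rule b {\n  condition:\n    false\n}\n";
    let stats = rule_stats(source);
    println!("Largest single rule: {} characters", stats.largest_rule_chars);
    println!("Total rules: {}", stats.total_rules);
}
```

For a real file, the string literal in `main` could be replaced with the result of `std::fs::read_to_string`. Note that this measures whole rules, not just their conditions.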

Updated Error Details: A single rule now produces this panic:

thread '' panicked at User\cranelift-codegen-0.121.2\src\ir\instructions.rs:228:9: assertion failed: payload < (1 << 30)

Clarification: This is a stress test to evaluate YARA-X limits - we intentionally created a rule with numerous conditions to test the engine. Please don't consider this a typical production scenario.

Memory Issue: YARA-X consumes ~7GB RAM before panic, while standard YARA handles the same file successfully. Even with relax mode enabled, the memory consumption remains problematic and incompatible.

The excessive memory consumption suggests YARA-X has significantly higher memory overhead when processing large rulesets compared to standard YARA's more memory-efficient approaches.

Question: Why does YARA handle this successfully while YARA-X fails at the compilation stage with such dramatically different memory requirements?

Note: I've attached one sample rule so you can reproduce this on your system, but I removed the metadata for privacy. Please populate it with your own standard metadata fields (author, date, description, reference).

gowtham-cyber-max, Sep 10 '25 11:09

The issue here is probably related to the size of the condition's code. YARA-X compiles the condition expression to WebAssembly code, which is later converted to native code by Cranelift.

> Question: Why does YARA handle this successfully while YARA-X fails at the compilation stage with such dramatically different memory requirements?

YARA compiles conditions to its own instruction set, which is emulated by its virtual machine. It doesn't try to compile the conditions to native code; that approach is simpler but slower. YARA-X, on the other hand, is more complex in that regard, but the code it produces is faster.
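To make the "own instruction set emulated by a virtual machine" idea concrete, here is a toy sketch in Rust. It is not YARA's actual instruction set or VM: `Op`, `Expr`, `compile`, and `run` are hypothetical names chosen for illustration. It shows a condition tree being flattened into postfix bytecode and then interpreted one instruction at a time on a stack.

```rust
/// Toy instruction set for boolean conditions (hypothetical, not YARA's).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    PushMatch(usize), // push whether pattern #n matched
    And,
    Or,
    Not,
}

/// Toy condition syntax tree.
enum Expr {
    Match(usize),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// "Compiles" the tree to postfix bytecode: operands first, then the operator.
fn compile(expr: &Expr, out: &mut Vec<Op>) {
    match expr {
        Expr::Match(i) => out.push(Op::PushMatch(*i)),
        Expr::And(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::And);
        }
        Expr::Or(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::Or);
        }
        Expr::Not(a) => {
            compile(a, out);
            out.push(Op::Not);
        }
    }
}

/// Interprets the bytecode on a stack machine.
/// Returns None for malformed bytecode or an out-of-range pattern index.
fn run(code: &[Op], matches: &[bool]) -> Option<bool> {
    let mut stack: Vec<bool> = Vec::new();
    for op in code {
        match *op {
            Op::PushMatch(i) => stack.push(*matches.get(i)?),
            Op::And => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a && b);
            }
            Op::Or => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a || b);
            }
            Op::Not => {
                let a = stack.pop()?;
                stack.push(!a);
            }
        }
    }
    if stack.len() == 1 { stack.pop() } else { None }
}

fn main() {
    // Roughly: ($a and not $b) or $c
    let cond = Expr::Or(
        Box::new(Expr::And(
            Box::new(Expr::Match(0)),
            Box::new(Expr::Not(Box::new(Expr::Match(1)))),
        )),
        Box::new(Expr::Match(2)),
    );
    let mut code = Vec::new();
    compile(&cond, &mut code);
    println!("bytecode: {:?}", code);
    println!("result: {:?}", run(&code, &[true, false, false]));
}
```

In this style the bytecode is a flat list that the interpreter walks, so a huge condition simply becomes a longer list. A native-code pipeline, as described above for YARA-X, has to push the same condition through a code generator, and that is where function-size limits can come into play.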

plusvic, Sep 10 '25 13:09

I've confirmed that the cause of this issue is the large size of the WASM function produced by such a large condition. The panic looks like a bug in wasmtime: if I disable code optimizations with `cranelift_opt_level(wasmtime::OptLevel::None)`, wasmtime no longer panics and instead produces an error like `Compilation error: Code for function is too large`.

https://github.com/bytecodealliance/wasmtime/issues/11682

plusvic, Sep 11 '25 08:09