albertan017

Results 61 comments of albertan017

Yes, we are working on it. It could take a few months to fully develop the model. Specifically, we plan to integrate support for structures and classes, as commonly used...

We expect to release in around three months; we're trying our best! Since there isn't a C++ version of Exebench available, we need to compile the projects ourselves,...

Thank you for your interest in our project! Using deepseek-distill checkpoints: Absolutely, we plan to employ these checkpoints as starting points for fine-tuning our decompile model. The models we currently...

Yes, larger models (if properly trained) are always better. In our experience, if you only have a small amount of data, chat/instruct models work better than base models.

Yes, you can use Ghidra or IDA's disassembly output directly: they generate jump labels and can even recover strings and variable values. We use objdump simply because it's much simpler and around...

We use all the training samples:

```
train_synth_compilable
train_real_compilable
train_synth_simple_io
train_real_simple_io
train_synth_rich_io
```

And test on its test set:

```
test_synth
```

Good luck with your project!
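For illustration, a minimal sketch of pulling those splits together, assuming the Hugging Face copy of Exebench at `jordiae/exebench` (the split names are exactly those listed above; the loader function is only a sketch and needs network access):

```python
# Exebench splits used for training and testing, per the comment above.
TRAIN_SPLITS = [
    "train_synth_compilable",
    "train_real_compilable",
    "train_synth_simple_io",
    "train_real_simple_io",
    "train_synth_rich_io",
]
TEST_SPLIT = "test_synth"

def load_training_set():
    """Load and concatenate all training splits (requires network access)."""
    from datasets import load_dataset, concatenate_datasets
    parts = [load_dataset("jordiae/exebench", split=s) for s in TRAIN_SPLITS]
    return concatenate_datasets(parts)
```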

Unfortunately, you will have to compile the dataset on your own as we do not have the authorization to distribute another's dataset. For more details on the issues we faced,...

You can find it [here](https://github.com/jordiae/exebench).

In examples/basic.py, you can see:

```python
synth_wrapper = Wrapper(
    c_deps=row['synth_deps'] + '\n' + row['synth_io_pairs']['dummy_funcs'][0] + '\n',
    func_c_signature=row['func_head_types'].replace('extern', ''),
    func_assembly=row['asm']['code'][0],
    cpp_wrapper=row['synth_exe_wrapper'],
)
```

It requires the func_assembly. So we remove the func_assembly,...
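To make the kwargs reusable, one could factor them into a helper that lets you swap the ground-truth assembly for something else (e.g. assembly recompiled from a model's decompiled candidate). This is only a sketch: the `row` dict below is a hypothetical stand-in for one Exebench record, and `Wrapper` itself is the exebench helper, not shown here.

```python
# Hypothetical stand-in for one Exebench record, mimicking the fields
# accessed in examples/basic.py (values are placeholders).
row = {
    'synth_deps': '#include <stdio.h>',
    'synth_io_pairs': {'dummy_funcs': ['/* dummy */']},
    'func_head_types': 'extern int add(int a, int b)',
    'asm': {'code': ['<ground-truth asm>']},
    'synth_exe_wrapper': '/* wrapper */',
}

def wrapper_kwargs(row, func_assembly=None):
    """Build the kwargs for exebench's Wrapper; func_assembly may be
    overridden instead of taking the ground-truth assembly from the row."""
    return dict(
        c_deps=row['synth_deps'] + '\n' + row['synth_io_pairs']['dummy_funcs'][0] + '\n',
        func_c_signature=row['func_head_types'].replace('extern', ''),
        func_assembly=func_assembly or row['asm']['code'][0],
        cpp_wrapper=row['synth_exe_wrapper'],
    )
```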