Cannot reproduce the results; is there anything wrong?
#!/bin/bash
CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0
| Optimization | Compile Rate | Run Rate |
| --- | --- | --- |
| O0 | 0.9268 | 0.5488 |
| O1 | 0.9268 | 0.3598 |
| O2 | 0.8902 | 0.3537 |
| O3 | 0.8902 | 0.3171 |
Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables, which differs from our previous setting (object files, not linked).
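For example, the same invocation as above with only the testset path swapped (assuming the new JSON sits in the same decompile-eval directory; adjust to your layout):

CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval-executable-gcc-obj.json \
    --gpus 2 --max_total_tokens 2048 --max_new_tokens 2000 \
    --repeat 1 --num_workers 32 --gpu_memory_utilization 0.82 --temperature 0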
Updates
- [2024-05-16]: Please use decompile-eval-executable-gcc-obj.json. The source code is compiled into executable binaries and disassembled into assembly instructions.
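For reference, a minimal sketch of that pipeline (file names and exact flags are illustrative, not the repository's build script):

# current setting: compile and link into an executable, then disassemble it
gcc -O0 sample.c -o sample
objdump -d --no-show-raw-insn sample   # assembly instructions fed to the model
# previous setting compiled an unlinked object file instead
gcc -O0 -c sample.c -o sample.o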
Thanks for the reply, I will try again.
Hi @albertan017,
Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.
Yes, deepseek-coder-v2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg. 15% on HumanEval-Decompile). Our efforts are ongoing for llm4decompile-ref, which achieves much better results than direct decompilation. As for the 236B version, we are not working with it; it is far beyond our budget.
Can you tell me whether deepseek-coder-v2 was evaluated across the different compilation optimization levels? That should be the shining point of this work. Thanks.