Cannot reproduce the results; is there anything wrong?
#!/bin/bash
CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval.json \
    --gpus 2 \
    --max_total_tokens 2048 \
    --max_new_tokens 2000 \
    --repeat 1 \
    --num_workers 32 \
    --gpu_memory_utilization 0.82 \
    --temperature 0
| Optimization | Compile Rate | Run Rate |
| --- | --- | --- |
| O0 | 0.9268 | 0.5488 |
| O1 | 0.9268 | 0.3598 |
| O2 | 0.8902 | 0.3537 |
| O3 | 0.8902 | 0.3171 |
Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables, which differs from our previous setting (object files, not linked).
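For example, the same invocation as above with only the testset path swapped (assuming the new JSON sits in the same decompile-eval directory; adjust to your layout):

CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
    --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
    --testset_path ../decompile-eval/decompile-eval-executable-gcc-obj.json \
    --gpus 2 --max_total_tokens 2048 --max_new_tokens 2000 \
    --repeat 1 --num_workers 32 --gpu_memory_utilization 0.82 --temperature 0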
Updates
- [2024-05-16]: Please use decompile-eval-executable-gcc-obj.json. The source code is compiled into executable binaries and disassembled into assembly instructions.
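For reference, a minimal sketch of that pipeline (file names and exact flags are illustrative, not the repository's build script):

# current setting: compile and link into an executable, then disassemble it
gcc -O0 sample.c -o sample
objdump -d --no-show-raw-insn sample   # assembly instructions fed to the model
# previous setting compiled an unlinked object file instead
gcc -O0 -c sample.c -o sample.o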
Thanks for the reply, I will try again.
Hi @albertan017,
Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.
Yes, deepseek-coder-v2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg. 15% on HumanEval-Decompile). Our efforts are ongoing for llm4decompile-ref, which achieves much better results than direct decompilation. As for the 236B version, we are not working with it; it is far beyond our budget.
Can you tell me whether deepseek-coder-v2 was evaluated across the different compilation optimization levels? That should be the shining point of this work. Thanks.