
cannot reproduce the results, is there anything wrong?

Open QiuJYWX opened this issue 1 year ago • 5 comments

#!/bin/bash
# Evaluate llm4decompile-6.7b-v1.5 on decompile-eval with vLLM across 2 GPUs.
CUDA_VISIBLE_DEVICES=0,1 python ../evaluation/run_evaluation_llm4decompile_vllm.py \
  --model_path ../../LLM/llm4decompile-6.7b-v1.5 \
  --testset_path ../decompile-eval/decompile-eval.json \
  --gpus 2 \
  --max_total_tokens 2048 \
  --max_new_tokens 2000 \
  --repeat 1 \
  --num_workers 32 \
  --gpu_memory_utilization 0.82 \
  --temperature 0

Optimization O0: Compile Rate: 0.9268, Run Rate: 0.5488
Optimization O1: Compile Rate: 0.9268, Run Rate: 0.3598
Optimization O2: Compile Rate: 0.8902, Run Rate: 0.3537
Optimization O3: Compile Rate: 0.8902, Run Rate: 0.3171

QiuJYWX avatar May 16 '24 07:05 QiuJYWX
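
For context on these numbers: compile rate measures whether the model's decompiled C source re-compiles, and run rate whether the re-compiled program passes the benchmark's test assertions. A minimal sketch of such a harness follows; the helper names are hypothetical and this is not the repo's actual evaluation code.

# Sketch of compile-rate / run-rate checks for one decompiled sample.
# Hypothetical helpers; not the repo's run_evaluation_llm4decompile_vllm.py.
import os
import subprocess
import tempfile

def compiles(c_source: str) -> bool:
    """True if the decompiled C source re-compiles with gcc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "pred.c")
        with open(src, "w") as f:
            f.write(c_source)
        return subprocess.run(["gcc", src, "-o", os.path.join(tmp, "pred")],
                              capture_output=True).returncode == 0

def runs(c_source: str, test_main: str) -> bool:
    """True if the source plus the benchmark's assert-based test driver
    compiles and the resulting binary exits with status 0."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "pred.c")
        binary = os.path.join(tmp, "pred")
        with open(src, "w") as f:
            f.write(c_source + "\n" + test_main)  # append the test harness
        if subprocess.run(["gcc", src, "-o", binary],
                          capture_output=True).returncode != 0:
            return False
        try:
            return subprocess.run([binary], timeout=10).returncode == 0
        except subprocess.TimeoutExpired:
            return False

# Per optimization level, compile rate = mean(compiles), run rate = mean(runs).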

Thanks for testing the code. Please use decompile-eval-executable-gcc-obj.json. All the evaluations and models are based on executables, which differs from our previous setting (object files, not linked).

Updates

  • [2024-05-16]: Please use decompile-eval-executable-gcc-obj.json. The source codes are compiled into executable binaries and disassembled into assembly instructions.

albertan017 avatar May 16 '24 07:05 albertan017
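
The update above compresses the data-preparation pipeline into one sentence: each source is compiled into a linked executable and then disassembled. A minimal sketch of that step, assuming gcc and objdump are available; the file names and function-extraction detail are illustrative, not the repo's actual script.

# Compile a benchmark source at each optimization level, then disassemble.
# Paths are hypothetical; the real pipeline also extracts just the target
# function from the objdump output.
import subprocess

SRC = "sample.c"  # hypothetical benchmark source

for opt in ["O0", "O1", "O2", "O3"]:
    binary = f"sample_{opt}"
    # Build a fully linked executable (not `gcc -c`, which stops at an
    # unlinked object file -- the older setting this thread ran into).
    subprocess.run(["gcc", f"-{opt}", SRC, "-o", binary], check=True)
    asm = subprocess.run(["objdump", "-d", binary],
                         capture_output=True, text=True, check=True).stdout
    with open(f"sample_{opt}.asm", "w") as f:
        f.write(asm)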

Thx for the reply, will try again.

QiuJYWX avatar May 17 '24 02:05 QiuJYWX

Hi @albertan017 ,

Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should be able to achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.

QiuJYWX avatar Jun 21 '24 03:06 QiuJYWX

Hi @albertan017 ,

Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should be able to achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.

Yes, deepseek-coder-v2 demonstrates a strong ability to decompile binaries, achieving decompilation results comparable to those of GPT-4o (avg. 15% on HumanEval-Decompile). Our efforts are ongoing for llm4decompile-ref, which achieves much better results than direct decompilation. However, we are not working with the 236B version, as it is far beyond our budget.

albertan017 avatar Jun 21 '24 03:06 albertan017
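
The -ref approach mentioned here refines an existing decompiler's output (e.g. Ghidra pseudocode) instead of translating raw assembly directly. Below is a hedged sketch of driving such a checkpoint with transformers; the model id and prompt template are assumptions, so check the repo's README for the actual format.

# Sketch: refine Ghidra pseudocode with a llm4decompile-ref-style model.
# Model id and prompt template are assumptions, not the repo's exact usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "LLM4Binary/llm4decompile-6.7b-ref"  # hypothetical HF id
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16).cuda()

pseudocode = open("sample_O0_ghidra.c").read()  # assumed Ghidra output
prompt = ("# This is the pseudocode from a decompiler:\n" + pseudocode +
          "\n# What is the original source code?\n")  # assumed template

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=2000, do_sample=False)
refined = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True)
print(refined)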

Hi @albertan017 ,

Thanks for the new release. Since DeepSeek has released the more powerful DeepSeek-Coder-V2 (16B and 236B), it should be able to achieve better decompilation results. If you have enough GPU budget, you could use DeepSeek-Coder-V2 as the base model to further improve LLM4Decompile.

Can you tell me whether deepseek-coder-v2 was evaluated across the different compilation optimization levels (O0–O3)? Handling those levels should be the shining point of this work. Thanks.

Cheliosoops avatar Jul 05 '24 10:07 Cheliosoops