
Where is the decompile-eval-executable-gcc-ghidra.json?

Open Sen-Ran opened this issue 1 month ago • 1 comments

Thank you for your fantastic work! We noticed that your Colab scripts reference the file /LLM4Decompile/decompile-eval/decompile-eval-executable-gcc-ghidra.json, but we could not find it in the project. Could you provide it?

Sen-Ran avatar Nov 27 '25 01:11 Sen-Ran

I found the file in LLM4Decompile/legacy-test/! By the way, in the LLM4Decompile-Ref part of the Colab script, the prompt is before = "# This is the assembly code:\n" and after = "\n# What is the source code?\n". Why isn't the prompt "# This is the pseudocode:\n", since LLM4Decompile-Ref takes Ghidra's pseudocode as input?

Sen-Ran avatar Nov 27 '25 02:11 Sen-Ran

Thank you for the question. We marked this as a legacy test because we have integrated the dataset into decompile-bench (https://huggingface.co/datasets/LLM4Binary/decompile-eval/tree/main), which now includes the HumanEval, MBPP, and Github2025 datasets.

Regarding the prompt, that wording is a legacy artifact. However, we used this exact prompt structure during model training, so the specific wording matters less than keeping the prompt consistent between training and inference.
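For illustration, here is a minimal sketch of how the prompt quoted above might be assembled. The two wrapper strings are copied from this thread; the helper name build_prompt and the step of stripping the input before concatenation are assumptions, not necessarily what the Colab script does.

```python
# Wrapper strings quoted in this thread (taken from the Colab script).
BEFORE = "# This is the assembly code:\n"
AFTER = "\n# What is the source code?\n"


def build_prompt(pseudocode: str) -> str:
    """Hypothetical helper: wrap Ghidra pseudocode in the training-time prompt.

    Assumption: the input is stripped of surrounding whitespace and placed
    between the two wrapper strings unchanged.
    """
    return BEFORE + pseudocode.strip() + AFTER


if __name__ == "__main__":
    # Made-up pseudocode snippet, used only to show the resulting layout.
    example = "int func0(int a) { return a + 1; }"
    print(build_prompt(example))
```

The point is that the wrapper text stays identical to what the model saw during training, even though the input is pseudocode rather than assembly.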

albertan017 avatar Nov 28 '25 01:11 albertan017