LLM4Decompile
I wonder if you could share some experience on collecting datasets
I'm trying to fine-tune it with PEFT. I have found some datasets, but they are either too small or depend on too many headers that have to be installed first, and the install commands for those headers differ greatly. So I wonder if you have any advice on how to find suitable datasets like AnghaBench. Thank you so much!
We've only found AnghaBench and ExeBench, which cover nearly all available C libraries. If you have specific requirements, you might need to manually compile larger projects such as Linux. While it's time-consuming, this approach can help improve the model further, and that's what we're doing now.
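For the header problem mentioned in the question, one option is to keep only the source files that already compile on your machine and drop the rest. The following is a minimal sketch of that idea, not the pipeline used for LLM4Decompile. It assumes `gcc` is on the PATH, and the helper names (`compile_command`, `compiles`, `filter_compilable`) are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def compile_command(src, out, opt="-O0", compiler="gcc"):
    """Build the argument list that compiles one C file to an object file."""
    return [compiler, "-c", opt, str(src), "-o", str(out)]


def compiles(src, opt="-O0", compiler="gcc", run=subprocess.run, timeout=30):
    """Return True if `src` compiles on its own, e.g. without missing headers.

    `run` is injectable so the function can be exercised without a compiler.
    """
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / (Path(src).stem + ".o")
        try:
            result = run(
                compile_command(src, out, opt, compiler),
                capture_output=True,
                timeout=timeout,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Compiler not found, or the file took too long to compile.
            return False
        return result.returncode == 0


def filter_compilable(root, **kwargs):
    """Walk `root` recursively and keep the .c files that compile standalone."""
    return [p for p in sorted(Path(root).rglob("*.c")) if compiles(p, **kwargs)]
```

Calling `filter_compilable("path/to/project")` returns the files that survived, and passing `opt="-O2"` repeats the check at another optimization level. Files that fail only because of a missing header are skipped rather than fixed, so the result is smaller than the original project but needs no extra install steps.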