BigCloneEval
How to evaluate a parse-tree-based tool on IJaDataset
To detect clones, we convert Java code to parse trees and then compute the similarity of two parse trees to decide whether they are clones. BigCloneBench gives an error; could you advise how we can convert IJaDataset to parse trees? We convert Java code to a parse tree using an ANTLR grammar, and it needs a main function in the Java code to build the parse tree. (IJaDataset contains Java files without a main function.)
Kindly suggest how to proceed with evaluating our work on BigCloneBench.
IJaDataset contains a collection of source files scraped from open-source online sources (original work: https://sites.google.com/site/asegsecold/projects/seclone). I am not sure if it is possible to reconstruct the original software systems and locate a main function for each of these. If this is a requirement of your tool, it may be challenging.
Is it actually necessary to start at a main function? I would think that Java files should be individually parseable, but I don't have experience with ANTLR for creating abstract syntax trees.
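To illustrate the point about individual parseability, here is a minimal sketch that parses a single Java source with no main function. It does not use ANTLR; it swaps in the parser that ships with the JDK (`javax.tools` plus the `com.sun.source` tree API), so it assumes a full JDK rather than a JRE. The class name `ParseWithoutMain`, the helper `methodNames`, and the sample `Util` source are hypothetical names made up for this example.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical example class; not part of BigCloneEval or IJaDataset.
class ParseWithoutMain {

    // Wraps an in-memory string so the compiler can read it as a source file.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses one source file (syntax only, no type resolution) and
    // returns the names of the methods found in its syntax tree.
    static List<String> methodNames(String fileName, String source) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; a JDK is required");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                List.of(new StringSource(fileName, source)));

        List<String> names = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitMethod(MethodTree node, Void unused) {
                    names.add(node.getName().toString());
                    return super.visitMethod(node, unused);
                }
            }.scan(unit, null);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // A source file with no main function, as in IJaDataset.
        String source = "class Util { int add(int a, int b) { return a + b; } }";
        System.out.println(methodNames("Util.java", source));
    }
}
```

Because only the parse step runs, the file does not need its dependencies or an entry point. Whether the same holds for a given ANTLR grammar depends on which start rule is invoked, which I can't confirm for your setup.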
I used ANTLR for my master's thesis and it worked just fine. You can have a look at my grammar if you're interested. I think I borrowed it from over here, though I'm not sure. Also, I stripped down the grammar for performance, so you won't get a complete parse tree from it.