smartbugs-curated
Excluding duplicated contracts in the SB curated dataset
The SB curated dataset contains duplicated contracts that are not meaningfully different.
For meaningful comparisons among tools, I think those duplicated contracts need to be excluded from the SB curated dataset.
Of course, I understand that deployed smart contracts can be very similar. However, I think contracts with only minor syntactic differences should be deduplicated.
For example, in the access_control category, the only difference I found among the three contracts below is the name of the vulnerable function:
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name1.sol
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name2.sol
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name3.sol
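One way to make "only minor syntactic differences" checkable is to fingerprint each contract after stripping comments and renaming identifiers consistently. The sketch below is only an illustration of that idea, not something SmartBugs implements: `normalize`, `fingerprint`, the keyword list, and the two sample snippets are all hypothetical, and the snippets are made up rather than copied from the files linked above. It also treats contracts that differ only in variable or contract names as duplicates, which may or may not be the desired criterion.

```python
import hashlib
import re

# Illustrative, incomplete list of Solidity keywords and built-ins that must
# NOT be renamed (an assumption for this sketch, not an official list).
KEYWORDS = frozenset({
    "pragma", "solidity", "contract", "library", "interface", "function",
    "constructor", "modifier", "event", "struct", "enum", "mapping",
    "address", "bool", "string", "bytes", "uint", "uint256", "int",
    "public", "private", "internal", "external", "payable", "view", "pure",
    "constant", "returns", "return", "if", "else", "for", "while",
    "require", "assert", "revert", "msg", "sender", "value", "this", "now",
})

IDENT = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
# String literals, identifiers, hex numbers, decimal numbers, then any symbol.
TOKEN = re.compile(
    r'"[^"\n]*"|\'[^\'\n]*\'|[A-Za-z_$][A-Za-z0-9_$]*'
    r"|0[xX][0-9a-fA-F]+|\d+(?:\.\d+)?|\S"
)


def normalize(source: str) -> str:
    """Strip comments and rename identifiers in order of first appearance."""
    # Naive comment removal: a "//" inside a string literal would be cut too.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", " ", source)
    names = {}
    tokens = []
    for tok in TOKEN.findall(source):
        if IDENT.fullmatch(tok) and tok not in KEYWORDS:
            # The same original name always maps to the same placeholder.
            tok = names.setdefault(tok, f"ID{len(names)}")
        tokens.append(tok)
    return " ".join(tokens)


def fingerprint(source: str) -> str:
    """Hash of the normalized source; equal hashes suggest a duplicate."""
    return hashlib.sha256(normalize(source).encode("utf-8")).hexdigest()


# Made-up snippets that differ only in the name of one function.
sample_a = """
contract Missing {
    address private owner;
    // intended as a constructor, but the name does not match
    function IamMissing() public { owner = msg.sender; }
}
"""
sample_b = """
contract Missing {
    address private owner;
    function missing() public { owner = msg.sender; }
}
"""
# Same shape, but a different visibility keyword: not a duplicate.
sample_c = sample_b.replace("public", "internal")

is_duplicate = fingerprint(sample_a) == fingerprint(sample_b)
is_distinct = fingerprint(sample_b) != fingerprint(sample_c)
```

Grouping the dataset files by `fingerprint` would then list candidate duplicates for manual review. A real criterion would probably work on the compiler's AST instead of regex tokens.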
Thanks for opening this issue, @SunBeomSo.
It's a good point. We included them as three separate contracts because we tried to be faithful to what we found in our sources. We might have to revise this in a future release of SmartBugs, but we'll need to use well-defined criteria.
Have you found any other similar/duplicated contracts?
@jff Thanks for the answer.
I will report here when I find more similar/duplicated ones.