excluding duplicated contracts in SB curated dataset

Open sunbeomso opened this issue 4 years ago • 2 comments

The SB curated dataset contains duplicated contracts that are not meaningfully different.

For meaningful comparisons among tools, I think those duplicated contracts need to be excluded from the SB curated dataset.

Of course, I understand that deployed smart contracts can be very similar. However, I think contracts with only minor syntactic differences need to be deduplicated.

For example, in the access_control category, the only difference I found among the three contracts below is the name of the vulnerable function:
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name1.sol
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name2.sol
https://github.com/smartbugs/smartbugs/blob/master/dataset/access_control/incorrect_constructor_name3.sol
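
One possible way to flag contracts that differ only in identifier names is to compare fingerprints of the source after comments are stripped and identifiers are renamed consistently. The following is a minimal sketch of that idea, not part of SmartBugs: the `RESERVED` word list is a small, incomplete assumption, the helper names `strip_comments` and `fingerprint` are hypothetical, and the regex-based comment stripping would mishandle `//` inside string literals.

```python
import hashlib
import re

# Small, incomplete set of Solidity keywords and built-ins that should NOT be
# renamed. This list is an assumption for illustration only.
RESERVED = {
    "pragma", "solidity", "contract", "function", "modifier", "event",
    "public", "private", "internal", "external", "payable", "view", "pure",
    "constant", "returns", "return", "if", "else", "for", "while",
    "require", "assert", "revert", "address", "uint", "uint256", "bool",
    "mapping", "msg", "sender", "value", "tx", "origin", "this", "balance",
    "transfer", "send", "call",
}

# Identifiers, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*|\d+|\S")


def strip_comments(source):
    """Remove /* ... */ and // ... comments (naive, regex-based)."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", source)


def fingerprint(source):
    """Hash the token stream with user-defined names replaced by id0, id1, ..."""
    names = {}
    tokens = []
    for tok in TOKEN_RE.findall(strip_comments(source)):
        if re.match(r"[A-Za-z_$]", tok) and tok not in RESERVED:
            # The same original name always maps to the same placeholder.
            tok = names.setdefault(tok, f"id{len(names)}")
        tokens.append(tok)
    return hashlib.sha256(" ".join(tokens).encode()).hexdigest()
```

Two files with equal fingerprints would be candidates for manual review rather than automatic removal. Because the renaming is consistent, a function whose name matches the contract name still yields a different fingerprint from one whose name does not, so a correctly named constructor is not confused with a misnamed one.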

sunbeomso, Jul 15 '20 06:07

Thanks for opening this issue, @SunBeomSo.

It's a good point. We have included them as three separate contracts because we tried to be faithful to what we found in our sources. We might have to revise this in a future release of Smartbugs, but we'll need to use well-defined criteria.

Have you found any other similar/duplicated contracts?

jff, Jul 24 '20 15:07

@jff Thanks for the answer.

I will report here when I find more similar or duplicated ones.

sunbeomso, Jul 27 '20 12:07