Simplify check out of non-minimized buggy version

Open rjust opened this issue 5 years ago • 7 comments

Currently, Defects4J's version ids have the following format: <id>(f|b), where b refers to the minimized buggy and f to the fixed version.

We should change the version id format to: <id>(f|b|b-min|b-orig)

  • b-min is minimized buggy
  • b-orig is non-minimized buggy
  • b is an alias for b-min (current behavior)

To be consistent, we could add the same suffixes for the (f)ixed version, although f-min would not be supported at this time.
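
To make the proposed format concrete, here is a minimal sketch of how such version ids could be parsed and normalized. It is an illustration only, not Defects4J's implementation: the names `parse_version_id`, `VERSION_ID`, and `ALIASES` are hypothetical, the bug id is assumed to be a positive integer, and suffixes for the fixed version are left out because they are not settled above.

```python
import re
from typing import Tuple

# Hypothetical sketch of the proposed format <id>(f|b|b-min|b-orig).
# Assumption: <id> is a decimal integer.
VERSION_ID = re.compile(r"(?P<bug_id>\d+)(?P<kind>f|b|b-min|b-orig)")

# "b" remains an alias for "b-min", which preserves the current behavior.
ALIASES = {"b": "b-min"}


def parse_version_id(version_id: str) -> Tuple[int, str]:
    """Return (bug id, normalized version kind) or raise ValueError."""
    match = VERSION_ID.fullmatch(version_id)
    if match is None:
        raise ValueError(f"invalid version id: {version_id!r}")
    kind = match.group("kind")
    return int(match.group("bug_id")), ALIASES.get(kind, kind)
```

Under these assumptions, `parse_version_id("1b")` and `parse_version_id("1b-min")` yield the same result, while `parse_version_id("1b-orig")` selects the non-minimized buggy version and `"1f-min"` is rejected.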

rjust avatar Feb 18 '20 15:02 rjust

@mernst, @Greg4cr, @jose, any thoughts?

rjust avatar Feb 18 '20 15:02 rjust

This is consistent with what I said in the email thread. I'm in favor of this change.

@jose, you implemented some form of this in the "garbage-in, garbage-out" paper, right?

Greg4cr avatar Feb 18 '20 16:02 Greg4cr

This sounds great. Thanks for the suggestion.

mernst avatar Feb 18 '20 16:02 mernst

Agree with this. @rjust, do you want to make it obvious? I wonder if we could just define it as f|b|bm|bo|? And, instead of min and orig, should we introduce the concept of pure (i.e., minimal) vs. impure (i.e., non-minimal) and define it as f|b|b-pure|b-impure| or just f|b|bp|bi|?

> @jose, you implemented some form of this in the "garbage-in, garbage-out" paper, right?

Yep. Once we all agree on the suffixes to use I would be happy to implement it.

jose avatar Feb 19 '20 18:02 jose

I am inclined to be more verbose to avoid confusion and mistakes.

I don't feel strongly about b-min vs. b-pure. I'd like to keep b-orig, though, to indicate that this is the original buggy version without any modifications to the source code.

Since we are already making b effectively an alias for backward-compatibility, adding bo as an alias for b-orig (and bm as an alias for b-min for consistency) might be acceptable?

rjust avatar Feb 20 '20 14:02 rjust

I'm fine with shorter aliases, but we should definitely have the longer forms for clarity.

After some more thought, I'd propose b-orig and b-min. The minimized form is always pure, but the original isn't always impure, so the pure vs impure terminology may not be universally accurate.

Greg4cr avatar Feb 20 '20 14:02 Greg4cr

I agree that "min" (for "minimized") is clearer than "pure" and I prefer it. Greg's point that not all orig are impure is an even better argument.

I would be slightly inclined to omit the short aliases bo and bm, just because shorter versions are more obscure, it's easier to make typos, it's easier to overlook typos, and the documentation has to be longer to explain them. I don't feel extremely strongly about this, though.

mernst avatar Feb 20 '20 15:02 mernst