defects4j
Should javadoc/comment changes be part of the patch?
Hi @rjust and team,
I noticed that some d4j bugs do not reflect the changes that were made to javadoc comments as part of the bug fix.
Examples:
- Closure-149: The patch applied to src/com/google/javascript/jscomp/CompilerOptions.java adds comments: (i) at line 586, emphasizing the change at line 588, and (ii) a javadoc comment starting at line 957 for the newly added method at line 960.
  In d4j, these comments are kept as part of the buggy file, which is not accurate.
- Several cases similar to (ii) above occur in bugs Closure-156, Closure-165, Codec-14, Jsoup-3, Jsoup-56, Jsoup-92, Lang-46, Mockito-10, and Mockito-25.
Now I know that this might be controversial... given the criteria applied to minimize the d4j patches. That said, I'd argue that for these cases, the comment changes accompanied the code change (i.e. the fix is not a mere comment change) and indeed the code change and comment change are tightly coupled.
I'd also point out that these cases could (and actually do) mislead both:
- Test generation techniques that try to extract oracles from documentation (e.g. Toradocu for Randoop), and
- Static bug checkers that look for inconsistencies between Java methods and their javadoc comments (e.g. the InvalidParam and InvalidThrows patterns in Error Prone).
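As a rough illustration of the second point, here is a minimal Python sketch of an InvalidParam-style check. It is not Error Prone's implementation: the function name stale_param_tags, the regex-based parsing, and the sample javadoc and signatures are all hypothetical, and the parsing is deliberately naive. It only shows how a javadoc comment that ends up next to the wrong method produces a warning.

```python
import re


def stale_param_tags(javadoc: str, signature: str) -> list[str]:
    """Return the @param names in `javadoc` that `signature` does not declare.

    Hypothetical helper for illustration only; real checkers work on the AST.
    """
    # Names documented via "@param name"; type parameters such as "@param <T>"
    # are not matched because "<" is not a word character.
    documented = re.findall(r"@param\s+(\w+)", javadoc)

    # Naive parameter extraction: take the text between the first pair of
    # parentheses and keep the last token of each comma-separated declaration.
    match = re.search(r"\(([^)]*)\)", signature)
    declared = set()
    if match:
        for declaration in match.group(1).split(","):
            tokens = declaration.split()
            if tokens:
                declared.add(tokens[-1])

    return [name for name in documented if name not in declared]


# Hypothetical javadoc written for a method that the fix introduces.
JAVADOC = """/**
 * Sets the output charset.
 * @param charsetName name of the charset
 */"""

# In the fixed file the comment documents the matching method.
FIXED_SIGNATURE = "public void setCharset(String charsetName)"
# In a buggy file that keeps the comment but not the method, the comment
# ends up in front of whatever unrelated method follows.
BUGGY_SIGNATURE = "public void setEnabled(boolean enabled)"

print(stale_param_tags(JAVADOC, FIXED_SIGNATURE))
print(stale_param_tags(JAVADOC, BUGGY_SIGNATURE))
```

With the fixed signature the check reports nothing. With the buggy signature it flags charsetName, even though the project's real history never contained that inconsistency.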
Therefore, I believe we should have these kinds of comments as part of the bug-fixing patches. Let me know what you think.
Regards,
Thanks for the suggestion. I agree with you: it would be nice to include exactly the relevant comments in the bug-fixing patch. Thank you especially for pointing out some specific tools that use the documentation, showing that this is not merely a hypothetical problem!
Here are some issues with making such a change:
- This conflicts with the current documented method for creating the patches. It would require changing both the documentation and all the patches.
- Inclusion of the comments would be subjective, whereas the current criteria are objective. Do you see a way to make the guidelines objective?
The former is a large task. The current diffs represent literally person-years of work. Although we have since made the process semi-automated, it is still a lot of work. I could see semi-automating local comments, like the ones you noted.
The latter is a concern given Defects4J's high priority on consistency and correctness.
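To make the two concerns above concrete, here is a minimal Python sketch of how comment changes could be separated from code changes in a unified diff. It is not part of Defects4J's tooling. The function names, the prefix heuristic for spotting Java comment lines, and the sample diff are assumptions for illustration. The criterion in comments_accompany_code (a patch contains both comment and code changes) is only one possible objective rule, not an agreed guideline.

```python
# Heuristic: a changed line counts as a comment if, after stripping
# whitespace, it starts with one of these Java comment markers. This
# misclassifies rare cases such as an expression continued with a leading "*".
COMMENT_PREFIXES = ("//", "/*", "*")


def is_comment_line(text: str) -> bool:
    """Return True if `text` looks like a Java comment line (heuristic)."""
    return text.strip().startswith(COMMENT_PREFIXES)


def classify_changes(diff_text: str):
    """Split added/removed lines of a unified diff into (comments, code)."""
    comments, code = [], []
    for line in diff_text.splitlines():
        # Skip file headers such as "--- a/Foo.java" and "+++ b/Foo.java".
        if line.startswith(("+++ ", "--- ")):
            continue
        # Skip hunk headers and unchanged context lines.
        if not line.startswith(("+", "-")):
            continue
        body = line[1:]
        if not body.strip():
            continue  # ignore blank-line changes
        (comments if is_comment_line(body) else code).append(line)
    return comments, code


def comments_accompany_code(diff_text: str) -> bool:
    """One possible objective rule (an assumption): the patch changes both
    comments and code, so it is not a mere comment change."""
    comments, code = classify_changes(diff_text)
    return bool(comments) and bool(code)


# Hypothetical diff that adds a method together with its javadoc.
SAMPLE_DIFF = "\n".join([
    "--- a/Foo.java",
    "+++ b/Foo.java",
    "@@ -1,2 +1,7 @@",
    " class Foo {",
    "+  /**",
    "+   * Returns the answer.",
    "+   */",
    "+  int answer() { return 42; }",
    "+  // trailing note",
    " }",
])

comment_lines, code_lines = classify_changes(SAMPLE_DIFF)
print(len(comment_lines), len(code_lines))
```

A classification like this only flags candidates. Deciding whether a flagged comment is actually coupled to the code change would still need a rule, for example requiring that the comment sit directly above a changed declaration, or a manual review.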
Would you be willing to take on this task?
Thanks again.