defects4j icon indicating copy to clipboard operation
defects4j copied to clipboard

Suggest minimizing the patch for Math-65

Open tolskais opened this issue 5 years ago • 4 comments

Hi,

Because defects4j aims to prune out unnecessary changes for each bug, I think the patch for Math-65 should be minimized by excluding the changes in getRMS(). I believe that the testcase triggered in the buggy commit can be passed by modifying only getChiSquare(). The description in the report for this defect like "Once corrected, getRMS() can even reduce ..." also supports that the changes in getRTS() can be considered as a refactoring action.

tolskais avatar May 11 '19 10:05 tolskais

Hi @tolskais,

Thanks for reporting this! Will will take a closer look at the patch.

Best, Rene

rjust avatar May 15 '19 19:05 rjust

Hi @tolskais,

Just to make sure we are on the same page, are you suggesting to replace the current patch for Math-65:

diff --git a/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java b/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
index 5a60da8..30ebfff 100644
--- a/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
+++ b/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
@@ -237,7 +237,12 @@ public abstract class AbstractLeastSquaresOptimizer implements DifferentiableMul
      * @return RMS value
      */
     public double getRMS() {
-        return Math.sqrt(getChiSquare() / rows);
+        double criterion = 0;
+        for (int i = 0; i < rows; ++i) {
+            final double residual = residuals[i];
+            criterion += residual * residual * residualsWeights[i];
+        }
+        return Math.sqrt(criterion / rows);
     }
 
     /**
@@ -250,7 +255,7 @@ public abstract class AbstractLeastSquaresOptimizer implements DifferentiableMul
         double chiSquare = 0;
         for (int i = 0; i < rows; ++i) {
             final double residual = residuals[i];
-            chiSquare += residual * residual * residualsWeights[i];
+            chiSquare += residual * residual / residualsWeights[i];
         }
         return chiSquare;
     }

with the following patch (which excludes all changes in the getRMS() method)

diff --git a/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java b/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
index 5a60da8..30ebfff 100644
--- a/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
+++ b/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
@@ -250,7 +255,7 @@ public abstract class AbstractLeastSquaresOptimizer implements DifferentiableMul
         double chiSquare = 0;
         for (int i = 0; i < rows; ++i) {
             final double residual = residuals[i];
-            chiSquare += residual * residual * residualsWeights[i];
+            chiSquare += residual * residual / residualsWeights[i];
         }
         return chiSquare;
     }

Although your proposed patch seems to be a valid minimal patch, it triggers more test cases than the original patch and the current patch in D4J. Thus, it cannot be considered a valid patch.

The current patch in D4J only triggers one test case (same as the original patch)

--- org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest::testCircleFitting
junit.framework.AssertionFailedError: expected:<0.004> but was:<0.0019737107108948474>
	....
	at org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest.testCircleFitting(LevenbergMarquardtOptimizerTest.java:442)

And the proposed patch triggers three test cases (two of them neither triggered with the original or minimal patch provided by D4J)

--- org.apache.commons.math.optimization.general.GaussNewtonOptimizerTest::testCircleFittingBadInit
junit.framework.AssertionFailedError: expected:<0.04268731682389561> but was:<0.021343658411947804>
	...
	at org.apache.commons.math.optimization.general.GaussNewtonOptimizerTest.testCircleFittingBadInit(GaussNewtonOptimizerTest.java:477)
--- org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest::testCircleFitting
junit.framework.AssertionFailedError: expected:<0.004> but was:<0.0019737107108948474>
	...
	at org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest.testCircleFitting(LevenbergMarquardtOptimizerTest.java:442)
--- org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest::testCircleFittingBadInit
junit.framework.AssertionFailedError: expected:<0.043> but was:<0.021343658411516093>
	...
	at org.apache.commons.math.optimization.general.LevenbergMarquardtOptimizerTest.testCircleFittingBadInit(LevenbergMarquardtOptimizerTest.java:494)

-- Best, Jose

jose avatar May 16 '19 18:05 jose

From your feedback, I realize that I misunderstood how defects4j reproduces the buggy commit.

I expect that defects4j generated the fixed commit from the buggy commit with the minimal fixes. For example, the original buggy commit in commons-math is like the following code.

     public double getRMS() {
        double criterion = 0;
        for (int i = 0; i < rows; ++i) {
            final double residual = residuals[i];
            criterion += residual * residual * residualsWeights[i];
        }
        return Math.sqrt(criterion / rows);
     }
 
     public double getChiSquare(...) {
         ....
         double chiSquare = 0;
         for (int i = 0; i < rows; ++i) {
             final double residual = residuals[i];
            chiSquare += residual * residual / residualsWeights[i];
         }
         return chiSquare;
     }

My expectation is that the above code can be fixed as below.

     public double getRMS() {
        double criterion = 0;
        for (int i = 0; i < rows; ++i) {
            final double residual = residuals[i];
            criterion += residual * residual * residualsWeights[i];
        }
        return Math.sqrt(criterion / rows);
     }
 
     public double getChiSquare(...) {
         ...
         double chiSquare = 0;
         for (int i = 0; i < rows; ++i) {
             final double residual = residuals[i];
            chiSquare += residual * residual * residualsWeights[i];
         }
         return chiSquare;
     }

the patch between two version is what I expected from the patch in defects4j, which you posted.

index 5a60da8..30ebfff 100644
--- a/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
+++ b/src/main/java/org/apache/commons/math/optimization/general/AbstractLeastSquaresOptimizer.java
@@ -250,7 +255,7 @@ public abstract class AbstractLeastSquaresOptimizer implements DifferentiableMul
         double chiSquare = 0;
         for (int i = 0; i < rows; ++i) {
             final double residual = residuals[i];
+            chiSquare += residual * residual * residualsWeights[i];
-            chiSquare += residual * residual / residualsWeights[i];
         }
         return chiSquare;
     }

Now the fixed commit by this patch is different compared to the fixed commit in defects4j, because defects4j removes the clone code inside getRMS. I proposed that the fixed commit in Math-65 should have the reverted version of getRMS that has the clone code as in the buggy commit

tolskais avatar May 17 '19 11:05 tolskais