Regarding #792

Aug 01 '19 06:08 chidauri

After this change

69 bugs have no steps to reproduce
2563 bugs have steps to reproduce
X: (2632, 67831), y: (2632,)
Cross Validation scores:
Accuracy: f0.744493313309018 (+/- 0.02761428505128297)
Precision: f0.9914028351617405 (+/- 0.007963866807440885)
Recall: f0.7444676075912519 (+/- 0.03025468155899755)
X_train: (118, 67831), y_train: (118,)
X_test: (264, 67831), y_test: (264,)
Test Set scores:

No confidence threshold - 264 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.99      0.77      0.80      0.86      0.78      0.61       254
          0       0.12      0.80      0.77      0.21      0.78      0.62        10

avg / total       0.96      0.77      0.80      0.84      0.78      0.61       264

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             195 │              59 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               2 │               8 │
╘════════════╧═════════════════╧═════════════════╛
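For reference, the per-class columns in the report (pre, rec, spe, f1, geo) can be recomputed from the confusion matrix above. This is a minimal sketch in plain Python, not the code bugbug runs; the helper name `class_metrics` is made up for illustration, the formulas are the standard definitions (geo being the geometric mean of recall and specificity), and iba is left out.

```python
from math import sqrt


def class_metrics(tp, fn, fp, tn):
    """Standard per-class metrics from confusion-matrix counts (hypothetical helper)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    geo = sqrt(recall * specificity)
    return {"pre": precision, "rec": recall, "spe": specificity, "f1": f1, "geo": geo}


# Counts from the "No confidence threshold" matrix after this change.
# Class 1 (has steps to reproduce) treated as the positive class.
class_1 = class_metrics(tp=195, fn=59, fp=2, tn=8)
# Class 0 treated as the positive class: the roles of the counts swap.
class_0 = class_metrics(tp=8, fn=2, fp=59, tn=195)

for label, metrics in (("1", class_1), ("0", class_0)):
    print(label, {name: round(value, 2) for name, value in metrics.items()})
```

The rounded values agree with the two rows of the report, which is a quick way to confirm the matrix and the report come from the same run.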

Confidence threshold > 0.6 - 223 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.99      0.81      0.78      0.89      0.80      0.63       214
          0       0.15      0.78      0.81      0.25      0.80      0.63         9

avg / total       0.95      0.81      0.78      0.87      0.80      0.63       223

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             174 │              40 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               2 │               7 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.7 - 182 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.99      0.84      0.78      0.91      0.81      0.66       173
          0       0.21      0.78      0.84      0.33      0.81      0.65         9

avg / total       0.95      0.84      0.78      0.88      0.81      0.66       182

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             146 │              27 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               2 │               7 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.8 - 147 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.98      0.89      0.78      0.94      0.83      0.70       138
          0       0.32      0.78      0.89      0.45      0.83      0.69         9

avg / total       0.94      0.88      0.78      0.91      0.83      0.70       147

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             123 │              15 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               2 │               7 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.9 - 81 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.99      0.91      0.83      0.94      0.87      0.76        75
          0       0.42      0.83      0.91      0.56      0.87      0.75         6

avg / total       0.94      0.90      0.84      0.92      0.87      0.76        81

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │              68 │               7 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               1 │               5 │
╘════════════╧═════════════════╧═════════════════╛
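The "Confidence threshold > t" sections appear to score only the test bugs whose highest predicted class probability exceeds t, which would explain why the number of classified bugs drops as t rises. A minimal sketch of that filtering, assuming per-class probabilities in the style of scikit-learn's `predict_proba`; the function name `filter_confident` and the sample values are illustrative, not bugbug's code or data.

```python
def filter_confident(probas, y_true, threshold):
    """Keep only samples whose top class probability exceeds the threshold.

    Hypothetical helper: returns the true labels and predicted labels
    of the samples that pass the confidence filter.
    """
    kept_true, kept_pred = [], []
    for row, label in zip(probas, y_true):
        # Index of the most probable class for this sample.
        top = max(range(len(row)), key=row.__getitem__)
        if row[top] > threshold:
            kept_true.append(label)
            kept_pred.append(top)
    return kept_true, kept_pred


# Made-up probabilities for four bugs; columns are P(class 0), P(class 1).
probas = [[0.55, 0.45], [0.10, 0.90], [0.35, 0.65], [0.95, 0.05]]
y_true = [1, 1, 0, 0]

for t in (0.6, 0.9):
    kept_true, kept_pred = filter_confident(probas, y_true, t)
    print(f"threshold > {t}: {len(kept_true)} classified")
```

The metrics and confusion matrix for each threshold would then be computed on the kept labels only, so the rows for different thresholds describe different subsets of the test set.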

Before this change

73 bugs have no steps to reproduce
2642 bugs have steps to reproduce
X: (2715, 68671), y: (2715,)
Cross Validation scores:
Accuracy: f0.6966856048676635 (+/- 0.09750364886404748)
Precision: f0.9890742501710503 (+/- 0.01088263716211331)
Recall: f0.6965130473241927 (+/- 0.10856018472515121)
X_train: (128, 68671), y_train: (128,)
X_test: (272, 68671), y_test: (272,)
Test Set scores:

No confidence threshold - 272 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.97      0.70      0.44      0.82      0.56      0.32       263
          0       0.05      0.44      0.70      0.09      0.56      0.30         9

avg / total       0.94      0.69      0.45      0.79      0.56      0.32       272

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             185 │              78 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               5 │               4 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.6 - 228 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.98      0.72      0.43      0.83      0.56      0.32       221
          0       0.05      0.43      0.72      0.08      0.56      0.30         7

avg / total       0.95      0.71      0.44      0.81      0.56      0.32       228

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             159 │              62 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               4 │               3 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.7 - 178 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.97      0.78      0.33      0.87      0.51      0.27       172
          0       0.05      0.33      0.78      0.09      0.51      0.25         6

avg / total       0.94      0.77      0.35      0.84      0.51      0.27       178

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │             135 │              37 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               4 │               2 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.8 - 119 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.98      0.83      0.50      0.90      0.64      0.43       115
          0       0.09      0.50      0.83      0.15      0.64      0.40         4

avg / total       0.95      0.82      0.51      0.87      0.64      0.43       119

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │              95 │              20 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               2 │               2 │
╘════════════╧═════════════════╧═════════════════╛

Confidence threshold > 0.9 - 68 classified

                   pre       rec       spe        f1       geo       iba       sup

          1       0.98      0.88      0.50      0.93      0.66      0.46        66
          0       0.11      0.50      0.88      0.18      0.66      0.42         2

avg / total       0.96      0.87      0.51      0.91      0.66      0.46        68

╒════════════╤═════════════════╤═════════════════╕
│            │   1 (Predicted) │   0 (Predicted) │
╞════════════╪═════════════════╪═════════════════╡
│ 1 (Actual) │              58 │               8 │
├────────────┼─────────────────┼─────────────────┤
│ 0 (Actual) │               1 │               1 │
╘════════════╧═════════════════╧═════════════════╛
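To compare the two runs at a glance, the recall of class 0 (bugs without steps to reproduce) can be read off the bottom row of each confusion matrix. The sketch below hard-codes the counts reported in this comment; the dictionary layout and the name `recall_0` are only for illustration.

```python
# (predicted 1, predicted 0) counts for actual class 0,
# copied from the confusion matrices in this comment.
after = {"none": (2, 8), "0.6": (2, 7), "0.7": (2, 7), "0.8": (2, 7), "0.9": (1, 5)}
before = {"none": (5, 4), "0.6": (4, 3), "0.7": (4, 2), "0.8": (2, 2), "0.9": (1, 1)}


def recall_0(row):
    """Fraction of actual class-0 bugs that were predicted as class 0."""
    wrong, right = row
    return right / (wrong + right)


for threshold in after:
    print(
        f"threshold {threshold}: "
        f"before {recall_0(before[threshold]):.2f}, "
        f"after {recall_0(after[threshold]):.2f}"
    )
```

With only 2 to 10 class-0 bugs in each evaluated subset, these ratios shift noticeably with a single prediction, so the comparison is indicative at best.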

Aug 01 '19 06:08 chidauri

Could you show here the confusion matrices too?

Aug 01 '19 08:08 marco-c

Codecov Report

Merging #817 into master will increase coverage by 0.69%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #817      +/-   ##
==========================================
+ Coverage   52.83%   53.52%   +0.69%     
==========================================
  Files          75       75              
  Lines        5127     5085      -42     
==========================================
+ Hits         2709     2722      +13     
+ Misses       2418     2363      -55

| Impacted Files | Coverage Δ | |
|---|---|---|
| bugbug/models/stepstoreproduce.py | `67.39% <100%> (+1.48%)` | :arrow_up: |
| scripts/regressor_finder.py | `0% <0%> (ø)` | :arrow_up: |
| bugbug/similarity.py | `0% <0%> (ø)` | :arrow_up: |
| scripts/evaluate_similarity.py | `0% <0%> (ø)` | :arrow_up: |
| scripts/commit_classifier.py | `0% <0%> (ø)` | :arrow_up: |
| scripts/microannotate_generator.py | `0% <0%> (ø)` | :arrow_up: |
| bugbug/repository.py | `74.73% <0%> (+0.35%)` | :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 001fba1...f184858.

Aug 01 '19 09:08 codecov-io

> Could you show here the confusion matrices too?

The before ones too.

Aug 01 '19 16:08 marco-c

BTW, there isn't much change, but this could make a difference when we have PULearning, so I'll keep it open and we'll re-evaluate after that.

Aug 01 '19 16:08 marco-c

> Could you show here the confusion matrices too?
>
> The before ones too.

Done

Aug 01 '19 18:08 chidauri


Restrict training set of stepstoreproduce model only to defects
