get_bugbug_labels no longer adds nobug type to regression training data

Open avinashselvam opened this issue 3 years ago • 2 comments

#539

Modified get_bugbug_labels in defect.py to include only those data points that are labelled either regression or bug_no_regression in the training set.
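A minimal sketch of the filtering described above, not the actual change in defect.py. The category names regression, bug_no_regression and nobug come from this thread; the helper name `regression_labels`, the `(bug_id, category)` row shape, and the sample data are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Categories kept for the regression model, mapped to their label.
# Anything else (for example "nobug") is dropped from the training set.
REGRESSION_CATEGORIES = {"regression": 1, "bug_no_regression": 0}


def regression_labels(rows: Iterable[Tuple[int, str]]) -> Dict[int, int]:
    """Map bug IDs to 1 (regression) or 0 (bug, but not a regression).

    Rows with any other category are skipped instead of being treated
    as negative examples.
    """
    labels = {}
    for bug_id, category in rows:
        if category in REGRESSION_CATEGORIES:
            labels[bug_id] = REGRESSION_CATEGORIES[category]
    return labels


# Hypothetical rows, invented for this example.
rows = [(1, "regression"), (2, "bug_no_regression"), (3, "nobug")]
print(regression_labels(rows))
```

With this shape, a nobug row never reaches the classifier at all, which is the source of the dropped data points reported below.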

Training the model without changes

72486 non-regression bugs

Cross Validation scores:
Accuracy: 0.9731263445549161 (+/- 0.0012810455820845609)
Precision: 0.9560802008310938 (+/- 0.006503421458310747)
Recall: 0.9316432362619518 (+/- 0.0042866900183067425)

Training the model after changes

71597 non-regression bugs (889 dropped)

Cross Validation scores:
Accuracy: 0.9739072259525028 (+/- 0.0019480324611321944)
Precision: 0.9561803892880535 (+/- 0.006928496874119621)
Recall: 0.9358629670750973 (+/- 0.0045683573571298)

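Recall improves slightly (0.9316 to 0.9359) and precision is essentially unchanged (0.9561 to 0.9562). Both differences are within the reported +/- spread, so the improvement is minor.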
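As a quick sanity check on the comparison, the sketch below computes the change in each metric and compares it with the larger of the two reported +/- values. The numbers are copied from the two runs above; the `within_spread` helper and the choice of "larger of the two spreads" as the threshold are assumptions for illustration, not a formal significance test.

```python
# (score, reported +/- value) for each metric, copied from the two runs.
before = {
    "accuracy": (0.9731263445549161, 0.0012810455820845609),
    "precision": (0.9560802008310938, 0.006503421458310747),
    "recall": (0.9316432362619518, 0.0042866900183067425),
}
after = {
    "accuracy": (0.9739072259525028, 0.0019480324611321944),
    "precision": (0.9561803892880535, 0.006928496874119621),
    "recall": (0.9358629670750973, 0.0045683573571298),
}


def within_spread(metric: str) -> bool:
    """Return True if the change is no larger than the bigger +/- value."""
    (old, old_spread), (new, new_spread) = before[metric], after[metric]
    return abs(new - old) <= max(old_spread, new_spread)


for metric in before:
    delta = after[metric][0] - before[metric][0]
    print(f"{metric}: {delta:+.4f}, within spread: {within_spread(metric)}")
```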

Should bugs of type task, enhancement, and feature also be removed from the training data for the regression model?

Please let me know if I have misunderstood the task.

avinashselvam Mar 29 '23 13:03

@avinashselvam this is part of the request from #539. The other part is not to consider bugs with type "enhancement" or "task" as label 0 for the regression model.
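One way to read the second part of the request is sketched below, under assumptions: the function name `regression_label` and its `(bug_type, category)` arguments are hypothetical, and the example rows are invented. It returns `None` for bugs that should be left out of the training set entirely.

```python
from typing import Optional

# Bug types that, per the request, should not count as negative
# (label 0) examples for the regression model.
NON_DEFECT_TYPES = {"enhancement", "task"}


def regression_label(bug_type: str, category: str) -> Optional[int]:
    """Return 1, 0, or None (exclude from training).

    Assumption: a bug labelled "regression" stays a positive example
    regardless of type, since the request only concerns label 0.
    """
    if category == "regression":
        return 1
    if category == "bug_no_regression" and bug_type not in NON_DEFECT_TYPES:
        return 0
    return None


# Hypothetical (type, category) pairs, invented for this example.
examples = [
    ("defect", "regression"),
    ("defect", "bug_no_regression"),
    ("task", "bug_no_regression"),
    ("enhancement", "nobug"),
]
labels = [regression_label(t, c) for t, c in examples]
print(labels)
```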

marco-c Mar 31 '23 14:03

@avinashselvam are you still interested in working on this? If so, I will be glad to help.

suhaibmujahid Apr 20 '23 14:04