Harshit Chittora

22 comments by Harshit Chittora

@marco-c **With BaggingClassifier, Undersampler turned off** Cross Validation scores: Accuracy: f0.9862444146820298 (+/- 0.0018937457011950216) Precision: f0.7003269062965275 (+/- 0.06540289338010274) Recall: f0.6149242215486359 (+/- 0.04031561239639824) X_train: (45000, 570310), y_train: (45000,) X_test: (5000, 570310), y_test:...
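For readers unfamiliar with the `mean (+/- spread)` format in the scores above, here is a minimal sketch of how such a summary line can be produced from per-fold scores. It assumes the spread is twice the standard deviation across folds, which the comment does not state. The fold values and the helper name `summarize_scores` are hypothetical and are not taken from the run above.

```python
from statistics import mean, pstdev


def summarize_scores(scores):
    """Return (mean, 2 * population standard deviation) of per-fold scores.

    Assumption: the "+/-" figure is two standard deviations across folds.
    """
    return mean(scores), 2 * pstdev(scores)


# Hypothetical per-fold accuracies, for illustration only.
fold_accuracy = [0.985, 0.987, 0.986, 0.988, 0.984]

avg, spread = summarize_scores(fold_accuracy)
print(f"Accuracy: {avg:.4f} (+/- {spread:.4f})")
```

A wide spread relative to the mean, as with the precision figure above, suggests the metric varies noticeably from fold to fold.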

@marco-c **After the latest changes, it performs a lot better now** 48965 bugs have no labels 27 bugs have no steps to reproduce 1008 bugs have steps to reproduce X:...

@ayush-1506 @marco-c I can only run it for 7000 bugs (on Colab); if I go higher, it runs out of memory :( **For 7000 bugs:** ``` 6300/6300 [==============================] -...

> I'm no expert, but an accuracy of 0.97 sounds too good to be true. Yeah, it is probably overfitting (probably because the training set is so small), so first...

No confidence threshold - 12463 classified for product model ``` pre rec spe f1 geo iba sup Core 0.71 0.97 0.46 0.82 0.66 0.46 7257 DevTools 0.87 0.57 0.99 0.69...

@marco-c I don't think we can use a classifier chain here: it is meant for combining binary classifiers, which is not our case, as described here: https://scikit-learn.org/stable/modules/multiclass.html#classifier-chain
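To illustrate the distinction behind this point, here is a rough sketch contrasting a multiclass target (one label per bug) with a multilabel target (a set of binary labels per bug), which is the setting a classifier chain is built for. The helper `is_multilabel` and the sample targets are hypothetical. Only the component names `Core` and `DevTools` come from the results quoted in this thread.

```python
def is_multilabel(targets):
    """Return True if any sample carries a collection of labels
    rather than a single label."""
    return any(isinstance(t, (list, tuple, set)) for t in targets)


# Multiclass: each bug belongs to exactly one component.
multiclass_y = ["Core", "DevTools", "Core"]

# Multilabel (hypothetical labels): each bug may carry several tags or none.
multilabel_y = [["crash", "regression"], ["regression"], []]

print(is_multilabel(multiclass_y))  # single label per sample
print(is_multilabel(multilabel_y))  # several binary decisions per sample
```

In the multilabel case each tag is a separate yes/no decision, so a chain of binary classifiers is a natural fit. With one component per bug, a single multiclass classifier covers the task.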

@marco-c **Results of the current model for the no-threshold case:** ```No confidence threshold - 12431 classified /usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in...

> These lines should not be there, as these components should be ignored: > > > ``` > > Untriaged 0.39 0.18 1.00 0.25 0.42 0.16 78 > > ```...

> I think you might be rollbacking all bugs this way, are you sure you aren't? https://github.com/mozilla/bugbug/blob/b456bd3b2d00aad453b42cc8761fc04fbf0e3866/bugbug/bug_features.py#L444-L449 This helped :) PS. This is why we yield the bugs (which we...

**Before changes** >73 bugs have no steps to reproduce 2642 bugs have steps to reproduce X: (2715, 68672), y: (2715,) Cross Validation scores: Accuracy: f0.6930046232725715 (+/- 0.08506114027939689) Precision: f0.9889862871434468 (+/-...