Benjamin Mah

12 issues by Benjamin Mah

Script to generate dataset of bug-inducing commits, backout commits, and the subsequent fix commit. Intended to include: - The hashes of the three commits. - Metadata of each commit (`pushdate`,...

Addresses #4172. Creates a model that classifies a Fenix bug into the correct Fenix component.

Resolves #4297. Includes the initial manually classified product and component of a bug as features.

Script to generate dataset of a patch, its inline comments, and the fix patch. Intended to include: - bug ID - revision ID - initial patch ID - fix patch...

Currently, the component model does not include the initial product and component that a bug was manually classified into. We could experiment with including this feature to see if it can...

Resolves #4269. Introduces a new feature that takes the file paths mentioned in the title and description of a bug and splits them into sub-paths and individual directories/files.
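As a rough illustration of the kind of path splitting described above, here is a minimal Python sketch. It is not the code from the pull request: the regular expression `PATH_RE`, the function name `extract_path_features`, and the choice to treat "sub-paths" as cumulative prefixes are all assumptions made for this example.

```python
import re

# Hypothetical pattern (an assumption): a path is a run of word-like
# tokens joined by at least one "/".
PATH_RE = re.compile(r"[\w.\-]+(?:/[\w.\-]+)+")


def extract_path_features(text):
    """Return path components and prefix sub-paths found in free text."""
    features = set()
    for path in PATH_RE.findall(text):
        parts = path.split("/")
        # Individual directories and files.
        features.update(parts)
        # Cumulative sub-paths, e.g. "dom/media" for "dom/media/X.cpp".
        for end in range(2, len(parts) + 1):
            features.add("/".join(parts[:end]))
    return features
```

For a title such as "Crash in dom/media/MediaDecoder.cpp on startup", this yields the individual components (`dom`, `media`, `MediaDecoder.cpp`) plus the prefixes `dom/media` and `dom/media/MediaDecoder.cpp`.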

Resolves #4281. Investigates and adds sampling techniques (e.g., SMOTE, SMOTEENN, random undersampling) to address the imbalanced dataset of bugs.

The current training set does not utilize any sampling techniques (e.g., oversampling, undersampling, SMOTE) to address the imbalance in the dataset. We could implement appropriate sampling techniques to balance the...
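To make the idea concrete, the following is a minimal sketch of one of the techniques named above, random undersampling, written with only the Python standard library. It is an illustration under assumptions, not the project's implementation: the function name `random_undersample` and the policy of shrinking every class to the size of the smallest one are choices made for this example.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # Assumed policy: target size is the size of the rarest class.
    target = min(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Sample without replacement within each class.
        for sample in rng.sample(group, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

Undersampling discards majority-class data, which is the trade-off that oversampling methods such as SMOTE avoid by synthesizing minority-class points instead.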

Preliminary code generation script using models from OpenAI and Anthropic. Uses a local Qdrant vector DB to store data points so that similar commits can be injected into the single-shot and...

Once #4407 is resolved, we can focus on including more bugs in the training set for the Fenix component model. Ideally, once we train the Fenix component model on GeckoView...