
Improve Accuracy of The Model

Open meuzgebre opened this issue 2 years ago • 3 comments

I ran your code. The naive_bayes.GaussianNB estimator you implemented resulted in 0.6 accuracy. It is a suitable estimator for text data. However, since you have converted the text values into numeric form, you can use other estimators such as LogisticRegression or a linear SVM for better accuracy. I added a LogisticRegression model to your code without any other modification and got around 0.79 accuracy. In addition, you can improve the accuracy by:

  • Removing NaN and null values from the dataset.
  • Using only the headline, category and article columns for feature extraction, as the rest of the columns are not necessary.
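
To make the two suggestions concrete, here is a minimal sketch of what I mean. It is not the exact code I ran: the TF-IDF step, the `clean` and `build_models` helpers, and the toy English rows are assumptions made up for illustration, and I am treating `category` as the label with `headline` and `article` as the input text. Swap the toy frame for the real dataset to try it.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only headline, category and article, then drop rows with missing values."""
    return df[["headline", "category", "article"]].dropna().reset_index(drop=True)


def build_models() -> dict:
    """Two linear estimators, each behind a TF-IDF step (TF-IDF is an assumption)."""
    return {
        "logistic_regression": make_pipeline(
            TfidfVectorizer(), LogisticRegression(max_iter=1000)
        ),
        "linear_svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
    }


# Toy stand-in rows, invented for illustration. This is NOT the real dataset.
toy = pd.DataFrame(
    {
        "headline": [
            "team wins match",
            "bank raises rates",
            None,  # missing headline, removed by clean()
            "striker scores goal",
            "market shares fall",
            "coach praises team",
        ],
        "article": [
            "the team won the final match",
            "the central bank raised interest rates",
            "an article without a headline",
            "the striker scored a late goal",
            "shares fell on the stock market",
            "the coach praised the team after the match",
        ],
        "category": ["sport", "business", "sport", "sport", "business", "sport"],
        "views": [10, 20, 30, 40, 50, 60],  # hypothetical extra column, dropped
    }
)

data = clean(toy)
text = data["headline"] + " " + data["article"]  # combine the two text columns

models = build_models()
predictions = {}
for name, model in models.items():
    model.fit(text, data["category"])
    predictions[name] = model.predict(["team scores goal in match"])[0]
    print(name, predictions[name])
```

On the real data you would hold out a test split (for example with `sklearn.model_selection.train_test_split`) before comparing accuracy. The toy frame is far too small for that.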

meuzgebre Aug 23 '22 07:08

Good job @meuzgebre, we released this dataset so that many people would work towards improving this accuracy. We would like to see the result you mentioned become the SOTA for this dataset here.

If you have a writeup and updated code, we are happy to mention it in this readme.

IsraelAbebe Aug 23 '22 12:08

Hey @IsraelAbebe, check out my pull requests.

meuzgebre Sep 02 '22 18:09

@meuzgebre, can you send a pull request to the new branch I created for you? I would like to put it there and edit the readme.

IsraelAbebe Sep 12 '22 15:09