profanity-check

Doesn't understand context

Open hwsamuel opened this issue 4 years ago • 1 comments

The library seems to work more like a dictionary lookup for swear words. For example, it correctly tags "fucking idiot" as negative, but it also tags "fucking awesome!" as negative. Maybe the training set's features were unigrams?

hwsamuel avatar Nov 27 '20 18:11 hwsamuel
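
To illustrate the unigram question, here is a minimal sketch in plain Python. It is not the library's code; the `ngrams` helper and its whitespace tokenization are assumptions made only for this example. It shows that the two phrases share a feature when split into unigrams but share nothing when split into bigrams.

```python
def ngrams(text, n):
    """Return the word n-grams of text (lowercased, '!' stripped, whitespace-split).

    Hypothetical helper for illustration; the library's real tokenizer may differ.
    """
    tokens = text.lower().replace("!", "").split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


insult = "fucking idiot"
praise = "fucking awesome!"

# Features the two phrases have in common under each scheme.
shared_unigrams = set(ngrams(insult, 1)) & set(ngrams(praise, 1))
shared_bigrams = set(ngrams(insult, 2)) & set(ngrams(praise, 2))

print(shared_unigrams)  # {'fucking'}
print(shared_bigrams)   # set()
```

With unigram features alone, a model can only see that both phrases contain "fucking"; it has no feature that captures what the word is modifying.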

From my point of view, that happens because of the learning algorithm the library uses. Because each word is tokenized on its own, "fucking" gets a very high probability of being profane, since it is profane in any context. For example, you cannot say "fucking awesome!" in a professional environment. If you were to add "fucking awesome!" to clean_data.csv, you would label it 1 (profane), not 0 (not profane).

menkotoglou avatar Nov 30 '20 12:11 menkotoglou
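
The per-word effect described above can be sketched with a toy scorer. Everything here is an assumption for illustration: the six training rows are made up, and the smoothed log-odds weighting is a simple stand-in, not the model the library actually trains. The point is only that a word seen exclusively in rows labeled 1 can outweigh a neighbouring word that leans clean.

```python
import math
from collections import Counter

# Hypothetical toy training set: (text, label), 1 = profane, 0 = not profane.
TRAIN = [
    ("fucking idiot", 1),
    ("fucking awesome", 1),
    ("you fucking moron", 1),
    ("awesome work", 0),
    ("you are awesome", 0),
    ("great idea", 0),
]


def tokenize(text):
    """Lowercase, drop '!', split on whitespace (illustrative only)."""
    return text.lower().replace("!", "").split()


def train(examples):
    """Give each word a weight: add-one smoothed log-odds of appearing in profane rows."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    vocab = set(counts[0]) | set(counts[1])
    return {t: math.log((counts[1][t] + 1) / (counts[0][t] + 1)) for t in vocab}


def score(weights, text):
    """Sum the per-word weights; a positive total leans profane. Unknown words count as 0."""
    return sum(weights.get(t, 0.0) for t in tokenize(text))


weights = train(TRAIN)
for phrase in ("fucking idiot", "fucking awesome!", "awesome work"):
    print(phrase, round(score(weights, phrase), 3))
```

In this toy setup "fucking awesome!" still gets a positive score, because the weight for "fucking" is larger in magnitude than the negative weight for "awesome". That matches the labeling argument above: if the data labels such phrases as 1, a per-word model will follow.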