Combined bad words not being detected as profane.

Open KarnEdge opened this issue 6 years ago • 12 comments

filter.isProfane('fuckshit') // returns false 
filter.isProfane('fuck shit') // returns true

lang.json contains *fuck*, and I was hoping that meant anything wrapped around the word fuck would be caught, e.g. 'gofuckyourself'.

Of course, we would only want to do this for certain words, like fuck or shit, that do not appear inside any other word in the dictionary (see http://www.morewords.com/contains/fuck/). That way we avoid the Clbuttic mistake as well.
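
One way this could work, as a rough sketch only: keep two lists, one matched only as whole words and one matched anywhere in the string. The `buildMatcher` helper, both lists, and the mild placeholder words below are hypothetical and are not part of the bad-words API.

```javascript
// Hypothetical helper, not part of bad-words: escape regex metacharacters.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// wholeWords: matched only between word boundaries.
// anywhereWords: matched even when glued to other characters.
function buildMatcher(wholeWords, anywhereWords) {
  const parts = [
    ...wholeWords.map((w) => `\\b${escapeRegExp(w)}\\b`),
    ...anywhereWords.map((w) => escapeRegExp(w)),
  ];
  // An empty pattern would match every string, so guard against it.
  if (parts.length === 0) return () => false;
  const pattern = new RegExp(parts.join('|'), 'i');
  return (text) => pattern.test(text);
}

// Placeholder lists: 'butt' can occur inside clean words, 'heck' is
// assumed (for this sketch) not to.
const isProfane = buildMatcher(['butt'], ['heck']);

console.log(isProfane('whatheck')); // true
console.log(isProfane('button'));   // false
console.log(isProfane('butt'));     // true
```

Which words are safe to put in the "anywhere" list is a judgment call that has to be made per word.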

Any way to get it to work this way?

KarnEdge avatar Sep 06 '19 21:09 KarnEdge

+1 on this.

Also, examples such as "dick1" and "ofuck" are not detected.

TautvydasDerzinskas avatar Oct 14 '19 09:10 TautvydasDerzinskas

To add to that, I tried using numbers and other types of combinations, but it didn't detect them either:

filter.isProfane('afuck') // returns false 
filter.isProfane('fuck1') // returns false

Bound3R avatar Dec 30 '19 01:12 Bound3R

I've added an answer to https://github.com/web-mech/badwords/issues/51

jimsideout avatar Jan 13 '20 23:01 jimsideout

I've added an answer to #51

This works for back-to-back blacklisted words, but not if you concatenate characters / clean words along with the bad one(s).

Any ideas?

x4iiiis avatar Feb 05 '20 10:02 x4iiiis

Can you give me an example?

jimsideout avatar Feb 05 '20 15:02 jimsideout

Cheers for the quick response!

I may have mis-implemented your solution, but an example would be 'f...face.' I realise that could just be added to the list but replacing 'face' with 'o' for example still bypasses it for me.

Interestingly though, 'f...head' gets blocked, despite not being on the list.

However, if you chain several F-bombs, or a mix of listed words you'll get the desired result.

Thanks again

x4iiiis avatar Feb 05 '20 16:02 x4iiiis

My solution was to check on every key press, so I would never make it to the 'face' portion of 'f...face'. In my application I just throw a warning, then wipe out the 'f...' so they can never make it past it.

It sounds like you're potentially doing the check when editing ends or focus changes, is that right? To solve that I think we'd need to loop forward and backwards while dropping a character each time and check all character combinations in between. E.g. 'heyf...head' would need to eventually drop the first 3 characters as well as the last 4 to get the triggered word.
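
A minimal sketch of that idea, assuming a plain array of listed words: try every start and end position (which is the same as dropping characters from the front and the back) and test each slice against the list. `containsListedWord` and the placeholder word are hypothetical and not part of bad-words.

```javascript
// Hypothetical helper: returns true if any substring of `text`
// is exactly one of the listed words (case-insensitive).
function containsListedWord(text, words) {
  const listed = new Set(words.map((w) => w.toLowerCase()));
  const lower = text.toLowerCase();
  // `start` drops characters from the front, `end` drops them from the back.
  for (let start = 0; start < lower.length; start++) {
    for (let end = start + 1; end <= lower.length; end++) {
      if (listed.has(lower.slice(start, end))) return true;
    }
  }
  return false;
}

// Dropping the first 3 and the last 4 characters exposes the listed word.
console.log(containsListedWord('heyheckhead', ['heck'])); // true
console.log(containsListedWord('hello there', ['heck'])); // false
```

This checks a quadratic number of slices per input, and it will also flag clean words that happen to contain a listed word.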

jimsideout avatar Feb 05 '20 16:02 jimsideout

Yeah, that makes sense. Thanks for that.

It is checking every keystroke, but it doesn't disallow the user from continuing to type beyond detected profanity. It just displays error text under the input box when the filter has caught something.

x4iiiis avatar Feb 05 '20 16:02 x4iiiis

Is this use case resolved? For example, the string "testfuck" is not detected: filter.isProfane("testfuck") returns false.

Would it be possible for the filter.clean(argument) implementation to run a check like String('testfuck').indexOf('fuck') against each word in the list of bad words?

bernardbaker avatar Feb 21 '20 14:02 bernardbaker

This fixes your problem: change \\b${word.replace(/(\W)/g, '\\$1')}\\b to \\b(\\w*${word.replace(/(\W)/g, '\\$1')}\\w*)\\b.

Note that some words in the JSON, shit for example, make the regex fail.

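isProfane(string) {
  return this.list
    .filter((word) => {
      // Allow extra word characters on either side of the listed word,
      // so strings like 'testfuck' and 'fuck1' match as well as the bare word.
      const wordExp = new RegExp(`\\b(\\w*${word.replace(/(\W)/g, '\\$1')}\\w*)\\b`, 'gi');
      return !this.exclude.includes(word.toLowerCase()) && wordExp.test(string);
    })
    .length > 0 || false;
}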
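
To see what the widened pattern changes, here is a small standalone comparison that can be run without the library. The `strictPattern` and `widenedPattern` helpers and the mild placeholder words are hypothetical; only the two regex shapes come from the change above.

```javascript
// Same escaping expression as in the snippet above.
const escapeWord = (word) => word.replace(/(\W)/g, '\\$1');

// Original shape: the listed word must stand alone between word boundaries.
const strictPattern = (word) => new RegExp(`\\b${escapeWord(word)}\\b`, 'i');

// Widened shape: extra word characters are allowed on either side.
const widenedPattern = (word) =>
  new RegExp(`\\b(\\w*${escapeWord(word)}\\w*)\\b`, 'i');

console.log(strictPattern('heck').test('testheck'));  // false
console.log(widenedPattern('heck').test('testheck')); // true
console.log(widenedPattern('heck').test('heck1'));    // true

// The widened shape also matches clean words that contain a listed word.
console.log(widenedPattern('ass').test('classic'));   // true
```

The last line shows the trade-off: any listed word that occurs inside ordinary words will now produce false positives.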

keybraker avatar Dec 29 '20 16:12 keybraker

@bernardbaker it's important to keep the Scunthorpe problem in mind when considering includes or indexOf
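
As a rough illustration of the problem, and of one possible mitigation, the sketch below blanks out known-safe words before doing the substring check. `containsWithAllowlist`, its parameters, and the placeholder words are hypothetical and not part of bad-words.

```javascript
// Hypothetical helper: substring check that first removes allowlisted words.
function containsWithAllowlist(text, banned, allowed) {
  let remaining = text.toLowerCase();
  for (const safe of allowed) {
    if (safe.length === 0) continue; // splitting on '' would break up every word
    // Replace each occurrence of the safe word with a space.
    remaining = remaining.split(safe.toLowerCase()).join(' ');
  }
  return banned.some((word) => remaining.includes(word.toLowerCase()));
}

// Plain substring matching flags a clean word (the Scunthorpe problem).
console.log(containsWithAllowlist('a classic move', ['ass'], []));          // true

// With the clean word allowlisted, the false positive goes away...
console.log(containsWithAllowlist('a classic move', ['ass'], ['classic'])); // false

// ...while a real occurrence elsewhere in the text is still caught.
console.log(containsWithAllowlist('classic ass', ['ass'], ['classic']));    // true
```

An allowlist only covers the clean words someone thought to add, so it reduces false positives rather than eliminating them.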

Jameskmonger avatar Jan 14 '21 18:01 Jameskmonger

I'm currently using patch-package to solve the problem:

diff --git a/node_modules/bad-words/lib/badwords.js b/node_modules/bad-words/lib/badwords.js
index 3990c41..15de96e 100644
--- a/node_modules/bad-words/lib/badwords.js
+++ b/node_modules/bad-words/lib/badwords.js
@@ -31,11 +31,11 @@ class Filter {
    */
   isProfane(string) {
     return this.list
-      .filter((word) => {
-        const wordExp = new RegExp(`\\b${word.replace(/(\W)/g, '\\$1')}\\b`, 'gi');
-        return !this.exclude.includes(word.toLowerCase()) && wordExp.test(string);
-      })
-      .length > 0 || false;
+        .filter((word) => {
+          const wordExp = new RegExp(`\\b(\\w*${word.replace(/(\W)/g, '\\$1')}\\w*)\\b`, 'gi');
+          return !this.exclude.includes(word.toLowerCase()) && wordExp.test(string);
+        })
+        .length > 0 || false;
   }
 
   /**

lockieluke avatar Dec 21 '21 00:12 lockieluke