uBlockOrigin-HUGE-AI-Blocklist

Add entomologist.net

Open mc776 opened this issue 11 months ago • 3 comments

Includes gems like an article titled "How Quickly Can A Silverfish Bury Itself In Your Flesh?" with things like "Silverfish damage is caused by their love for carbohydrates, which does not include human blood."

The rest of the site is similarly obvious slop but I don't know if it's that obvious to people who don't read about bugs much.

mc776, Jan 25 '25 17:01

Hey! I'm looking at it, and this website is clearly a content farm (along with its blog subdomain). I'm not a bug expert by any means (although they are pretty fascinating), but the articles I read definitely seemed LLM-generated. Doubly so when you realize that you can't even click on the author's name, and that articles are being pumped out multiple times per day, all by the same person.

The problem is that this particular repo is geared more towards AI-generated images, and even though this website is completely worthless and an SEO farm, from what I'm seeing the images on it are real images of bugs. I haven't decided what to do when these scenarios pop up in this repo, but for the time being I'm in the process of making a content farm/LLM repo whose sole purpose is getting rid of these types of websites without the worry of images getting in the way.

laylavish, Feb 13 '25 04:02

The internet is turning into a gigantic dump for AI crap. Here are some more examples:

  • https://thetechylife.com/
  • https://www.thegeekdiary.com/
  • https://www.southparkcycles.com/
  • https://www.clrn.org/
  • https://commandmasters.com/
  • https://audiochamps.com/
  • https://audiophiles.co/

Step 1: Flood everything with low effort and low quality shit.
Step 2: Be hurt because people don't acknowledge you as a writer/artist/musician.

AI bros. Go Figure.

sezanzeb, Apr 27 '25 10:04

At this rate, I'd rather just use a search engine that only crawls

  • wikipedia
  • websites that wikipedia links to
  • github repos with more than 10 stars
  • websites that github repos with more than 10 stars link to

You can get all the important sources of information by doing so already: askubuntu.com, documentation, company websites, government websites, news sites, etc. Most websites of actual relevance are probably mentioned somewhere on Wikipedia.
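The crawl scope described above could be sketched roughly like this. This is only an illustration of the idea, not an existing tool: `build_allowlist`, `is_crawlable`, and the input shapes (a list of external URLs found on Wikipedia, and repo records with `stars` and `links` fields) are all hypothetical, and a real crawler would need to collect that data itself.

```python
from urllib.parse import urlparse


def host(url: str) -> str:
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    h = (urlparse(url).hostname or "").lower()
    return h[4:] if h.startswith("www.") else h


def build_allowlist(wikipedia_links, github_repos, min_stars=10):
    """Collect the hosts the crawler is allowed to visit.

    wikipedia_links: external URLs found on Wikipedia pages (assumed input)
    github_repos: dicts like {"stars": int, "links": [url, ...]} (assumed shape)
    """
    # Wikipedia and GitHub themselves are always in scope.
    allowed = {"wikipedia.org", "github.com"}
    # Websites that Wikipedia links to.
    allowed.update(host(u) for u in wikipedia_links)
    # Websites linked from repos with more than `min_stars` stars.
    for repo in github_repos:
        if repo["stars"] > min_stars:
            allowed.update(host(u) for u in repo["links"])
    allowed.discard("")  # drop entries that had no parsable hostname
    return allowed


def is_crawlable(url: str, allowed) -> bool:
    """True if the URL's host is an allowed host or a subdomain of one."""
    h = host(url)
    return any(h == a or h.endswith("." + a) for a in allowed)
```

With an allowlist built this way, a site only becomes crawlable once Wikipedia or a sufficiently starred repo links to it, so a content farm that nobody cites stays out of the index by default.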

sezanzeb, Apr 27 '25 11:04