gpt-2-output-dataset

Simplified English often falsely classified as AI output

Open beltoforion opened this issue 2 years ago • 0 comments

This is more feedback than a bug report, so feel free to close or ignore the issue. There appear to be many false positives for articles from the Simple English Wikipedia.

Examples:

  • https://simple.wikipedia.org/w/index.php?title=Magnetic_pendulum&oldid=5152730
  • https://simple.wikipedia.org/w/index.php?title=Northern_Territory&oldid=4898225

Both articles predate GPT, so they cannot be AI-generated, yet the system is 99% sure that they are. I found more examples. It appears that simplified English is classified as AI output with a relatively high probability.
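
One way to check that a linked revision really predates a given model is to look up the revision's timestamp. The following is only a minimal sketch: it assumes the standard MediaWiki API endpoint at `/w/api.php` on the same host, and the helper names `revision_query_url` and `predates` are made up for illustration. The cutoff date is left as a parameter because it depends on which model release you compare against.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlparse


def revision_query_url(permalink: str) -> str:
    """Build a MediaWiki API URL that requests the timestamp of the
    revision referenced by the `oldid` parameter of `permalink`."""
    parts = urlparse(permalink)
    oldid = parse_qs(parts.query)["oldid"][0]
    params = urlencode({
        "action": "query",
        "prop": "revisions",
        "revids": oldid,
        "rvprop": "timestamp",
        "format": "json",
    })
    # Assumption: the wiki exposes its API at /w/api.php on the same host.
    return f"{parts.scheme}://{parts.netloc}/w/api.php?{params}"


def predates(timestamp: str, cutoff: datetime) -> bool:
    """Return True if a UTC timestamp such as '2015-06-01T12:00:00Z'
    lies strictly before `cutoff`."""
    ts = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return ts.replace(tzinfo=timezone.utc) < cutoff
```

Fetching the URL returned by `revision_query_url` (for example with `urllib.request.urlopen`) should yield JSON that contains the revision timestamp, which can then be passed to `predates` together with the cutoff of your choice.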

beltoforion, Jan 09 '23 01:01