gpt-2-output-dataset
Simplified English often falsely classified as AI output
This is more feedback than a bug report, so feel free to close or ignore the issue. There appear to be many false positives for articles from the Simple English Wikipedia.
Examples:
- https://simple.wikipedia.org/w/index.php?title=Magnetic_pendulum&oldid=5152730
- https://simple.wikipedia.org/w/index.php?title=Northern_Territory&oldid=4898225
Both articles predate GPT, so they cannot be AI-generated, yet the system is 99% sure that they are. I found more examples. It appears that simplified English is classified as AI output with a relatively high probability.