[CI] Add AI-generated content detection workflow
To streamline the contributor workflow and reduce time spent on time-intensive reviews, this issue proposes adding a CI workflow that flags PRs containing likely AI-generated content before they reach full review.
See this example from Hugo's repository: https://github.com/gohugoio/hugo/pull/14205#issuecomment-3563875679
Looking at Hugo's repo, the aiwatchdog.yml workflow uses the bep/ai-watchdog action, which analyzes the PR content (description plus changed files) and sends it to OpenAI models to detect potential AI-generated content.
Once it has calculated an overall confidence score, it posts a comment on the PR with its findings. We can also configure whether to apply a label flagging the PR as AI-generated, and optionally fail the workflow when the confidence score is high.
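For reference, a workflow along these lines might look roughly like the sketch below. This is only an illustration of the shape of such a workflow, not the action's actual API: the input names (openai_api_key, apply_label, fail_threshold) and the version tag are assumptions here, so anyone implementing this should check Hugo's aiwatchdog.yml and the bep/ai-watchdog README for the real inputs.

```yaml
# Hypothetical sketch only -- input names and version tag below are
# illustrative assumptions, not the documented bep/ai-watchdog API.
name: AI Watchdog
on:
  pull_request_target:
    types: [opened, edited, synchronize]
permissions:
  pull-requests: write   # needed to post the findings comment / apply a label
jobs:
  aiwatchdog:
    runs-on: ubuntu-latest
    steps:
      - uses: bep/ai-watchdog@v1            # version tag is an assumption
        with:
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}  # illustrative input name
          apply_label: true                  # illustrative: label PRs flagged as AI-generated
          fail_threshold: 0.8                # illustrative: fail the job above this confidence
```

One design note: using pull_request_target (rather than pull_request) is the usual pattern when a workflow needs write access to comment on PRs from forks, but it requires care because it runs with the base repository's permissions.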
/cc @open-telemetry/docs-approvers
Thanks for opening the issue @vitorvasc. I'm hoping that we can find the right balance, whether we use an AI watchdog or opt for something simpler.
But... I'll admit that I'd love to see the AI watchdog's assessment of some of the recent PRs we've been getting.
It feels like fighting fire with fire to be honest, but yeah, this is probably what we need to do...
If we institute a threshold policy like Severin suggested in Slack, I think the combination of an AI watchdog and the policy would make it much easier for me personally to push back on low-effort PRs.