awesome-llm-security

Add Cybersecurity AI (CAI) paper and tool

Open vmayoral opened this issue 11 months ago • 0 comments

Quick notes:

  • repo: github.com/aliasrobotics/CAI
  • paper: https://www.themoonlight.io/paper/6ce44cc4-ed7e-4e47-8350-9c13f0df9c77
  • abstract:

By 2028 most cybersecurity actions will be autonomous, with humans teleoperating. We present the first classification of autonomy levels in cybersecurity and introduce Cybersecurity AI (CAI), an open-source framework that democratizes advanced security testing through specialized AI agents. Through rigorous empirical evaluation, we demonstrate that CAI consistently outperforms state-of-the-art results in CTF benchmarks, solving challenges across diverse categories with significantly greater efficiency: up to 3,600× faster than humans in specific tasks and averaging 11× faster overall. CAI achieved first place among AI teams and secured a top-20 position worldwide in the "AI vs Human" CTF live Challenge, earning a monetary reward of $750. Based on our results, we argue against LLM-vendor claims about limited security capabilities. Beyond cybersecurity competitions, CAI demonstrates real-world effectiveness, reaching top-30 in Spain and top-500 worldwide on Hack The Box within a week, while dramatically reducing security testing costs by an average of 156×. Our framework transcends theoretical benchmarks by enabling non-professionals to discover significant security bugs (CVSS 4.3-7.5) at rates comparable to experts during bug bounty exercises. By combining modular agent design with seamless tool integration and human oversight (HITL), CAI offers organizations of all sizes access to AI-powered bug bounty testing previously available only to well-resourced firms, thereby challenging the oligopolistic ecosystem currently dominated by major bug bounty platforms.

vmayoral, May 11 '25 06:05