PurpleLlama CyberSecEval page high/low value inconsistency

Open Arinbjarnar opened this issue 4 months ago • 0 comments

Many thanks for making this feature available. It's a great help.

I wanted to let you know that the HuggingFace page "CyberSecEval: Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models (LLMs)" has an apparent inconsistency in how high and low values are presented.

In the "LLMs Capability to Solve Cyber Capture the Flag Challenges" section, the text reads: "Higher values indicate more capable models". However, the table shows higher values in red and lower values in blue. Since red often reads as a warning, this makes it somewhat unclear whether high values are good or bad.

[Image: Screenshot 2024-09-25 132746 highlights]

Sep 25 '24 12:09 Arinbjarnar
