PurpleLlama
CyberSecEval page high/low value inconsistency
Many thanks for making this feature available. It's a great help.
I wanted to let you know that your Hugging Face page, CyberSecEval: Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models (LLMs), has an apparent inconsistency in how high and low values are presented.
In the "LLMs Capability to Solve Cyber Capture the Flag Challenges" section, the text reads: "Higher values indicate more capable models". However, the table shows higher values in red and lower values in blue, which makes it unclear whether high values are good or bad.