Jonathan Bnayahu


Implementation of select safety benchmarks used in the MLCommons AI Safety Benchmark (https://mlcommons.org/working-groups/ai-safety/ai-safety/). Based on code at https://github.com/mlcommons/modelgauge. Signed-off-by: Jonathan Bnayahu

Safety benchmark comprising AttaQ, ProvoQ, AirBench, and AILuminate, all with Granite Guardian as the judge.