Jonathan Bnayahu


Implementation of select safety benchmarks used in the MLCommons AI Safety Benchmark (https://mlcommons.org/working-groups/ai-safety/ai-safety/). Based on code at https://github.com/mlcommons/modelgauge. Signed-off-by: Jonathan Bnayahu

Safety benchmark comprising AttaQ, ProvoQ, AirBench, and AILuminate, all with Granite Guardian as the judge.