script: qualitative review output
Standalone script that takes a report file as a CLI param and performs standard analysis of failing probe/detector scores, taking tier 1 & tier 2 policies into account, and dumping out a sample of failing & passing inputs & outputs.
Verification

- [ ] `python -m garak.analyze.qual_review garak.xxx.report.jsonl > xxx.qualitative.tsv`
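As a rough illustration of the kind of pass/fail tally such a script performs over a report, here is a minimal sketch. The field names (`entry_type`, `probe_classname`, `detector_results`) mirror garak's attempt records but are assumptions here, not a spec, and the scoring convention (low detector score = pass) is likewise illustrative:

```python
import json
from collections import defaultdict

def passrates(lines, threshold=0.5):
    """Tally per probe/detector pass rates from report JSONL lines.

    A detector score below `threshold` is treated as a pass (no hit)."""
    passes = defaultdict(int)
    totals = defaultdict(int)
    for line in lines:
        entry = json.loads(line)
        if entry.get("entry_type") != "attempt":  # skip config/eval entries
            continue
        probe = entry["probe_classname"]
        for detector, scores in entry.get("detector_results", {}).items():
            for score in scores:
                totals[(probe, detector)] += 1
                if score < threshold:
                    passes[(probe, detector)] += 1
    return {k: passes[k] / totals[k] for k in totals}

# Tiny synthetic example: one attempt, one detector, one pass and one hit
sample = [
    json.dumps({"entry_type": "attempt",
                "probe_classname": "dan.DanInTheWild",
                "detector_results": {"mitigation.MitigationBypass": [0.0, 1.0]}}),
]
print(passrates(sample))  # → {('dan.DanInTheWild', 'mitigation.MitigationBypass'): 0.5}
```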
A general thought here: would there be value in adding the tier as metadata on the probes, with a default in the base class? This could then be used as a filtering value from the plugin cache.
Yeah, this update has made my to-do list a few times but never been actioned. I think this script is currently the only thing that would consume it, but as reporting improves and tooling around it becomes more diverse, this change will have more impact.
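The suggestion above might look something like the following sketch. All names here (`Tier`, `Probe`, `UNLISTED`, `probes_at_tier`) are hypothetical illustrations, not garak's actual API:

```python
from enum import IntEnum

class Tier(IntEnum):
    """Hypothetical tier enum; values are illustrative only."""
    TIER_1 = 1
    TIER_2 = 2
    UNLISTED = 9  # fallback for probes that haven't been assigned a tier

class Probe:
    # base-class default, so every probe carries tier metadata;
    # subclasses override it where a tier has been assigned
    tier: Tier = Tier.UNLISTED

class DanInTheWild(Probe):
    tier = Tier.TIER_1

def probes_at_tier(probes, tier):
    """Filter a plugin listing by tier, as a plugin-cache lookup might."""
    return [p for p in probes if p.tier == tier]

print([p.__name__ for p in probes_at_tier([Probe, DanInTheWild], Tier.TIER_1)])
# → ['DanInTheWild']
```

With the default on the base class, filtering never has to special-case probes that predate the metadata: they simply fall into the unlisted bucket.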
noted a condition where not all probe results are reported
Is the absolute threshold at Tier 1 too easy? I feel all three of these are actually not so great and are worthy of documenting:
```
ansiescape.AnsiRaw_ansiescape.Raw within bounds (passrate: 0.6552 z: None)
dan.DanInTheWild_mitigation.MitigationBypass within bounds (passrate: 0.5060 z: 0.9993991519699175)
latentinjection.LatentInjectionReport_base.TriggerListDetector within bounds (passrate: 0.5550 z: 0.3263044689694181)
```
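To make the concern concrete: a purely relative (z-score) bound will pass any result near the fleet average, even at a middling absolute pass rate. A two-part gate that also enforces an absolute floor would flag all three of the cases above. This is a sketch only; the thresholds (0.7 floor, -1.0 z lower bound) are hypothetical, not garak's actual policy:

```python
def within_bounds(passrate, z, min_passrate=0.7, min_z=-1.0):
    """Hypothetical two-part gate: require both a minimum absolute pass
    rate and a z-score above a lower bound. z may be None when no
    calibration data exists, in which case only the absolute test applies."""
    if passrate < min_passrate:
        return False
    if z is not None and z < min_z:
        return False
    return True

# The three quoted results all fail a 0.7 absolute floor,
# despite their z-scores being unremarkable:
for pr, z in [(0.6552, None), (0.5060, 0.9994), (0.5550, 0.3263)]:
    print(pr, within_bounds(pr, z))  # → all False
```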
tagging @erickgalinkin for extra input
Do not merge until tier implementation is settled or held
NB: currently #1152 should land /before/ this so that tier inheritance works appropriately in latentinjection, which has an important effect on qual_review behaviour.