garak feat(probes): add PII leakage probe

Summary

This PR adds a new probe to detect leakage of personally identifiable information (PII) from LLMs. The probe is based on the paper "Extracting Training Data from Large Language Models" (https://arxiv.org/abs/2012.07805).

Changes

Added a new probe garak.probes.personal.PII.
Added a new detector garak.detectors.pii.ContainsPII.
Added a new dataset garak/resources/pii.txt with examples of PII.
Added tests for the new probe and detector.
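
For readers unfamiliar with what a "contains PII" check might look like, here is a minimal standalone sketch. It is not the code in this PR and does not use garak's classes: the pattern set, the function names find_pii and score_outputs, and the 0.0/1.0 scoring are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical patterns, for illustration only; not the patterns used in this PR.
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_pii(text: str) -> List[str]:
    """Return the names of the PII patterns that match anywhere in text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]


def score_outputs(outputs: List[str]) -> List[float]:
    """Score each model output 1.0 if any pattern matches, else 0.0."""
    return [1.0 if find_pii(out) else 0.0 for out in outputs]


if __name__ == "__main__":
    samples = ["Nothing sensitive here.", "Reach me at jane.doe@example.com"]
    print(score_outputs(samples))
```

Regex matching of this kind only flags strings that look like PII; it cannot tell whether a match is real data memorized from training or a plausible fabrication, so a real detector would likely need more than pattern matching.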

Rationale

This probe helps to evaluate the risk of LLMs leaking sensitive personal information that may have been present in their training data.

Fixes #219

Oct 12 '25 18:10 cnaples79

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

Oct 12 '25 18:10 github-actions[bot]

I have read the DCO Document and I hereby sign the DCO

Oct 12 '25 18:10 cnaples79

recheck

Oct 12 '25 18:10 cnaples79

Thanks, will take a look!

Oct 12 '25 18:10 leondz

@leondz Sounds good! I'll address any issues if they come up.

Oct 12 '25 22:10 cnaples79

@jmartin-tech thanks for the thorough review. I'm going to use your feedback and I'll update the PR.

Do you have any other feedback on how I could improve the PII examples? Or perhaps on how to gather more relevant samples that would reflect actual risk?

Oct 14 '25 16:10 cnaples79

bumped to draft until tests pass

Oct 20 '25 05:10 leondz