[Ongoing] Knowledge base additions
- ~executive order on AI~
- ~NIST 800-30 rev1 https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-30r1.pdf~
- ~IEEE 1012 (199X or 2016) https://people.eecs.ku.edu/~hossein/Teaching/Stds/1012.pdf~
- ~https://www.uspto.gov/sites/default/files/documents/USPTO_AI-Report_2020-10-07.pdf~
- ~https://www.commerce.gov/issues/intellectual-property (see if you think it misses the mark, it might b/c I don't see an AI focus)~
- ~https://standards.ieee.org/ieee/3119/10729/~
- ~https://www.frontiermodelforum.org/uploads/2023/10/FMF-AI-Red-Teaming.pdf~
- ~https://github.com/openai/openai-cookbook/tree/main~
- ~https://resources.oreilly.com/examples/0636920415947/-/blob/master/Attack_Cheat_Sheet.png <- community resources~
All added. Waiting on the EO. Decided to go ahead and add the "Intellectual property" page because I could still imagine it being a useful resource/portal (especially since the USPTO falls under it, and it contains a specific resource we link to).
- ~https://www.imda.gov.sg/resources/press-releases-factsheets-and-speeches/press-releases/2023/generative-ai-evaluation-sandbox <- GAI resources~
[ALL ADDED, 2/21/2024]
benchmarks:
- https://wavesbench.github.io/
- https://github.com/huggingface/evaluate
- https://github.com/AI-secure/DecodingTrust
- https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vQObeTxvXtOs--zd98qG2xBHHuTTJOyNISBJPthZFr3at2LCrs3rcv73d4of1A78JV2eLuxECFXJY43/pubhtml
- https://safetyprompts.com/

python software:
- https://github.com/lilacai/lilac

official guidance:
- https://www.ohchr.org/sites/default/files/documents/issues/business/b-tech/taxonomy-GenAI-Human-Rights-Harms.pdf

community resources:
- https://www.hackerone.com/vulnerability-and-security-testing-blog
- https://www.synack.com/wp-content/uploads/2022/09/Crowdsourced-Security-Landscape-Government.pdf

CSET stuff (just double-check we reference somehow):
- https://cset.georgetown.edu/article/translating-ai-risk-management-into-practice/
- https://cset.georgetown.edu/publication/repurposing-the-wheel/
- https://cset.georgetown.edu/publication/adding-structure-to-ai-harm/
- https://cset.georgetown.edu/article/understanding-ai-harms-an-overview/
- https://cset.georgetown.edu/publication/ai-incident-collection-an-observational-study-of-the-great-ai-experiment/

https://www.scsp.ai/wp-content/uploads/2023/11/SCSP_JHU-HCAI-Framework-Nov-6.pdf
https://openai.com/research/building-an-early-warning-system-for-llm-aided-biological-threat-creation
https://c2pa.org/
https://aiverifyfoundation.sg/downloads/Cataloguing_LLM_Evaluations.pdf
https://partnershiponai.org/modeldeployment/
https://cdn.openai.com/openai-preparedness-framework-beta.pdf
https://dominiquesheltonleipzig.com/country-legislation-frameworks/
red-teaming section:
- https://www.hackerone.com/thought-leadership/ai-safety-red-teaming
- https://cset.georgetown.edu/article/what-does-ai-red-teaming-actually-mean/
Red-teaming papers -- but do we want to start hosting papers?
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (2024). Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks. https://arxiv.org/pdf/2402.04249.pdf
- Red-Teaming for Generative AI: Silver Bullet or Security Theater? Michael Feffer, Anusha Sinha, Zachary C. Lipton, Hoda Heidari. https://arxiv.org/pdf/2401.15897.pdf
- Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models. Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang. https://arxiv.org/pdf/2310.00322.pdf
- Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment (2023). https://arxiv.org/pdf/2308.09662.pdf
- Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases. Rishabh Bhardwaj, Soujanya Poria. https://arxiv.org/pdf/2310.14303.pdf
GAI Critiques:
- reasoning gap: https://arxiv.org/pdf/2402.19450.pdf
- stealing language models: https://arxiv.org/pdf/2403.06634.pdf
- dialect prejudice: https://arxiv.org/pdf/2403.00742.pdf