** NOTE: April 2023
As of April 2023, there is a lot of new interest in the field of AI Alignment. However, this repo has been unmaintained since I gave up hope of our solving alignment in time as a species, almost three years ago.
[[https://www.aisafetysupport.org/resources/lots-of-links][AI Safety Support]] is probably one of the more definitive resource lists right now.
I will, however, accept PRs on this repo.
-----
* Awesome Artificial Intelligence Alignment [[https://github.com/sindresorhus/awesome][https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg]]
Welcome to Awesome AI Alignment, a curated list of awesome resources for getting into and staying in touch with research in AI Alignment.
AI Alignment is also known as AI Safety, Beneficial AI, Human-aligned AI, Friendly AI, etc.
If you are a newcomer to this field, start with the Crash Course below.
Pull requests are welcome.
** Table of Contents
:PROPERTIES:
:TOC: this
:END:
- [[#awesome-artificial-intelligence-alignment][Awesome Artificial Intelligence Alignment]]
  - [[#a-crash-course-for-a-popular-audience][A Crash Course for a Popular Audience]]
    - [[#watch-these-two-ted-talks][Watch These Two TED Talks]]
    - [[#read-these-blogposts-by-tim-urban][Read These Blogposts by Tim Urban]]
    - [[#read-more-about-real-research-on-ai-safety][Read More about Real Research on AI Safety]]
  - [[#books][Books]]
  - [[#courses][Courses]]
  - [[#papers][Papers]]
  - [[#research-agendas][Research Agendas]]
  - [[#literature-reviews][Literature Reviews]]
  - [[#technical-papers][Technical Papers]]
    - [[#agent-foundations][Agent Foundations]]
    - [[#machine-learning][Machine Learning]]
  - [[#frameworks-environments][Frameworks/ Environments]]
  - [[#talks][Talks]]
    - [[#popular][Popular]]
    - [[#technical][Technical]]
  - [[#blogposts][Blogposts]]
  - [[#communities-forums][Communities/ Forums]]
  - [[#institutes-research-groups][Institutes/ Research Groups]]
    - [[#technical-research][Technical Research]]
    - [[#policy-and-strategy-research][Policy and Strategy Research]]
  - [[#podcasts][Podcasts]]
    - [[#episodes-in-popular-podcasts][Episodes in Popular Podcasts]]
    - [[#dedicated-podcasts][Dedicated Podcasts]]
  - [[#events][Events]]
  - [[#newsletters][Newsletters]]
  - [[#other-lists-like-this][Other Lists Like This]]
** A Crash Course for a Popular Audience
*** Watch These Two TED Talks
- [[https://www.youtube.com/watch?v=8nt3edWLgIg][Can we build AI without losing control over it?]] - Sam Harris
- [[https://www.youtube.com/watch?v=MnT1xgZgkpk&t=1s][What happens when our computers get smarter than we are?]] - Nick Bostrom
*** Read These Blogposts by Tim Urban
- WaitButWhy on AI Safety: [[https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html][Part 1]] and [[https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html][Part 2]]
- [[http://lukemuehlhauser.com/a-reply-to-wait-but-why-on-machine-superintelligence/][A Reply by Luke Muehlhauser correcting a few things]]
*** Read More about Real Research on AI Safety
- [[https://80000hours.org/career-reviews/artificial-intelligence-risk-research/][80000hours.org Profile on AI Safety Research]]
** Books
- [[https://en.wikipedia.org/wiki/Superintelligence%3A_Paths%2C_Dangers%2C_Strategies][Superintelligence: Paths, Dangers, Strategies]] by Nick Bostrom
- [[https://en.wikipedia.org/wiki/Life_3.0][Life 3.0]] by Max Tegmark
- [[https://www.goodreads.com/book/show/39947993-artificial-intelligence-safety-and-security?ac=1&from_search=true][Artificial Intelligence Safety and Security]] by Roman Yampolskiy (Editor)
** Courses
- [[http://inst.eecs.berkeley.edu/~cs294-149/fa18/][CS 294-149: Safety and Control for Artificial General Intelligence (Fall 2018)]] by Andrew Critch and Stuart Russell [UC Berkeley]
- [[https://dorsa.fyi/cs521/][CS 521: Seminar on AI Safety]] by Dorsa Sadigh [Stanford]
** Research Agendas
- [[https://ai-alignment.com/iterated-distillation-and-amplification-157debfd1616][Paul Christiano's Agenda]] summarised by Ajeya Cotra
- [[https://agentfoundations.org/item?id=1816][The Learning-Theoretic AI Alignment Research Agenda]] by Vadim Kosoy
- [[https://intelligence.org/files/AlignmentMachineLearning.pdf][MIRI Machine Learning Agenda]] by Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, and Andrew Critch
- [[https://intelligence.org/files/TechnicalAgenda.pdf][MIRI Agent Foundations Agenda]] by Nate Soares and Benya Fallenstein
- [[https://arxiv.org/abs/1606.06565][Concrete Problems in AI Safety]] by Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané
- [[https://arxiv.org/pdf/1811.07871.pdf][DeepMind Scalable Agent Alignment Agenda]] by Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, and Shane Legg
- [[https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/][MIRI 2018 Research Directions]] by Nate Soares
- [[https://arxiv.org/abs/1811.03493][Integrative Biological Simulation, Neuropsychology, and AI Safety]] by Gopal P. Sarma, Adam Safron, and Nick J. Hay
- [[https://www.fhi.ox.ac.uk/reframing/][Reframing Superintelligence: Comprehensive AI Services as General Intelligence]] by Eric Drexler
** Literature Reviews
- [[https://arxiv.org/abs/1805.01109][AGI Safety Literature Review]] by Tom Everitt, Gary Lea, Marcus Hutter
- [[https://www.tomeveritt.se/papers/2018-thesis.pdf][Towards Safe Artificial General Intelligence]] - Tom Everitt's PhD Thesis
- [[https://www.lesswrong.com/posts/a72owS5hz3acBK5xc/2018-ai-alignment-literature-review-and-charity-comparison][2018 AI Alignment Literature Review and Charity Comparison]] by Larks
- [[https://futureoflife.org/ai-policy/][FLI AI Policy Resources]]
- [[https://futureoflife.org/2019/04/11/an-overview-of-technical-ai-alignment-with-rohin-shah-part-1/][An Overview of Technical AI Alignment with Rohin Shah]]
- [[https://docs.google.com/document/d/1FbTuRvC4TFWzGYerTKpBU7FJlyvjeOvVYF2uYNFSlOc/edit][AI Alignment Research Overview]] by Jacob Steinhardt
** Technical Papers
*** Agent Foundations
*** Machine Learning
** Frameworks/ Environments
- [[https://github.com/JohannesHeidecke/irl-benchmark][IRL-Benchmark]]
- [[https://github.com/deepmind/ai-safety-gridworlds][AI-Safety Gridworlds]]
** Talks
*** Popular
- [[https://www.youtube.com/watch?v=8nt3edWLgIg][Can we build AI without losing control over it?]] - Sam Harris (2016)
- [[https://www.youtube.com/watch?v=MnT1xgZgkpk&t=1s][What happens when our computers get smarter than we are?]] - Nick Bostrom (2014)
- [[https://www.youtube.com/watch?v=EBK-a94IFHY&t=940s][3 principles for creating safer AI]] - Stuart Russell (2017)
- [[https://www.youtube.com/watch?v=2LRwvU6gEbA][How to get empowered, not overpowered, by AI]] - Max Tegmark (2018)
*** Technical
- [[https://www.youtube.com/watch?v=EUjc1WuyPT8][Eliezer Yudkowsky – AI Alignment: Why It's Hard, and Where to Start]] (2016)
** Blogposts
- [[https://thinkingwires.com/posts/2017-07-05-risks.html][Risks of Artificial Intelligence]] by Johannes Heidecke
- [[https://www.alignmentforum.org/posts/i3BTagvt3HbPMx6PN/embedded-agency-full-text-version][Embedded Agency]] by Scott Garrabrant and Abram Demski
- [[https://www.alignmentforum.org/s/4dHMdK5TLN6xcqtyc][Value Learning Sequence]] by Rohin Shah et al.
** Communities/ Forums
- [[https://www.alignmentforum.org/][Alignment Forum]]
- [[http://aisafety.camp/][RAISE - Road to AI Safety Excellence]]
- [[https://aisafety.com/reading-group/][AI Safety Reading Group]]
** Institutes/ Research Groups
*** Technical Research
- [[http://futureoflife.org/][Future of Life Institute]]
- [[https://www.fhi.ox.ac.uk/][Future of Humanity Institute]]
- [[https://intelligence.org/][Machine Intelligence Research Institute]]
- [[https://ought.org/][Ought]]
- [[https://openai.com/][OpenAI]]
- [[https://medium.com/@deepmindsafetyresearch][DeepMind Safety Team]]
- [[https://humancompatible.ai/][Center for Human-Compatible AI]]
*** Policy and Strategy Research
** Podcasts
*** Episodes in Popular Podcasts
- [[https://twimlai.com/twiml-talk-181-anticipating-superintelligence-with-nick-bostrom/][Nick Bostrom on This Week in Machine Learning & AI]]
- [[https://samharris.org/podcasts/116-ai-racing-toward-brink/][Eliezer Yudkowsky on Waking Up With Sam Harris]]
- [[https://samharris.org/podcasts/the-dawn-of-artificial-intelligence1/][Stuart Russell on Waking Up With Sam Harris]]
*** Dedicated Podcasts
- AI Alignment Podcast by Lucas Perry [Future of Life Institute]
- 80,000 Hours Podcast by Rob Wiblin
** Events
- [[https://aisafetycamp.com][AI Safety Research Camp]]
- [[http://humanaligned.ai/][Human Aligned AI Summer School]]
** Newsletters
- [[https://rohinshah.com/alignment-newsletter/][Alignment Newsletter]] by Rohin Shah
** Other Lists Like This
- [[https://vkrakovna.wordpress.com/ai-safety-resources/#communities][AI Safety Resources by Victoria Krakovna]]
- [[https://humancompatible.ai/bibliography][CHAI Bibliography]]
- [[https://80000hours.org/ai-safety-syllabus/][80000hours.org Syllabus for AI Safety]]