
** NOTE: April 2023

As of April 2023, there is a lot of new interest in the field of AI Alignment. However, this repo has been unmaintained since I gave up hope of our solving alignment in time as a species, almost three years ago.

[[https://www.aisafetysupport.org/resources/lots-of-links][AI Safety Support]] is probably one of the definitive resources right now.

I will, however, accept PRs on this repo.

* Awesome Artificial Intelligence Alignment [[https://github.com/sindresorhus/awesome][https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg]]

Welcome to Awesome AI Alignment - a curated list of awesome resources for getting into and staying in touch with research in AI Alignment.

AI Alignment is also known as AI Safety, Beneficial AI, Human-aligned AI, Friendly AI, etc.

If you are a newcomer to this field, start with the Crash Course below.

Pull requests are welcome.

** Table of Contents
:PROPERTIES:
:TOC: this
:END:

  • [[#awesome-artificial-intelligence-alignment][Awesome Artificial Intelligence Alignment]]
    • [[#a-crash-course-for-a-popular-audience][A Crash Course for a Popular Audience]]
      • [[#watch-these-two-ted-talks][Watch These Two TED Talks]]
      • [[#read-these-blogposts-by-tim-urban][Read These Blogposts by Tim Urban]]
      • [[#read-more-about-real-research-on-ai-safety][Read More about Real Research on AI Safety]]
    • [[#books][Books]]
    • [[#courses][Courses]]
    • [[#papers][Papers]]
      • [[#research-agendas][Research Agendas]]
      • [[#literature-reviews][Literature Reviews]]
      • [[#technical-papers][Technical Papers]]
        • [[#agent-foundations][Agent Foundations]]
        • [[#machine-learning][Machine Learning]]
    • [[#frameworks-environments][Frameworks/ Environments]]
    • [[#talks][Talks]]
      • [[#popular][Popular]]
      • [[#technical][Technical]]
    • [[#blogposts][Blogposts]]
    • [[#communities-forums][Communities/ Forums]]
    • [[#institutes-research-groups][Institutes/ Research Groups]]
      • [[#technical-research][Technical Research]]
      • [[#policy-and-strategy-research][Policy and Strategy Research]]
    • [[#podcasts][Podcasts]]
      • [[#episodes-in-popular-podcasts][Episodes in Popular Podcasts]]
      • [[#dedicated-podcasts][Dedicated Podcasts]]
    • [[#events][Events]]
    • [[#newsletters][Newsletters]]
    • [[#other-lists-like-this][Other Lists Like This]]

** A Crash Course for a Popular Audience

*** Watch These Two TED Talks

  • [[https://www.youtube.com/watch?v=8nt3edWLgIg][Can we build AI without losing control over it?]] - Sam Harris
  • [[https://www.youtube.com/watch?v=MnT1xgZgkpk&t=1s][What happens when our computers get smarter than we are?]] - Nick Bostrom

*** Read These Blogposts by Tim Urban

  • WaitButWhy on AI Safety: [[https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html][Part 1]] and [[https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html][Part 2]]
  • [[http://lukemuehlhauser.com/a-reply-to-wait-but-why-on-machine-superintelligence/][A Reply by Luke Muehlhauser correcting a few things]]

*** Read More about Real Research on AI Safety

  • [[https://80000hours.org/career-reviews/artificial-intelligence-risk-research/][80000hours.org Profile on AI Safety Research]]

** Books

  • [[https://en.wikipedia.org/wiki/Superintelligence%3A_Paths%2C_Dangers%2C_Strategies][Superintelligence: Paths, Dangers, Strategies]] by Nick Bostrom
  • [[https://en.wikipedia.org/wiki/Life_3.0][Life 3.0]] by Max Tegmark
  • [[https://www.goodreads.com/book/show/39947993-artificial-intelligence-safety-and-security?ac=1&from_search=true][Artificial Intelligence Safety and Security]] by Roman Yampolskiy (Editor)

** Courses

  • [[http://inst.eecs.berkeley.edu/~cs294-149/fa18/][CS 294-149: Safety and Control for Artificial General Intelligence (Fall 2018)]] by Andrew Critch and Stuart Russell [UC Berkeley]
  • [[https://dorsa.fyi/cs521/][CS 521: Seminar on AI Safety]] by Dorsa Sadigh [Stanford]

** Research Agendas

    • [[https://ai-alignment.com/iterated-distillation-and-amplification-157debfd1616][Paul Christiano's Agenda]] summarised by Ajeya Cotra
    • [[https://agentfoundations.org/item?id=1816][The Learning-Theoretic AI Alignment Research Agenda]] by Vadim Kosoy
    • [[https://intelligence.org/files/AlignmentMachineLearning.pdf][MIRI Machine Learning Agenda]] by Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, and Andrew Critch
    • [[https://intelligence.org/files/TechnicalAgenda.pdf][MIRI Agent Foundations Agenda]] by Nate Soares and Benya Fallenstein
    • [[https://arxiv.org/abs/1606.06565][Concrete Problems in AI Safety]] by Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané
    • [[https://arxiv.org/pdf/1811.07871.pdf][DeepMind Scalable Agent Alignment Agenda]] by Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, and Shane Legg
    • [[https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/][MIRI 2018 Research Directions]] by Nate Soares
    • [[https://arxiv.org/abs/1811.03493][Integrative Biological Simulation, Neuropsychology, and AI Safety]] by Gopal P. Sarma, Adam Safron, and Nick J. Hay
    • [[https://www.fhi.ox.ac.uk/reframing/][Reframing Superintelligence: Comprehensive AI Services as General Intelligence]] by Eric Drexler

** Literature Reviews

    • [[https://arxiv.org/abs/1805.01109][AGI Safety Literature Review]] by Tom Everitt, Gary Lea, Marcus Hutter
    • [[https://www.tomeveritt.se/papers/2018-thesis.pdf][Towards Safe Artificial General Intelligence]] - Tom Everitt's PhD Thesis
    • [[https://www.lesswrong.com/posts/a72owS5hz3acBK5xc/2018-ai-alignment-literature-review-and-charity-comparison][2018 AI Alignment Literature Review and Charity Comparison]] by Larks
    • [[https://futureoflife.org/ai-policy/][FLI AI Policy Resources]]
    • [[https://futureoflife.org/2019/04/11/an-overview-of-technical-ai-alignment-with-rohin-shah-part-1/][An Overview of Technical AI Alignment with Rohin Shah]]
    • [[https://docs.google.com/document/d/1FbTuRvC4TFWzGYerTKpBU7FJlyvjeOvVYF2uYNFSlOc/edit][AI Alignment Research Overview]] by Jacob Steinhardt

** Technical Papers

*** Agent Foundations

*** Machine Learning

** Frameworks/ Environments

  • [[https://github.com/JohannesHeidecke/irl-benchmark][IRL-Benchmark]]
  • [[https://github.com/deepmind/ai-safety-gridworlds][AI-Safety Gridworlds]] - see the minimal usage sketch below
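
To give a feel for what working with these environments looks like, here is a minimal sketch of a random-action episode in AI Safety Gridworlds. The ~factory.get_environment_obj~ helper, the ~'safe_interruptibility'~ environment name, and the dm_env-style ~reset~/~step~/~action_spec~ interface are assumptions based on the repository's layout; treat this as a sketch and check the repo's README for the current API.

#+BEGIN_SRC python
# Minimal sketch of a random-action episode in an AI Safety Gridworlds
# environment. Assumes the package (and its pycolab dependency) is installed
# from https://github.com/deepmind/ai-safety-gridworlds. The helper module,
# environment name, and dm_env-style interface below are assumptions based
# on the repository layout and may need adjusting.
import random

from ai_safety_gridworlds.helpers import factory  # assumed helper module

# Build one of the published gridworlds by name (assumed identifier).
env = factory.get_environment_obj('safe_interruptibility')

spec = env.action_spec()  # assumed bounded integer action spec
timestep = env.reset()
episode_return = 0.0

while not timestep.last():
    # Sample a random discrete action; a real agent would choose one here.
    action = random.randint(int(spec.minimum), int(spec.maximum))
    timestep = env.step(action)
    if timestep.reward is not None:
        episode_return += timestep.reward

# The gridworlds distinguish the visible reward from a hidden safety
# performance measure; get_overall_performance() (assumed name) exposes the
# latter once an episode has ended.
print('Visible episode return:', episode_return)
print('Hidden safety performance:', env.get_overall_performance())
#+END_SRC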

** Talks

*** Popular

  • [[https://www.youtube.com/watch?v=8nt3edWLgIg][Can we build AI without losing control over it?]] - Sam Harris (2016)
  • [[https://www.youtube.com/watch?v=MnT1xgZgkpk&t=1s][What happens when our computers get smarter than we are?]] - Nick Bostrom (2014)
  • [[https://www.youtube.com/watch?v=EBK-a94IFHY&t=940s][3 principles for creating safer AI]] - Stuart Russell (2017)
  • [[https://www.youtube.com/watch?v=2LRwvU6gEbA][How to get empowered, not overpowered, by AI]] - Max Tegmark (2018)

*** Technical

  • [[https://www.youtube.com/watch?v=EUjc1WuyPT8][Eliezer Yudkowsky – AI Alignment: Why It's Hard, and Where to Start]] (2016)

** Blogposts

  • [[https://thinkingwires.com/posts/2017-07-05-risks.html][Risks of Artificial Intelligence]] by Johannes Heidecke
  • [[https://www.alignmentforum.org/posts/i3BTagvt3HbPMx6PN/embedded-agency-full-text-version][Embedded Agency]] by Scott Garrabrant and Abram Demski
  • [[https://www.alignmentforum.org/s/4dHMdK5TLN6xcqtyc][Value Learning Sequence]] by Rohin Shah et al.

** Communities/ Forums

  • [[https://www.alignmentforum.org/][Alignment Forum]]
  • [[http://aisafety.camp/][RAISE - Road to AI Safety Excellence]]
  • [[https://aisafety.com/reading-group/][AI Safety Reading Group]]

** Institutes/ Research Groups

*** Technical Research

  • [[http://futureoflife.org/][Future of Life Institute]]
  • [[https://www.fhi.ox.ac.uk/][Future of Humanity Institute]]
  • [[https://intelligence.org/][Machine Intelligence Research Institute]]
  • [[https://ought.org/][Ought]]
  • [[https://openai.com/][OpenAI]]
  • [[https://medium.com/@deepmindsafetyresearch][DeepMind Safety Team]]
  • [[https://humancompatible.ai/][Center for Human-Compatible AI]]

*** Policy and Strategy Research

** Podcasts

*** Episodes in Popular Podcasts

  • [[https://twimlai.com/twiml-talk-181-anticipating-superintelligence-with-nick-bostrom/][Nick Bostrom on This Week in Machine Learning & AI]]
  • [[https://samharris.org/podcasts/116-ai-racing-toward-brink/][Eliezer Yudkowsky on Waking Up With Sam Harris]]
  • [[https://samharris.org/podcasts/the-dawn-of-artificial-intelligence1/][Stuart Russell on Waking Up With Sam Harris]]

*** Dedicated Podcasts

  • AI Alignment Podcast by Lucas Perry [Future of Life Institute]
  • 80000hours Podcast by Rob Wiblin

** Events

  • [[https://aisafetycamp.com][AI Safety Research Camp]]
  • [[http://humanaligned.ai/][Human Aligned AI Summer School]]

** Newsletters

  • [[https://rohinshah.com/alignment-newsletter/][Alignment Newsletter]] by Rohin Shah

** Other Lists Like This

  • [[https://vkrakovna.wordpress.com/ai-safety-resources/#communities][AI Safety Resources by Victoria Krakovna]]
  • [[https://humancompatible.ai/bibliography][CHAI Bibliography]]
  • [[https://80000hours.org/ai-safety-syllabus/][80000hours.org Syllabus for AI Safety]]