SRE topic
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
gecho
Gecho - an HTTP request echo debugging service
command-line-cheat-sheet
📝 A place to quickly look up commands (bash, vim, git, AWS, Docker, Terraform, Ansible, kubectl)
awesome-sre
A curated list of Site Reliability and Production Engineering resources.
DevOps-README.md
What to Read to Learn More About DevOps
microservice-production-readiness-checklist
Principles that help you deploy safely to the production environment.
devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
howtheysre
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
chaostoolkit
Chaos Engineering Toolkit & Orchestration for Developers
chaos-ssm-documents
Collection of AWS SSM Documents to perform Chaos Engineering experiments