An Open-Source Collection of 230+ Flash Cards to Help You Succeed in Your System Design Interview and More 💯

Overview

Design Deck is an open-source collection of 230+ system design flash cards.

It helps you prepare for and succeed in your system design interview.

The topics covered are the following:

  • Cache: eviction, locations, strategies, when to use a cache, etc.
  • Cloud: main cloud components
  • Database: ACID, CAP, partitioning, consistency, isolation, denormalization, etc.
  • Design: general topics such as idempotence, bloom filter, causality, asynchronous communications, vector clocks
  • HTTP: main HTTP knowledge
  • Kafka: main Kafka building blocks
  • Math: discrete math
  • Network: TCP, CORS, ping & heartbeat, OSI, etc.
  • Reliability: reliability patterns
  • Security: main security knowledge such as TLS, cipher, encryption

Anki Deck

Anki is free software (Windows/Mac/Linux/iPhone/Android) that makes remembering things easy. It uses spaced repetition, a proven technique for increasing the rate of memorization:

Spaced Repetition: The most powerful study technique on YouTube

The single biggest change that Anki brings about is that it means memory is no longer a haphazard event, to be left to chance. Rather, it guarantees I will remember something, with minimal effort. That is, Anki makes memory a choice.

Michael A. Nielsen, "Augmenting Long-term Memory"
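
To make the idea of spaced repetition concrete, here is a minimal sketch of a review scheduler. It is not Anki's actual scheduling algorithm: the `Card` class, the `review` function, and the "double on success, reset on failure" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """A flash card with a review interval in days (hypothetical model)."""
    front: str
    interval_days: int = 1


def review(card: Card, remembered: bool) -> Card:
    """Return a copy of the card with its next review interval.

    Assumed rule, for illustration only: double the interval after a
    successful recall, reset it to one day after a failed one.
    """
    next_interval = card.interval_days * 2 if remembered else 1
    return Card(card.front, next_interval)


if __name__ == "__main__":
    card = Card("CAP theorem")
    # Three successful recalls followed by one failure.
    for outcome in (True, True, True, False):
        card = review(card, outcome)
        print(f"{card.front}: next review in {card.interval_days} day(s)")
```

Under this assumed rule, cards you recall drift toward longer gaps between reviews, while cards you forget come back the next day, so study time concentrates on the weakest cards.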

Using Anki is a great way to prepare for your system design interview. Here is a flashcard example:

The Anki version (a clone of the 200+ flashcards from this repo) is available via a one-time GitHub sponsorship tier for $21: click ❤️ Sponsor, open the One-time tab, and select the "Access to the latest Anki deck version of Design Deck" tier.

Cards Index

Cache

  • Cache aside
  • Cache aside vs. read-through
  • Cache eviction policy
  • Cache locations
  • Cache: refresh-ahead
  • Cache: write through vs. write back
  • Four main distributed cache benefits
  • Main metric for cache
  • Read-through cache
  • When to use a cache

Cloud

  • CDN

DB

  • 3 main reasons to partition data
  • ACID property
  • Anti-entropy
  • Byzantine fault-tolerant
  • CALM theorem
  • CAP theorem
  • Caveat of serializability
  • Chain replication
  • Chain replication vs. consensus
  • Change data capture (CDC)
  • Concurrency control
  • Consensus
  • Consistency models
  • CQRS
  • CRDT
  • CRDT and collaborative applications (e.g., Google Docs)
  • DB indexes tradeoff
  • DB internal components
  • DB: read vs. write-heavy, latency vs. consistency, availability vs. consistency, ACID vs. non-ACID
  • Delta CRDTs
  • Denormalization
  • Design consideration when partitioning data
  • Downside of distributed transactions
  • Event sourcing
  • Eventual consistency requirements
  • Examples of solutions offering leader election abstractions
  • Federation
  • Fencing token
  • Gossip protocol
  • Graph DB main use case
  • Hinted handoff
  • Hot spot in partitioning
  • In a database, strategy to handle rebalancing
  • Isolation levels
  • Known CRDTs
  • Last-write-wins (LWW)
  • Leader election
  • LSM tree
  • LSM tree vs. B-tree
  • Main difference between consistency models and isolation levels
  • Merkle tree
  • Monotonic reads consistency implementation
  • MVCC
  • N+1 select problem
  • NoSQL: main types and main architecture principles
  • Operation-based CRDTs
  • Operational transformation (OT): concept and main drawback
  • Optimistic concurrency control: pros and cons
  • PACELC theorem
  • Partitioning (sharding)
  • Partitioning criteria
  • Partitioning methods
  • Quorum
  • Raft
  • Read repair
  • Relation between replication factor, write consistency and read consistency
  • Replication vs. partition: impacts
  • Schema-on-read vs. schema-on-write
  • Serializability
  • Serializable Snapshot Isolation (SSI)
  • Single-leader, multi-leader, leaderless replication
  • Sloppy quorum
  • Snapshot Isolation (SI)
  • Snapshot Isolation common implementation
  • SSTable
  • State-based CRDTs: definition and requirements
  • Strong eventual consistency: definition and requirements
  • Three-phase commit (3PC)
  • Transaction
  • Two main approaches to partition a table that has secondary indexes
  • Two types of CRDTs
  • Two-phase commit (2PC)
  • WAL
  • When relational vs. when document
  • When to use a column-oriented store
  • Why DB schemaless is misleading
  • Why is in-memory faster
  • Write and read amplification
  • Write heavy and replication type

Design

  • Auditing
  • Backward vs. forward compatibility
  • Bloom filter
  • Causality
  • Concurrent operations
  • Consistent hashing
  • Design impacts of sharing
  • Design: read-heavy vs. write-heavy impacts
  • Different types of message failure
  • Event log vs. message queue
  • Exactly-once delivery
  • FLP impossibility
  • Geohashing
  • Hashing definition and size of MD5 and SHA256
  • HDFS
  • How to reduce sharing
  • HyperLogLog
  • Idempotent
  • Latency numbers every programmer should know
  • Lease
  • Least loaded endpoint load balancing strategy
  • Liveness property
  • Load balancing
  • Load shedding
  • Locality
  • Log
  • Log compaction
  • Main drawback of shared-nothing architectures
  • MapReduce
  • Microservices: pros and cons
  • Number of values to generate to reach 50% chances of collision: 32-bit, 64-bit, and 128-bit hash
  • Orchestration vs. choreography
  • Outbox pattern
  • Perfect hashing
  • Quadtree
  • Rate-limiting (throttling): definition and algos
  • Rebalancing
  • REST
  • REST vs. gRPC
  • Safety property
  • Saga
  • Scalability
  • Scalability ceiling
  • Shared-nothing architectures
  • Source of truth
  • Split-brain
  • Throughput
  • Total vs. partial order
  • UUID
  • Validation vs. verification
  • Vector clock
  • Why asynchronous communication

HTTP

  • 301 vs. 302
  • 403 or 404?
  • Cookie
  • Four main HTTP/2 features
  • HLS
  • HTTP
  • HTTP cache-control header
  • HTTP Etag
  • HTTP keep-alive
  • HTTP methods: safeness and idempotence
  • HTTP safe method
  • HTTP status code 429
  • HTTP status codes
  • What happens if you type google.com in your browser

Kafka

  • Consumer types
  • Durability/availability and latency/throughput tradeoffs
  • Log compaction
  • Offset
  • Partition
  • Partition distribution
  • Rebalancing
  • Segment
  • Shared subscription

Math

  • Associative property
  • Commutative property
  • Harmonic mean

Network

  • ARP protocol
  • Average connection speed in USA
  • Backpressure
  • Bandwidth
  • BGP
  • CORS
  • Difference ping & heartbeat
  • Difference TCP & UDP
  • Difference view & materialized view
  • DNS
  • DNS lookup: push or pull
  • Health checks: passive vs. active
  • Internet model
  • Layer 4 vs. layer 7 load balancer
  • MAC address
  • Max size of a TCP packet
  • MQTT LWT
  • NTP
  • OSI model
  • Routers
  • Routers buffering
  • Routers processing
  • Routing table
  • Service mesh
  • Switch
  • TCP congestion control
  • TCP connection backlog
  • TCP flow control
  • TCP handshake
  • Websocket
  • Why can't we rely on the system clock in distributed systems?

Reliability

  • Bulkhead pattern
  • Cascading failure
  • Causal consistency implementation
  • Circuit breaker
  • Exponential backoff
  • Fault tolerance
  • Jitter
  • Knee point
  • Phi-accrual failure detector
  • Retry amplification

Security

  • Authentication
  • Certificate authorities
  • Cipher
  • Confidentiality
  • Integrity
  • Mutual TLS
  • OAuth 2
  • Public key infrastructure (PKI)
  • TLS handshake
  • Two main uses of encryption
  • Two types of encryption
  • What does digital signature provide
  • What does TLS provide?

Additional Notes

If you're preparing for an algorithm and data structure interview, take a look at Algo Deck.