The Feynman Guide to System Design Interviews — José María Cabeza Rodríguez

For many software engineers, the System Design Interview (SDI) is the ultimate panic room. You are given a blank whiteboard, a marker, and an incredibly vague, open-ended prompt: "Design YouTube," "Design WhatsApp," or "Design a global rate limiter."

Unlike coding interviews, there is no single "correct" answer, no passing unit tests, and no compiler to tell you when you are wrong. It is a pure test of your ability to handle ambiguity, structure your thoughts, and make engineering tradeoffs under pressure.

But why does it feel so chaotic? Because we often try to solve everything at once.

Let's use the Richard Feynman Technique—breaking down complex ideas into simple, everyday analogies—to demystify the system design interview process and understand how to navigate it step by step.

1. The Panic Room: Why System Design Interviews Terrify Us

In a standard coding interview, you are a builder with a set of bricks, trying to build a specific archway. In a system design interview, you are a city planner standing before a vast, empty plot of land. The interviewer says, "We need a transit network that handles millions of commuters. Go."

Without a map, most engineers do what comes naturally: they start digging. They immediately begin arguing about whether to use PostgreSQL or Cassandra, whether to use Kafka or RabbitMQ, or how to design a DB schema, before they even know what they are building, who is using it, or how much traffic it needs to support.

This is the equivalent of designing a subway tunnel before deciding if the city needs trains, buses, or ferries.

[Figure: The SDI Whiteboard Chaos. Entering the panic room: without a systematic map, the whiteboard quickly turns into a chaotic mess of servers, databases, and wires.]

To survive this, you need a compass. You need a structured, repeatable way to translate a vague prompt into a production-grade system design. That's where the four-step framework comes in.

2. The GPS Analogy: A 4-Step Framework

Think of a system design interview like using a GPS navigation system to drive to a destination. If you just start driving without setting the destination, you'll end up lost in the wilderness. You need a structured route:

[Figure: The GPS Route Map. A structured route: the 4-step framework acts as a GPS guide, moving systematically from destination definition to detailed tuning.]

Step 1: Establish the Destination (Understand the Problem & Design Scope)

Before planning any route, your GPS needs to know exactly where you are going. You cannot build a system unless you know its boundaries. In the interview, this means asking clarifying questions:

Who is using this system?
What are the core features? (e.g., for WhatsApp: "Do we need group chats, or just 1-on-1?")
What is the scale? (e.g., "How many daily active users? How many requests per second?")

Step 2: Plan the High-Level Route (High-Level Design & Buy-In)

Your GPS shows you the big picture: "Take Highway 10 for 50 miles, then transition to Interstate 5." You aren't focusing on individual traffic lights or lane changes yet. In system design, you draw a box diagram with key components: the clients, the DNS, the load balancer, the web servers, and the database. You get the interviewer's agreement on this blueprint before moving forward.

Step 3: Zoom In on the Intersections (Deep Dive)

Now you zoom in on the tricky turns and complicated interchanges. If the GPS detects heavy traffic, you need to route around it. In the interview, you and the interviewer pick the most critical bottleneck, such as how to partition the database, how to design a custom ID generator (like Twitter Snowflake), or how the chat server maintains persistent real-time connections (for example, WebSockets, which run over TCP).
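To make the ID generator deep dive concrete, here is a minimal sketch of a Snowflake-style generator. It assumes the commonly described 64-bit layout (41 bits of timestamp, 10 bits of machine ID, 12 bits of sequence). The class name, the epoch constant, and the clock handling are illustrative assumptions, not Twitter's actual implementation.

```python
import threading
import time

# Assumed layout: 41 bits of milliseconds since a custom epoch,
# then 10 bits of machine id, then 12 bits of per-millisecond sequence.
EPOCH_MS = 1_288_834_974_657  # assumed custom epoch; any fixed past timestamp works
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Hypothetical generator: IDs are unique per machine and sortable by time."""

    def __init__(self, machine_id: int) -> None:
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id does not fit in 10 bits")
        self.machine_id = machine_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    @staticmethod
    def _now_ms() -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self.lock:
            now_ms = self._now_ms()
            if now_ms < self.last_ms:
                # Clock moved backwards: keep using the last timestamp we issued.
                now_ms = self.last_ms
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # 4096 IDs already issued in this millisecond: wait for the next one.
                    while now_ms <= self.last_ms:
                        now_ms = self._now_ms()
            else:
                self.sequence = 0
            self.last_ms = now_ms
            timestamp_part = (now_ms - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS)
            machine_part = self.machine_id << SEQUENCE_BITS
            return timestamp_part | machine_part | self.sequence


generator = SnowflakeLikeGenerator(machine_id=7)
print(generator.next_id())
```

The tradeoff worth saying out loud in the interview: no coordination between machines is needed, but every machine must have a unique machine ID and a reasonably well-behaved clock.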

Step 4: Arrive and Refine (Wrap Up)

You've arrived at the destination. Now you look back: "Was that the most efficient route? What if there was an accident? How do we handle future detours?" In the interview, you discuss bottlenecks, edge cases, system monitoring, logging, and how to scale from 1 million to 10 million users.

3. The Doctor's Visit: Requirements & Estimation

Imagine walking into a doctor's office with a sore throat, and the doctor immediately hands you a bottle of pills and says, "Take these twice a day, see you next year." No examination. No questions. No diagnosis.

You would immediately walk out. Yet, this is exactly what engineers do when they hear "Design a URL Shortener" and immediately reply, "Okay, we'll use a Redis cache, a Cassandra DB cluster, and MD5 hashing."

You must act like a doctor. You must diagnose the problem before prescribing the architecture.

[Figure: The Doctor's Diagnosis Analogy. The Doctor's Visit: asking the right questions, diagnosing the scale, and listing the constraints before prescribing the tech stack.]

Functional vs. Non-Functional Requirements

A doctor separates symptoms (what is wrong) from physical constraints (what the patient's body can handle). In system design:

Functional Requirements (The Symptoms): What the system must do. For a URL shortener: "Generate a short URL from a long one," "Redirect users to the original URL," and "Allow links to expire."
Non-Functional Requirements (The Constraints): How well the system must perform. "High availability (99.9% uptime)," "Low latency redirects (under 100ms)," and "Data durability (shortened links must not be lost)."
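A non-functional requirement like "99.9% uptime" becomes easier to reason about once you turn it into a downtime budget. The sketch below does that arithmetic. The function name and the extra 99% and 99.99% targets are illustrative assumptions, and it uses a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_budget_hours(availability_percent: float) -> float:
    """Hours per year the system may be down while still meeting the target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    print(f"{target}% uptime allows {downtime_budget_hours(target):.2f} hours of downtime per year")
```

At 99.9% the budget is roughly 8.76 hours per year, which is a useful number to have ready when the interviewer asks how much redundancy the design really needs.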

Back-of-the-Envelope Estimation

Once you have the requirements, you run quick math to understand the scale:

If you have 100 million daily active users (DAU), and each user writes 1 post per day, that's $100,000,000 / 86,400 \approx 1,160$ writes per second.
If each post is 100 bytes, you need $100,000,000 \times 100 \text{ bytes} = 10 \text{ GB}$ of storage per day.
Understanding these numbers tells you whether your database can run on a single machine (which can easily handle 10 GB/day) or if you need a distributed storage system.
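The same estimate can be written as a few lines of Python, which makes it easy to change an input and see the effect. The numbers are the ones used above; the variable names and the one-year projection are added for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 100_000_000
posts_per_user_per_day = 1
bytes_per_post = 100

writes_per_day = daily_active_users * posts_per_user_per_day
writes_per_second = writes_per_day / SECONDS_PER_DAY
storage_per_day_gb = writes_per_day * bytes_per_post / 1e9  # decimal gigabytes
storage_per_year_tb = storage_per_day_gb * 365 / 1000  # illustrative projection

print(f"Average writes per second: {writes_per_second:,.0f}")
print(f"Storage per day: {storage_per_day_gb:.1f} GB")
print(f"Storage per year: {storage_per_year_tb:.2f} TB")
```

The exact average is about 1,157 writes per second, which rounds to the 1,160 quoted above. In an interview the order of magnitude matters more than the last digit.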

4. The City Blueprint: High-Level Design

Once you understand the requirements, you draw the City Blueprint. You lay out the main districts and how they connect.

[Figure: City Master Plan Blueprint. The City Blueprint: mapping out the key functional zones of your application and connecting them with clear protocols.]

Every modern system design relies on a standard set of building blocks:

The City Gate (Load Balancer): Distributes incoming traffic evenly across multiple servers to prevent any single server from becoming overwhelmed.
The Warehouse (Databases): Relational databases (like PostgreSQL) for structured data with strict relationships, or Non-Relational databases (like Cassandra/MongoDB) for unstructured, high-write data.
The Convenience Store (Cache): Stores frequently accessed data (like popular user profiles) in memory (using Redis or Memcached) to serve reads instantly and relieve stress on the database.
The Local Distribution Center (CDN): Serves static files (images, videos, HTML) from servers geographically close to the users, cutting down latency.
The Post Office (Message Queues): Decouples services by sending messages asynchronously (using Kafka or RabbitMQ) so that if one service goes down, the rest of the system can keep running.
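To show how two of these blocks interact, here is a minimal sketch of a cache sitting in front of a database, using the cache-aside read pattern. An in-memory dictionary stands in for Redis or Memcached, and `load_from_db` is a hypothetical callable standing in for a real query. The class name and TTL value are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAsideReader:
    """Check the cache first; on a miss, load from the database and remember the result."""

    def __init__(
        self,
        load_from_db: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: go to the database, then populate the cache.
        self.misses += 1
        value = self._load_from_db(key)
        self._cache[key] = (value, now + self._ttl)
        return value


def slow_profile_lookup(user_id: str) -> str:
    # Hypothetical stand-in for a database query.
    return f"profile-of-{user_id}"


reader = CacheAsideReader(slow_profile_lookup, ttl_seconds=30.0)
reader.get("user-42")  # miss: loads from the database
reader.get("user-42")  # hit: served from memory
print(reader.hits, reader.misses)
```

The TTL is the tradeoff knob here: a longer TTL relieves more load from the database but serves stale data for longer.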

5. The X-Ray Machine: Deep Dive & Tradeoffs

Once the high-level design is approved, it's time to turn on the X-Ray machine. You pick the most critical organ of your design and look inside. This is where you demonstrate your core engineering depth.

And in system design, every decision is a tradeoff. There is no "perfect" technology; there is only the right tool for the current constraints.

[Figure: X-Ray Machine Tradeoffs. The X-Ray Deep Dive: peeling back the layers to analyze internal bottlenecks and evaluate architectural tradeoffs.]

Here are the four most common tradeoffs you'll navigate:

1. Consistency vs. Availability (The CAP Theorem)

Consistency (CP): When a network partition occurs, every read receives the most recent write or an error. (e.g., bank transaction systems where balances must be accurate.)
Availability (AP): When a network partition occurs, every request still receives a non-error response, without the guarantee that it contains the most recent write. (e.g., social media feeds where it's fine if a post takes a few seconds to show up to friends.)
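A toy model can make the difference tangible. In the sketch below, a replica loses its link to the primary and misses a write. A CP replica then refuses to answer, while an AP replica answers with the stale value. The class, the exception, and the values are all hypothetical; real databases implement this with far more machinery.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = "balance=100"
        self.partitioned = False

    def replicate(self, new_value: str) -> bool:
        """Apply a write coming from the primary; it is lost if the link is cut."""
        if self.partitioned:
            return False
        self.value = new_value
        return True

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Cannot prove this is the latest write, so fail instead of guessing.
            raise Unavailable("replica cannot confirm the latest write")
        # AP (or a healthy network): answer with whatever we have.
        return self.value


def read_during_partition(mode: str) -> str:
    replica = Replica(mode)
    replica.partitioned = True
    replica.replicate("balance=40")  # this write never reaches the replica
    try:
        return replica.read()
    except Unavailable:
        return "error"


print("CP replica:", read_during_partition("CP"))
print("AP replica:", read_during_partition("AP"))
```

The CP replica returns an error, and the AP replica returns the old balance. Neither is wrong; they answer different requirements.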

2. SQL vs. NoSQL

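SQL (Relational): Well suited to ACID transactions and complex JOIN operations. Harder to scale horizontally, because sharding a relational database usually means giving up cross-shard JOINs and transactions or handling them in the application.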
NoSQL (Non-Relational): Highly scalable horizontally, great for high-volume writes and flexible schemas. Harder to perform complex queries and multi-record transactions.
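Here is a small sketch of what "ACID transactions" buys you, using Python's built-in `sqlite3` module as a stand-in for a relational database. The table, account names, and the `transfer` helper are illustrative assumptions. The point is that a failed transfer leaves no half-applied update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money atomically: both updates apply, or neither does."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was rolled back too.
        return False


def balances(db: sqlite3.Connection) -> dict:
    return dict(db.execute("SELECT id, balance FROM accounts ORDER BY id"))


print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

In the second call the credit to bob runs first and succeeds, but the debit from alice violates the constraint, so the whole transaction is rolled back and both balances stay where they were.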

3. Horizontal vs. Vertical Scaling

Vertical (Scale Up): Adding more CPU/RAM to a single server. Simple to manage but has a hard physical limit and creates a Single Point of Failure (SPOF).
Horizontal (Scale Out): Adding more machines to the pool. Removes the single-machine ceiling and improves availability, but introduces network complexity, data consistency issues, and routing overhead.
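The routing overhead of scaling out is easy to demonstrate. The sketch below routes keys to shards with a simple hash-modulo scheme, then counts how many keys change shards when a fifth machine joins a pool of four. The function name and key format are assumptions for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard using a stable hash (the same result on every machine)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = [f"user:{i}" for i in range(10_000)]

# Grow the pool from 4 to 5 shards and count the keys that now live somewhere else.
moved = sum(shard_for(key, 4) != shard_for(key, 5) for key in keys)
print(f"{moved / len(keys):.0%} of keys moved to a different shard")
```

With plain modulo routing, most keys move whenever the pool size changes, which means a lot of data has to be copied between machines. This is the usual motivation for discussing consistent hashing in the deep dive.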

4. Synchronous vs. Asynchronous Communication

Synchronous (HTTP/gRPC): Client waits for a response before continuing. Simple, but blocks resources and couples services together.
Asynchronous (Queues/Events): Client sends a message and continues immediately. Highly resilient and decoupled, but introduces eventual consistency and debugging complexity.
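The difference shows up clearly in a small sketch. Below, a hypothetical `send_email` function stands in for a slow downstream service. The synchronous signup waits for it, while the asynchronous signup drops a message on an in-process queue (standing in for Kafka or RabbitMQ) and returns at once. All names here are illustrative assumptions.

```python
import queue
import threading
import time


def send_email(user: str) -> str:
    """Hypothetical slow downstream call."""
    time.sleep(0.05)
    return f"sent:{user}"


def signup_sync(user: str) -> str:
    # The caller is blocked until the email service responds.
    return send_email(user)


jobs: "queue.Queue[str]" = queue.Queue()
results = []


def worker() -> None:
    # Background consumer: processes messages whenever they arrive.
    while True:
        user = jobs.get()
        results.append(send_email(user))
        jobs.task_done()


def signup_async(user: str) -> str:
    # The caller only enqueues a message and moves on.
    jobs.put(user)
    return "accepted"


threading.Thread(target=worker, daemon=True).start()

print(signup_sync("ana"))    # returns only after the email is sent
print(signup_async("luis"))  # returns before the email is sent
jobs.join()                  # wait for the worker, only so the demo can show the result
print(results)
```

Notice what the asynchronous caller gives up: "accepted" means the work is queued, not done, which is the eventual consistency mentioned above.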

The Verdict

System design interviews are not a trivia contest about who knows the most buzzwords. They are an interactive session where you show how you think, how you collaborate, and how you evaluate tradeoffs.

The next time you face a system design problem:

Stop and listen to the requirements (be a doctor).
Set the destination before you drive (use the GPS framework).
Draw the blueprint before building (map the city districts).
Examine the tradeoffs under the X-Ray machine (there are no perfect systems).

By walking through these steps systematically, you turn a terrifying panic room into a structured, manageable exercise in architectural engineering.

References & Further Reading

This guide synthesizes core methodologies and patterns from the defining literature on system design interviews:

System Design Interview – An Insider's Guide (Volume 1 & 2) by Alex Xu and Sahn Lam.
Acing the System Design Interview by Zhiyong Tan.
AlgoMaster.io System Design Interview Handbook by Ashish Pratap Singh.
Designing Data-Intensive Applications by Martin Kleppmann.
The C4 Model for Visualising Software Architecture by Simon Brown.