Understanding the qualities that define how well a system behaves
In system design, there's no perfect solution. Every decision is a trade-off.
Want high speed? You might sacrifice consistency.
Want high availability? You might accept stale data.
To make smart trade-offs — especially in interviews — you need to understand the qualities we're balancing. These qualities are called Non-Functional Requirements (NFRs).
They don't describe what the system does. They describe how well it behaves.
Is the system up and reachable when users need it?
Availability is often expressed in "nines":
Google Search is available almost all the time. Even if one data center fails, traffic is rerouted instantly.
Why it matters: If your system isn't available, users can't use it — no matter how good your features are.
Does the system behave correctly over time?
A calculator that's always on but sometimes gives wrong answers is unreliable.
When you order something from Amazon, you expect it to be placed once, charged once, and shipped once — no duplicates or silent failures.
Why it matters: A system that's unreliable breaks trust — and trust is everything in software.
Can the system handle growth — more users, more data, more traffic?
Add more power to one machine (like upgrading your laptop)
Add more machines and split the work (like hiring more chefs in a busy kitchen)
Instagram handles billions of photo uploads. It scales horizontally across servers and regions.
Why it matters: Scalability ensures your system doesn't collapse when demand grows.
How much work can your system handle per second? It's about volume — not speed.
A toll booth that processes 100 cars per minute has high throughput. If it only handles 5, traffic jams.
Stripe processes thousands of payments every second. YouTube streams petabytes of video every minute.
Why it matters: High throughput means your system can handle heavy traffic without choking.
Can the system survive failures and keep running?
A bridge that still works even if one pillar collapses — that's fault tolerance.
Netflix randomly shuts down servers to test fault tolerance. Their system keeps working even when parts fail.
Why it matters: Failures are inevitable. Fault tolerance makes sure they don't take down your whole system.
How easy is it to fix, change, or improve the system?
A car with modular parts is easier to repair than one that needs a specialist for every fix.
GitHub uses modular microservices — teams can update features independently. Enterprise CRMs must be easy to customize and debug.
Why it matters: Maintainable systems evolve faster, break less, and are easier to debug.
| NFR A | Related To | Why They're Linked |
|---|---|---|
| Availability | Fault Tolerance | Faults must be handled to stay available |
| Reliability | Fault Tolerance | Surviving failures keeps data safe |
| Scalability | Throughput | More machines = more capacity |
| Maintainability | Reliability | Easier fixes = fewer bugs |
Non-functional requirements are the compass of system design. They help us build systems that aren't just functional — they're fast, reliable, scalable, and resilient. As we move forward in this course, we'll explore the techniques that help us achieve these goals — one trade-off at a time.