PapersAdda

Top 30 System Design Interview Questions for 2026


Last Updated: March 2026 | Level: Mid-Level to Senior | Read Time: ~25 min

System design interviews test your ability to design scalable, reliable, and maintainable systems. These 30 questions cover the most commonly asked design problems at FAANG, unicorns, and top product companies. System design is built on strong fundamentals — make sure you've mastered Data Structures Interview Questions 2026 and Networking Interview Questions 2026 before diving deep into architecture problems.


How to Approach System Design Questions

STAR-D Framework:

  1. Scope: Clarify requirements (functional & non-functional), estimate scale
  2. Top-Level Design: Draw high-level architecture (clients, load balancer, services, DB)
  3. API Design: Define core endpoints
  4. Refine: Deep-dive into critical components
  5. Discuss: Trade-offs, bottlenecks, failure scenarios

Key metrics to always estimate:

  • DAU (Daily Active Users), QPS (Queries Per Second)
  • Storage requirements (per item × total items × years)
  • Read:Write ratio
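These estimates are quick arithmetic. As a worked example with assumed numbers (10M DAU, 5 requests/user/day, ~1 KB of new data per user per day, 5-year retention), the calculation looks like this:

```python
SECONDS_PER_DAY = 86_400

dau = 10_000_000                      # assumed daily active users
requests_per_day = dau * 5            # assumed 5 requests per user per day
avg_qps = requests_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * 3                # rule of thumb: peak is roughly 2-3x average

bytes_per_user_per_day = 1_000        # assumed ~1 KB of new data per user per day
storage_bytes = dau * bytes_per_user_per_day * 365 * 5  # 5-year retention

print(f"avg {avg_qps:.0f} QPS, peak ~{peak_qps:.0f} QPS, "
      f"storage ~{storage_bytes / 1e12:.1f} TB")
# avg 579 QPS, peak ~1736 QPS, storage ~18.2 TB
```

State these numbers out loud in the interview; the point is the method, not the exact figures.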

Table of Contents

  1. Classic Design Problems (Q1–Q10)
  2. Data & Storage Systems (Q11–Q18)
  3. Infrastructure & Reliability (Q19–Q25)
  4. Advanced & Specialized (Q26–Q30)

Classic Design Problems

Q1. Design a URL Shortener (like bit.ly) Medium

Problem: Build a service that converts long URLs to short ones (e.g., bit.ly/abc123) and redirects users.

Key Components:

Client → Load Balancer → API Servers → Cache (Redis)
                                    ↓
                              URL Database (MySQL)
                                    ↓
                           ID Generator (Snowflake)

Core API:

POST /shorten    {url: "https://long-url.com/..."}  → {short_url: "bit.ly/abc123"}
GET  /{shortCode}                                    → 301/302 Redirect

Short Code Generation:

  • Base62 encoding: 62^7 = 3.5 trillion unique codes (7 chars → enough for years)
  • Counter-based: Use distributed ID generator (Snowflake) → encode to Base62
  • Hash-based: MD5 of URL → take first 7 chars → check collision in DB
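For the counter-based approach, Base62 encoding of the numeric ID is a few lines; a minimal sketch (alphabet ordering is a convention you should state, since it changes the output):

```python
import string

# digits first, then lowercase, then uppercase: one common Base62 convention
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n: int) -> str:
    """Encode a non-negative integer (e.g. a distributed-ID output) as Base62."""
    if n == 0:
        return BASE62[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

print(encode_base62(125))  # "21", since 125 = 2*62 + 1
```

Any ID below 62^7 encodes to at most 7 characters, which is where the 3.5 trillion figure comes from.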

Database Schema:

CREATE TABLE urls (
    id          BIGINT PRIMARY KEY,
    short_code  VARCHAR(8) NOT NULL,
    long_url    TEXT NOT NULL,
    created_at  TIMESTAMP,
    expires_at  TIMESTAMP,
    click_count BIGINT DEFAULT 0,
    UNIQUE KEY idx_short_code (short_code)
);

Scale Considerations:

  • 100M URLs created/day → ~1,157 writes/sec
  • 10B redirects/day → ~115,000 reads/sec (heavy read bias 100:1)
  • Cache top 20% URLs (80% of traffic) in Redis: ~20GB RAM
  • CDN for geographically distributed redirects

Trade-offs:

Decision       | Option A                              | Option B
---------------|---------------------------------------|--------------------------------------------
Redirect type  | 301 (cached by browser, no analytics) | 302 (always hits server, enables analytics)
Short code     | Sequential (predictable, easy)        | Random (secure, no enumeration)
Custom aliases | Simple DB uniqueness check            | May require bloom filter for scale
Expiration     | Lazy deletion on access               | Cron job for bulk cleanup

Q2. Design a Chat Application (like WhatsApp) Hard

Problem: Build a real-time messaging app supporting 1-on-1 and group chats with message history.

Key Components:

Mobile/Web Clients
       ↓
WebSocket Servers (persistent connections)
       ↓
Message Queue (Kafka) → Chat Service → Message DB (Cassandra)
                                     ↓
                              Presence Service → Redis
                                     ↓
                           Notification Service → APNs/FCM

Core APIs:

WebSocket: ws://chat.app/ws?user_id=123
  SEND:    {type:"message", to:456, content:"Hello"}
  RECEIVE: {type:"message", from:456, content:"Hi!"}

REST:
GET  /messages?conversation_id=789&before=timestamp  → [messages]
POST /conversations                                   → {conv_id}

Message Flow:

  1. User A sends message → WebSocket server A receives
  2. Kafka publishes message to messages topic
  3. Chat Service consumes → writes to Cassandra
  4. Chat Service checks if User B connected:
    • Yes → route to WebSocket server B
    • No → push notification via APNs/FCM

Why Cassandra for Messages?

  • Write-heavy (every message is a write)
  • Time-series access pattern (recent messages first)
  • No complex JOINs needed
  • Easy horizontal scaling

Database Schema (Cassandra):

CREATE TABLE messages_by_conversation (
    conversation_id UUID,
    created_at      TIMESTAMP,
    message_id      TIMEUUID,
    sender_id       UUID,
    content         TEXT,
    type            TEXT,
    status          TEXT,
    PRIMARY KEY (conversation_id, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC);

Scale Considerations:

  • WhatsApp: 100B messages/day → ~1.15M messages/sec
  • One WebSocket server handles ~65K concurrent connections
  • Need 100M/65K = ~1,540 WebSocket servers for 100M concurrent users
  • Message fanout for group chats: push to each member's queue

Trade-offs:

  • Message ordering: Lamport clocks vs server timestamps
  • End-to-end encryption: Signal Protocol (complex but secure)
  • Message storage: client-only vs server (backup vs storage cost)

Q3. Design Twitter's News Feed (Timeline) Hard

Problem: Users post tweets; followers see a personalized feed of recent tweets.

Two Approaches:

Fan-out on Write (Push):

Tweet Created → Fan-out Service → Write to each follower's feed cache
                                  (Redis sorted set per user)
User reads → Serve from their feed cache (fast!)

Fan-out on Read (Pull):

Tweet stored in tweet DB
User reads → Query all followed users' tweets → Merge/sort → Return feed

Hybrid (Used by Twitter):

  • Regular users (below a follower threshold, e.g. ~100K): Fan-out on write
  • Celebrities (above the threshold): Fan-out on read (too expensive to push to millions of feeds)
  • Mix feeds at read time

Architecture:

Tweet Service → Kafka → Fan-out Workers → Redis (User Feeds)
                                        ↓
                                Tweet Store (Cassandra)
                                        ↓
                                 User Graph Service (followers)

Feed Storage (Redis):

Key: feed:{user_id}
Type: Sorted Set (ZSET)
Score: timestamp
Value: tweet_id
Keep only latest 1,000 tweets per user
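The fan-out-on-write path can be simulated in plain Python; here a dict stands in for Redis, with the equivalent ZSET commands noted in comments (the follower graph is a hard-coded assumption):

```python
from collections import defaultdict

FEED_CAP = 1_000  # keep only the latest N tweet ids per user

feeds = defaultdict(list)           # user_id -> [(timestamp, tweet_id)], newest first
followers = {1: [2, 3]}             # author -> follower ids (assumed user graph)

def fan_out(author_id: int, tweet_id: int, ts: float) -> None:
    for follower in followers.get(author_id, []):
        feed = feeds[follower]
        feed.append((ts, tweet_id))
        feed.sort(reverse=True)      # ZADD keeps the set ordered by score
        del feed[FEED_CAP:]          # ZREMRANGEBYRANK: trim to the newest 1,000

def read_feed(user_id: int, count: int = 10) -> list[int]:
    # ZREVRANGE feed:{user_id} 0 count-1
    return [tweet_id for _, tweet_id in feeds[user_id][:count]]

fan_out(author_id=1, tweet_id=101, ts=1.0)
fan_out(author_id=1, tweet_id=102, ts=2.0)
print(read_feed(2))  # [102, 101]
```

Note that only tweet IDs live in the feed; the tweet bodies are fetched separately, matching the cache-invalidation trade-off below.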

Scale Estimates:

  • 300M MAU, 150M DAU
  • 500M tweets/day → ~5,800 tweets/sec
  • Average 200 followers → 5,800 × 200 = 1.16M feed writes/sec
  • Read: 150M users, each check 5 times/day = 8,700 feed reads/sec

Trade-offs:

  • Write amplification for push (celebrity with 50M followers → 50M writes per tweet)
  • Read latency for pure pull (merge many feeds at read time)
  • Cache invalidation: use tweet_ids in feed cache, fetch actual content on demand

Q4. Design a Rate Limiter Medium

Problem: Prevent abuse by limiting API calls per user/IP to N requests per time window.

Algorithms:

1. Token Bucket (most popular):

Each user has a bucket with max N tokens
Tokens refill at rate R per second
Each request consumes 1 token
If bucket empty → reject request (429 Too Many Requests)
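The token bucket above maps directly to a small class; a single-process sketch (a production limiter would keep this state in Redis, and the parameter values here are illustrative):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: capacity N, refill rate R tokens/sec."""
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond 429 Too Many Requests

bucket = TokenBucket(capacity=3, refill_rate=1.0)
print([bucket.allow() for _ in range(4)])  # [True, True, True, False]
```

Because the bucket can hold up to `capacity` tokens, short bursts are allowed, which is exactly the "Allows bursts" entry in the comparison table below.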

2. Sliding Window Counter:

Track request timestamps in Redis for last 60 seconds
Count requests in window → if > limit, reject
Uses sorted sets: ZADD + ZCOUNT + ZREMRANGEBYSCORE

3. Fixed Window Counter:

Simple counter per user per minute
Resets at minute boundary
Problem: burst attacks at window boundary (double limit)

Implementation (Redis-based):

-- Redis Lua script (atomic); invoked as:
-- EVAL <script> 1 rate_limit:{user_id}:{minute} <limit>
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local current = redis.call("INCR", key)
if current == 1 then
    redis.call("EXPIRE", key, 60)
end
if current > limit then
    return 0  -- reject
end
return 1  -- allow

Architecture:

API Gateway → Rate Limiter Middleware → Redis Cluster
                                     ↓
                              Response Headers:
                              X-RateLimit-Limit: 100
                              X-RateLimit-Remaining: 73
                              X-RateLimit-Reset: 1709901600

Scale Considerations:

  • Use Redis Cluster for horizontal scaling
  • Store rate limit rules in a centralized config (easy to update without deploys)
  • Different limits per tier: free (100/hr), pro (10,000/hr), enterprise (unlimited)

Trade-offs:

Algorithm              | Memory   | Accuracy | Burst Handling
-----------------------|----------|----------|----------------------
Token Bucket           | Low      | Medium   | Allows bursts
Fixed Window           | Very Low | Low      | Window boundary burst
Sliding Window Log     | High     | High     | Smooth
Sliding Window Counter | Low      | Good     | Good

Q5. Design a File Storage System (like Dropbox/Google Drive) Hard

Problem: Users can upload, download, sync, and share files across devices.

Architecture:

Client App (Dropbox) → API Servers → Metadata DB (MySQL)
                     ↓
              Upload Service → Chunk Splitter → Object Storage (S3)
                                              ↓
                                        CDN (CloudFront)

Key Design Decisions:

Chunking (Critical Optimization):

File → Split into 4MB chunks → Upload each chunk → Combine on server
Benefits:
- Resume interrupted uploads (only re-upload failed chunks)
- Deduplication: same chunk across different files → stored once (content-addressed)
- Delta sync: only changed chunks uploaded on edit (saves bandwidth)
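Content-addressed chunking is the core of the dedup benefit above; a sketch using fixed-size chunks and SHA-256 as the content key (a tiny chunk size is used here purely for demonstration, and the dict stands in for object storage):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, matching the split size described above

def chunk_file(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[tuple[str, bytes]]:
    """Split bytes into fixed-size chunks keyed by content hash.
    Identical chunks hash to the same key, so they are stored only once."""
    chunks = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        chunks.append((hashlib.sha256(chunk).hexdigest(), chunk))
    return chunks

store: dict[str, bytes] = {}  # stands in for S3, keyed by checksum
for checksum, chunk in chunk_file(b"A" * 10, chunk_size=4):
    store.setdefault(checksum, chunk)  # dedup: skip the upload if the key exists
print(len(store))  # 2 unique chunks: b"AAAA" (stored once despite appearing twice) and b"AA"
```

Delta sync falls out of the same scheme: on edit, only chunks whose hashes changed are re-uploaded.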

Metadata Schema:

files: (file_id, user_id, name, size, checksum, created_at, parent_folder_id)
chunks: (chunk_id, file_id, sequence, checksum, storage_path, size)
file_versions: (version_id, file_id, chunk_list, created_at)
shares: (share_id, file_id, shared_with_user_id, permission, expires_at)

Sync Mechanism:

Client maintains local state DB (SQLite)
On change → compute diff → upload changed chunks
Server sends event via long-polling/WebSocket → client syncs
Use vector clocks or "last write wins" for conflict resolution

Scale:

  • Dropbox: 700M registered users, 4B files, 1.2B photos
  • Storage: if average file = 2MB, 4B files = 8PB storage
  • Deduplication typically saves 40-60% storage

Trade-offs:

  • Chunk size: smaller = more metadata overhead; larger = less benefit for small files
  • Conflict resolution: last-write-wins vs manual merge (Google Docs style)
  • Encryption: client-side (zero-knowledge, Tresorit) vs server-side (Google Drive)

Q6. Design YouTube/Netflix (Video Streaming) Hard

Problem: Users upload videos; other users stream them with minimal buffering.

Architecture:

Upload Flow:
User → API Gateway → Upload Service → Raw Video Storage (S3)
                                    ↓
                          Transcoding Pipeline (FFmpeg workers)
                                    ↓
                    Multiple resolutions: 360p, 720p, 1080p, 4K
                                    ↓
                            CDN Edge Nodes (worldwide)

Stream Flow:
User → CDN (nearest edge) → Adaptive Bitrate Streaming (HLS/DASH)

Transcoding:

  • Split video into segments → transcode in parallel (reduce processing time)
  • Generate multiple resolutions + bitrates for adaptive streaming
  • Store as HLS (.m3u8 manifest + .ts segments) or DASH

Database:

Videos: video_id, user_id, title, description, status, duration
Video_metadata: thumbnail, tags, category, language
User_engagement: views, likes, comments, watch_time
Recommendations: ML-based, stored as precomputed lists

CDN Strategy:

  • Popular videos → pushed to all edge locations
  • Long-tail content → pulled to edge on first request
  • Geographic routing via DNS

Scale (YouTube):

  • 500 hours of video uploaded per minute
  • 1B hours watched per day
  • Need thousands of transcoding workers

Q7. Design a Search Autocomplete System Medium

Problem: As a user types, suggest completions in real-time (< 100ms).

Architecture:

User types → Debounced API call → Autocomplete Service
                                         ↓
                               Trie (in-memory) or
                               Prefix Index (Redis Sorted Set)

Trie Approach:

Store popular queries in a trie
Each node stores top-K (e.g., K=5) results by frequency
On prefix lookup: traverse to prefix node → return stored top-K

Build: batch job daily from search logs
Serve: load trie into memory on each autocomplete server
Scale: partition trie by first character
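The per-node top-K idea can be sketched in Python (K and the sample queries are illustrative; a real build job would ingest aggregated search logs offline):

```python
K = 5  # completions cached per prefix node

class TrieNode:
    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.top_k: list[tuple[int, str]] = []  # (frequency, query), sorted desc

def insert(root: TrieNode, query: str, freq: int) -> None:
    node = root
    for ch in query:
        node = node.children.setdefault(ch, TrieNode())
        node.top_k.append((freq, query))
        node.top_k.sort(reverse=True)
        del node.top_k[K:]  # keep only the K best at every prefix node

def autocomplete(root: TrieNode, prefix: str) -> list[str]:
    node = root
    for ch in prefix:                 # O(len(prefix)) lookup, no subtree walk
        if ch not in node.children:
            return []
        node = node.children[ch]
    return [query for _, query in node.top_k]

root = TrieNode()
insert(root, "python", 1500)
insert(root, "python tutorial", 1200)
print(autocomplete(root, "py"))  # ['python', 'python tutorial']
```

Precomputing top-K at every node trades memory for the sub-100ms lookup requirement: serving never has to walk a subtree.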

Redis Sorted Set Approach:

Key: autocomplete:prefix
Score: frequency
Member: full_query

ZADD autocomplete:py 1500 "python"
ZADD autocomplete:py 1200 "python tutorial"
ZREVRANGE autocomplete:py 0 4  → top 5 queries

On search: update counters via stream processing

Filtering:

  • Block offensive/banned terms
  • Personalize based on user history
  • A/B test different ranking signals

Trade-offs:

  • Trie: fast lookup, complex update, memory-intensive
  • Redis ZSET: easy to update, slightly slower, simpler ops
  • Elasticsearch: flexible but adds latency vs in-memory

Q8. Design an E-commerce Flash Sale System Hard

Problem: Handle 1M+ concurrent users trying to buy limited-quantity items (race conditions, overselling prevention).

Key Challenges:

  • Overselling: must not sell more than available stock
  • Performance: 1M users in seconds → DB bottleneck
  • Fairness: each user can buy at most once, and requests are honored in arrival order

Architecture:

Users → CDN (static assets) → Load Balancer → Flash Sale Service
                                                      ↓
                                              Redis (stock + queue)
                                                      ↓
                                              Order Service → DB
                                                      ↓
                                              Payment Service

Stock Management (Redis Lua — atomic):

-- Atomic stock check-and-decrement; invoked as:
-- EVAL <script> 1 item:stock:{item_id} <user_id> <item_id>
local stock = tonumber(redis.call('GET', KEYS[1]))
if not stock or stock <= 0 then
    return 0  -- sold out
end
redis.call('DECR', KEYS[1])
redis.call('RPUSH', 'purchase:queue', ARGV[1] .. ':' .. ARGV[2])
return 1  -- success

Queue-Based Approach:

User clicks "Buy" → Enter queue (Redis sorted set, score = timestamp)
Queue processor: dequeue users, create orders, process payments
Waiting users: show position in queue (poll endpoint)

Trade-offs:

  • Lua script atomicity vs distributed lock overhead
  • Fairness model: strict first-come-first-served queue vs randomized (lottery) selection
  • Pre-warming cache before sale begins

Q9. Design a Notification System Medium

Problem: Send push, email, SMS, and in-app notifications to millions of users at scale.

Architecture:

Event Sources → Notification Service → Priority Queue (Kafka)
                                             ↓
                              Channel Routers:
                              ├── Push: APNs / FCM
                              ├── Email: SendGrid / SES
                              ├── SMS: Twilio
                              └── In-App: WebSocket / DB
                                             ↓
                                    Delivery Tracker
                                    (retry, dedup, analytics)

Key Features:

  • Deduplication: use idempotency keys to prevent duplicate sends
  • Rate limiting: don't overwhelm users; respect quiet hours
  • Priority: OTP (critical) > transaction alerts (high) > marketing (low)
  • Retry with backoff: exponential retry for failed deliveries
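The retry-with-backoff bullet is commonly implemented as exponential backoff with full jitter; a sketch that just computes the delay schedule (parameter defaults are illustrative):

```python
import random

def backoff_schedule(base: float = 1.0, factor: float = 2.0,
                     max_delay: float = 60.0, attempts: int = 5) -> list[float]:
    """Exponential backoff with full jitter: each delay is drawn uniformly
    from [0, cap], where the cap doubles per attempt up to max_delay."""
    delays = []
    for attempt in range(attempts):
        cap = min(max_delay, base * factor ** attempt)
        delays.append(random.uniform(0, cap))
    return delays

# caps grow 1, 2, 4, 8, 16 seconds before jitter is applied
print([round(d, 2) for d in backoff_schedule()])
```

Jitter matters at this scale: without it, a provider outage causes every failed notification to retry at the same instant (a thundering herd).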

Database:

notifications: (id, user_id, type, title, body, data, sent_at, read_at, channel)
user_preferences: (user_id, channel, category, enabled, quiet_hours)
delivery_log: (notification_id, channel, status, attempt_count, last_attempt)

Q10. Design Uber's Ride Matching System Hard

Problem: Match riders with nearby drivers in real-time.

Architecture:

Driver App → Location Service → Redis Geo (store driver locations)
Rider App  → Matching Service → Query nearby drivers → Rank → Assign
                              ↓
                      Surge Pricing Service
                              ↓
                      Trip Service → DB

Location Storage:

Redis GEOADD drivers 72.8777 19.0760 driver_123
-- Find drivers within 5km of rider
GEORADIUS drivers 72.88 19.07 5 km WITHCOORD COUNT 20

Matching Algorithm:

  1. Find all available drivers within radius
  2. Rank by ETA (estimated time of arrival), driver rating, acceptance rate
  3. Offer trip to best match → wait for acceptance (timeout → next driver)
  4. Confirmed → create trip record, start tracking

Surge Pricing:

  • Grid city into hexagonal cells (H3 library)
  • Track demand/supply ratio per cell
  • Price multiplier = f(demand/supply)
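As a toy version of f(demand/supply), a clamped linear curve per cell (the bounds and linear shape are assumptions for illustration; real pricing curves are proprietary):

```python
def surge_multiplier(demand: int, supply: int,
                     min_mult: float = 1.0, max_mult: float = 3.0) -> float:
    """Hypothetical pricing curve: multiplier tracks the demand/supply
    ratio in a cell, clamped to [min_mult, max_mult]."""
    if supply == 0:
        return max_mult  # no drivers available: cap the multiplier
    ratio = demand / supply
    return max(min_mult, min(max_mult, ratio))

print(surge_multiplier(demand=30, supply=20))  # 1.5
```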

Scale:

  • 5M trips/day → ~58 trips/sec
  • 5M active drivers updating location every 5s → ~1M location updates/sec
  • Redis Geo supports real-time geospatial queries efficiently

Data & Storage Systems

Q11. Design a Key-Value Store (like Redis) Hard

Components: In-memory hash table + disk persistence (AOF logs / RDB snapshots) + replication (primary-replica) + clustering (consistent hashing for sharding).

Trade-offs: Memory-only (fast, limited by RAM) vs disk-backed (slower, larger capacity) vs hybrid (hot data in memory, cold on disk — RocksDB approach).


Q12. Design a Distributed Cache Medium

Cache Eviction Policies: LRU (most common), LFU, TTL-based, Random.

Cache Invalidation Strategies:

  • TTL: expire after time (simple, may serve stale data)
  • Write-through: update cache on every write (consistent, write overhead)
  • Write-behind: async DB write (fast writes, risk of data loss)
  • Cache-aside: app manages cache manually (most flexible)

Distributed Considerations: Consistent hashing for sharding, replication for availability, hot-spot mitigation (virtual nodes).


Q13. Design a Distributed Message Queue (like Kafka) Hard

Core Concepts:

  • Topics: logical channels; Partitions: horizontal scaling unit; Consumer Groups: parallel processing
  • Offset: position within partition; consumers commit offsets
  • Retention: messages kept for N days (re-processable)

When to Use Kafka: Async communication between services, event sourcing, real-time analytics, decoupling producers from consumers.


Q14. Design a Time-Series Database Medium

Use Cases: IoT metrics, application monitoring (CPU, memory), stock prices.

Key Requirements: High write throughput, time-range queries, downsampling (aggregate old data to save space), TTL for auto-deletion.

Design: Partition by time (hourly/daily chunks), columnar storage for compression, pre-aggregated rollups (1min → 5min → 1hour).


Q15. Design a Search Engine (like Elasticsearch) Hard

Key Concepts:

  • Inverted Index: word → list of document IDs containing that word
  • TF-IDF: ranking by term frequency and inverse document frequency
  • Sharding: distribute index across nodes
  • Replication: each shard has primary + replica(s)

Write Flow: Document → tokenize/analyze → update inverted index
Query Flow: Parse query → identify relevant shards → gather scores → merge/rank → return top-K
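The inverted index and an AND-query over it fit in a few lines; a sketch (tokenization is naive whitespace splitting here, where a real analyzer also lowercases, stems, and removes stop words):

```python
from collections import defaultdict

docs = {1: "the quick brown fox", 2: "the lazy dog", 3: "quick dog"}

# Build: word -> set of document ids containing that word
index: dict[str, set[int]] = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

def search(query: str) -> set[int]:
    """AND query: intersect the posting lists of every term."""
    terms = query.split()
    if not terms:
        return set()
    result = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

print(sorted(search("quick dog")))  # [3]
```

Ranking (TF-IDF or BM25) is then applied to the matching set before returning the top-K.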


Q16. Design a Recommendation System Hard

Approaches:

  • Collaborative Filtering: users with similar behavior get similar recommendations (Matrix Factorization)
  • Content-Based: recommend similar items based on features
  • Hybrid: combine both (Netflix approach)

Architecture: Offline training (Spark/GPU cluster) → model store → online serving (low-latency inference) → A/B testing framework.


Q17. Design a Distributed Counter (like Reddit Votes) Medium

Challenge: High write throughput for popular posts (millions of votes/second).

Solutions:

  • Redis INCR: atomic, in-memory, fast — but single point if not clustered
  • Approximate counting: HyperLogLog for cardinality, CRDT for distributed counting
  • Write batching: buffer counts locally, flush to DB periodically
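The write-batching idea can be sketched as a buffered counter that flushes to the durable store once a threshold is reached (threshold and key names are illustrative):

```python
from collections import Counter

class BatchedCounter:
    """Buffer increments locally; flush to the durable store in batches,
    turning N increments into roughly N/threshold DB writes."""
    def __init__(self, flush_threshold: int = 100):
        self.flush_threshold = flush_threshold
        self.buffer = Counter()  # in-process pending counts
        self.db = Counter()      # stands in for the database row counters

    def incr(self, post_id: str, n: int = 1) -> None:
        self.buffer[post_id] += n
        if self.buffer[post_id] >= self.flush_threshold:
            self.flush(post_id)

    def flush(self, post_id: str) -> None:
        self.db[post_id] += self.buffer.pop(post_id, 0)  # one DB write per batch

c = BatchedCounter(flush_threshold=3)
for _ in range(7):
    c.incr("post:42")
print(c.db["post:42"], c.buffer["post:42"])  # 6 1
```

The trade-off: counts in the buffer are lost on a crash, so batching suits vote tallies, not balances.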

Q18. Design a Photo Sharing App (like Instagram) Medium

Key Systems:

  • Upload: chunked upload → S3 → async processing (resize, filter, thumbnail)
  • Feed: hybrid fan-out (celebrity accounts pull, regular push)
  • Discovery: hashtag index, explore page ML model
  • Stories: 24-hour TTL, sorted by recency

Infrastructure & Reliability

Q19. How would you design for High Availability? Medium

Key Strategies:

  • No single point of failure: redundant servers, multi-AZ deployment
  • Health checks + Auto-restart: load balancer health probes, process supervisors
  • Circuit breaker: fail fast if downstream service is down
  • Graceful degradation: serve stale data vs hard failure
  • Chaos engineering: intentionally inject failures to find weaknesses

SLA Math: 99.9% ("three nines") = ~8.76 hours downtime/year; 99.99% ("four nines") = ~52.6 minutes/year
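These downtime budgets follow directly from the availability fraction; a quick check:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes(availability: float) -> float:
    """Annual downtime budget for a given availability fraction."""
    return MINUTES_PER_YEAR * (1 - availability)

print(round(downtime_minutes(0.999) / 60, 2))  # 8.76 hours/year at three nines
print(round(downtime_minutes(0.9999), 1))      # 52.6 minutes/year at four nines
```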


Q20. How would you design for Scalability? Medium

Horizontal vs Vertical:

  • Vertical: add more CPU/RAM to one server (limited, expensive, single point)
  • Horizontal: add more servers (preferred, requires stateless services)

Techniques:

  • Stateless services → easy horizontal scaling
  • Caching at multiple layers (CDN, reverse proxy, app, DB)
  • Database sharding (horizontal partitioning)
  • Read replicas for read-heavy workloads
  • Async processing via queues for spiky workloads

Q21. Explain CAP Theorem and its implications Hard

CAP Theorem: A distributed system can only guarantee 2 of 3:

  • Consistency: all nodes see the same data at the same time
  • Availability: every request gets a response (not necessarily latest data)
  • Partition Tolerance: system continues operating despite network partitions

Real-world: Since network partitions WILL happen, choose CP or AP:

  • CP (consistent, partition-tolerant): ZooKeeper, HBase, MongoDB (strong consistency mode)
  • AP (available, partition-tolerant): Cassandra, DynamoDB, CouchDB

Design Implication: For banking → CP (never show wrong balance). For social media feed → AP (slightly stale feed is acceptable).


Q22. Design a Load Balancer Medium

Algorithms:

  • Round Robin: simple, equal distribution (bad for unequal server sizes)
  • Weighted Round Robin: distribute based on server capacity
  • Least Connections: route to server with fewest active connections
  • IP Hash: same client always hits same server (session stickiness)
  • Consistent Hashing: minimize reshuffling when adding/removing servers

Layer 4 vs Layer 7:

  • L4 (Transport): faster, routes based on IP/TCP without inspecting content
  • L7 (Application): smarter, routes based on HTTP headers, URL path, cookies

Q23. What is a CDN and how does it work? Easy

CDN (Content Delivery Network): Network of geographically distributed servers (PoPs — Points of Presence) that cache content closer to users.

Flow: User request → DNS resolves to nearest CDN PoP → CDN serves from cache OR fetches from origin and caches.

What to CDN: Static assets (JS, CSS, images, videos), API responses with appropriate cache headers.

Cache Control:

Cache-Control: public, max-age=86400   (cache 1 day)
Cache-Control: no-cache                (always validate with server)
Cache-Control: private, max-age=3600   (browser only, not CDN)

Q24. Explain Database Sharding Hard

Sharding: Horizontally partition data across multiple database servers (shards). Each shard holds a subset of the data.

Sharding Strategies:

  • Range-based: user IDs 1-1M → Shard 1, 1M-2M → Shard 2 (hotspot risk for new users)
  • Hash-based: shard = hash(user_id) % num_shards (even distribution, hard to range query)
  • Directory-based: lookup table maps keys to shards (flexible, lookup table becomes bottleneck)

Challenges: Cross-shard joins (avoid or use scatter-gather), rebalancing shards (consistent hashing helps), distributed transactions (expensive).
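Consistent hashing (which helps with rebalancing) can be sketched as a ring with virtual nodes; the shard names and vnode count below are illustrative:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring with virtual nodes: adding or removing a shard only
    remaps the keys that fell between it and its ring neighbor,
    unlike hash(key) % num_shards, which remaps almost everything."""
    def __init__(self, shards: list[str], vnodes: int = 100):
        self.ring: list[tuple[int, str]] = []
        for shard in shards:
            for v in range(vnodes):  # virtual nodes smooth out hot spots
                self.ring.append((self._hash(f"{shard}#{v}"), shard))
        self.ring.sort()
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, key: str) -> str:
        # first ring position clockwise from the key's hash (wraps around)
        i = bisect.bisect(self.hashes, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for("user:12345"))  # deterministic shard assignment
```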


Q25. Design a Circuit Breaker Medium

States:

  • Closed: normal operation, requests pass through
  • Open: too many failures → stop all requests immediately (fail fast)
  • Half-Open: after timeout, allow limited requests to test if service recovered

State Transitions:

Closed → (failure threshold exceeded) → Open
Open → (after reset timeout) → Half-Open
Half-Open → (success) → Closed | (failure) → Open

Implementation: Track failure count and timestamps in a sliding window. Trip breaker at X failures in Y seconds.
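A minimal consecutive-failure variant of the three states (a sliding-window version would track timestamps instead of a plain counter; thresholds here are illustrative):

```python
import time

class CircuitBreaker:
    """Closed -> Open after N consecutive failures; Open -> Half-Open after
    reset_timeout; Half-Open -> Closed on success, back to Open on failure."""
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"   # probe with a trial request
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold or self.state == "half-open":
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                  # any success resets the breaker
        self.state = "closed"
        return result
```

Wrap each downstream call in `breaker.call(...)`; while open, callers fail in microseconds instead of waiting on timeouts.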


Advanced & Specialized

Q26. Design a Distributed ID Generator (like Snowflake) Hard

Requirements: Globally unique, sortable by time, no central bottleneck.

Twitter Snowflake Format (64-bit):

[41 bits timestamp] [10 bits machine ID] [12 bits sequence]
 ~69 years           1024 machines         4096 IDs/ms
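The bit layout above assembles with shifts and masks; a single-node sketch (the epoch constant is Twitter's published custom epoch; the spin-wait on sequence exhaustion is one common strategy):

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657  # Twitter's custom epoch (Nov 2010)

class Snowflake:
    """64-bit id: 41 bits ms-since-epoch | 10 bits machine id | 12 bits sequence."""
    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024  # must fit in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 4096 ids/ms
                if self.sequence == 0:            # sequence exhausted:
                    while now <= self.last_ms:    # spin until the next millisecond
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence

gen = Snowflake(machine_id=7)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # True: ids are time-sortable
```

Because the timestamp occupies the high bits, sorting IDs numerically sorts them by creation time, which is exactly the property UUID v4 lacks.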

Alternatives:

  • UUID v4: random, not sortable (bad for DB index performance)
  • UUID v7: time-ordered UUID (good modern alternative)
  • ULID: sortable, URL-safe alternative to UUID

Q27. Design a Pastebin (text sharing) Easy

Simple design: Short unique ID (Base58) → map to stored text. S3 for content storage, DB for metadata (title, expiry, view count, visibility). Use CDN for popular pastes, TTL for expiration.

Added features: Syntax highlighting (stored as metadata, applied client-side), private pastes (require auth), burn-after-read (delete on first view).


Q28. Design a Real-time Collaborative Editor (like Google Docs) Hard

Challenge: Multiple users editing simultaneously without conflicts.

Approaches:

  • Operational Transformation (OT): track and transform operations to handle concurrency (complex, used by Google Docs)
  • CRDT (Conflict-free Replicated Data Types): data structures that merge automatically without conflicts (used by Figma)

Architecture:

Client → WebSocket → Collaboration Server → Operation Log (Kafka)
                                         → Document Store
All clients receive all operations → apply transformations locally

Q29. Design an API Gateway Medium

Functions: Authentication/Authorization, Rate Limiting, Request routing, SSL termination, Load balancing, Request/Response transformation, Analytics & Logging, Caching.

Architecture:

Clients → API Gateway → [Auth Service] → Microservices
                      → [Rate Limiter]
                      → [Request Router]
                      → [Response Cache]

Examples: AWS API Gateway, Kong, NGINX, Envoy.


Q30. Design a Fraud Detection System Hard

Real-time Requirements: Decisions in < 100ms per transaction.

Architecture:

Transaction Event → Stream Processor (Kafka/Flink) → Feature Extraction
                                                   ↓
                                           ML Model (rule engine + gradient boosting)
                                                   ↓
                                           Risk Score → Block/Allow/Challenge
                                                   ↓
                                           Feedback Loop (labeled fraud → retrain)

Features Used: Transaction amount, velocity (N txns/hour), location mismatch, device fingerprint, user behavior patterns, merchant category, time-of-day.

Trade-offs:

  • False positive rate vs false negative rate (blocking legitimate users vs letting fraud through)
  • Latency vs model complexity
  • Rules engine (transparent, fast to update) vs ML (higher accuracy, black box)

System Design Cheat Sheet

Back-of-Envelope Numbers

Unit              | Value
------------------|----------------
1 KB              | 10^3 bytes
1 MB              | 10^6 bytes
1 GB              | 10^9 bytes
1 TB              | 10^12 bytes
1 million req/day | ~12 req/sec
1 billion req/day | ~12,000 req/sec

Common Technology Choices

Use Case            | Technology
--------------------|----------------------------------------------
Real-time messaging | WebSockets, Server-Sent Events
Message queue       | Kafka (stream), RabbitMQ (task queue)
Cache               | Redis (versatile), Memcached (simple caching)
Search              | Elasticsearch, Solr
File storage        | S3, GCS
Time-series         | InfluxDB, TimescaleDB
Graph DB            | Neo4j, Amazon Neptune
Wide-column         | Cassandra, HBase


© 2026 PlacementAdda.com | Design at Scale, Think in Systems
