
System Design: Chat Application 2026 [WhatsApp/Slack Architecture]

Updated: 8 Jun 2026
Aditya Sharma
Aditya's Edit

PapersAdda 2026 Placement Cycle

By Aditya Sharma · Founder & Editor, PapersAdda

What changed in 2026 drives

Mass-recruiter offer letters are flatter for the 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.

What I'd actually study for this

  • 01 Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
  • 02 One real project you can defend end-to-end - file paths, design decisions, and what you would change
  • 03 One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
  • 04 Three behavioural STAR stories: failure recovered, conflict handled, ownership taken

Where most candidates trip up

The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.

Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.



Why Chat Is a Canonical System Design Problem

Candidates report chat application design in roughly 15-20% of FAANG system design rounds. Based on public preparation resources and candidate-reported interview threads, it tests real-time communication patterns, connection management, message persistence, and push notifications all at once.


Step 1: Requirements

Functional requirements:

  • One-on-one messaging and group chats (up to 500 members)
  • Real-time message delivery
  • Message status: sent, delivered, read
  • Media sharing: images, videos, documents
  • Online presence: online/offline/last seen
  • Push notifications for offline users
  • Message history stored and retrievable

Non-functional requirements:

  • Low latency: message delivery under 200ms for online users
  • High availability: 99.99% uptime
  • Scale: 500 million active users, 100 billion messages per day
  • Durability: no message loss

Step 2: Capacity Estimation

Active users: 500M
Daily messages: 100B
Messages per second: 100B / 86,400 = ~1.16M messages/sec
Average message size: 100 bytes (text) + metadata
Text message storage per day: 100B messages * 100 bytes = 10 TB/day

WebSocket connections:
  500M DAU, assume 20% online at peak = 100M concurrent connections
  Each connection: ~10KB memory on server
  100M * 10KB = 1TB memory -> needs ~1000 chat servers (1GB of connection state each, ~100K connections per server)
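
The estimates above can be re-derived in a few lines, which is a quick way to catch a slipped zero before an interviewer does. This is only a sketch: the constant names are made up for illustration, and the 1 GB of connection memory per server is the same simplifying assumption used in the estimate.

```python
# Back-of-envelope capacity check using the figures from this step.
DAILY_MESSAGES = 100 * 10**9           # 100 billion messages per day
SECONDS_PER_DAY = 86_400
AVG_MESSAGE_BYTES = 100                # text payload only, metadata excluded
DAILY_ACTIVE_USERS = 500 * 10**6
PEAK_ONLINE_PERCENT = 20               # assumed share of users connected at peak
BYTES_PER_CONNECTION = 10 * 10**3      # ~10 KB of server memory per socket
CONNECTION_BYTES_PER_SERVER = 10**9    # 1 GB of connection state per server

messages_per_second = DAILY_MESSAGES / SECONDS_PER_DAY
daily_text_storage_tb = DAILY_MESSAGES * AVG_MESSAGE_BYTES / 10**12
concurrent_connections = DAILY_ACTIVE_USERS * PEAK_ONLINE_PERCENT // 100
connection_memory_tb = concurrent_connections * BYTES_PER_CONNECTION / 10**12
chat_servers = concurrent_connections * BYTES_PER_CONNECTION // CONNECTION_BYTES_PER_SERVER
connections_per_server = concurrent_connections // chat_servers

print(f"{messages_per_second:,.0f} messages/sec")
print(f"{daily_text_storage_tb:.0f} TB of text per day")
print(f"{concurrent_connections:,} concurrent connections")
print(f"{chat_servers:,} chat servers, {connections_per_server:,} connections each")
```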

Step 2b: API and Protocol Design

Chat uses two channels: a persistent WebSocket for real-time bidirectional messaging, and a REST API for everything that does not need a live connection (history fetch, media URLs, conversation list).

WebSocket frames (real-time):
  -> SEND     { conversation_id, content, client_msg_id }
  <- ACK      { client_msg_id, message_id, status: "sent" }
  <- MESSAGE  { conversation_id, message_id, sender_id, content, ts }
  <- RECEIPT  { message_id, status: "delivered" | "read" }
  <- PRESENCE { user_id, status: "online" | "offline", last_seen }

REST API (request/response):
  GET  /v1/conversations?cursor=...        list user's chats
  GET  /v1/conversations/{id}/messages?before={message_id}&limit=50
  POST /v1/media/upload-url                 get a pre-signed S3 URL
  POST /v1/conversations                    create a group

The client_msg_id field is essential and frequently missed. The client generates it before sending so that, on a flaky network where the client retries, the server can deduplicate: if it has already persisted a message with that client_msg_id, it returns the existing message_id instead of creating a duplicate. Without this, a single tap can produce two messages when the ACK is lost and the client retries.
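
The deduplication described above can be sketched with an in-memory store. Everything here is illustrative: MessageStore and handle_send are hypothetical names, the integer counter stands in for a real ID generator, and scoping the dedup key by sender_id is an assumption (the article only says to deduplicate on client_msg_id).

```python
import itertools


class MessageStore:
    """In-memory stand-in for the message table plus a dedup index."""

    def __init__(self):
        self._ids = itertools.count(1)   # placeholder for a real ID generator
        self._dedup = {}                 # (sender_id, client_msg_id) -> message_id
        self.messages = {}               # message_id -> stored message

    def handle_send(self, sender_id, conversation_id, content, client_msg_id):
        """Persist a SEND frame at most once and return the ACK frame."""
        key = (sender_id, client_msg_id)
        message_id = self._dedup.get(key)
        if message_id is None:
            # First time this client_msg_id is seen: store the message.
            message_id = next(self._ids)
            self.messages[message_id] = {
                "conversation_id": conversation_id,
                "sender_id": sender_id,
                "content": content,
            }
            self._dedup[key] = message_id
        # A retry skips the insert and is ACKed with the original message_id.
        return {"client_msg_id": client_msg_id, "message_id": message_id, "status": "sent"}
```

Calling handle_send twice with the same client_msg_id returns the same ACK and leaves exactly one stored message, which is the behaviour a retrying client relies on.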


Step 3: Core Components

[Mobile/Web Clients]
       |
       | WebSocket (persistent)
       |
[Chat Servers / Connection Layer]
       |
   [Message Queue (Kafka)]
       |
   [Message Service]
       |
  [Message DB (Cassandra)]
       |
  [Notification Service] ----> [APNs / FCM]
       |
  [Media Service] <---------> [Object Storage (S3)]
       |
  [Presence Service] -------> [Redis (online status)]

Step 4: WebSocket Connection Management

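# Simplified WebSocket server sketch (single process)

class ConnectionManager:
    """
    Maps user_id to that user's WebSocket connection on this server.
    In production: distributed via Redis pub/sub to handle
    users connected to different servers.
    """
    def __init__(self):
        self.connections = {}  # user_id -> websocket

    async def connect(self, user_id, websocket):
        self.connections[user_id] = websocket
        await self.notify_presence(user_id, online=True)

    async def disconnect(self, user_id):
        self.connections.pop(user_id, None)
        await self.notify_presence(user_id, online=False)

    async def notify_presence(self, user_id, online):
        # Placeholder: publish the change to the Presence Service (Step 7).
        pass

    async def send_message(self, recipient_id, message):
        ws = self.connections.get(recipient_id)
        if ws is None:
            return False  # user is not connected to this server
        try:
            # Assumes the socket object exposes send_json (as Starlette/FastAPI do).
            await ws.send_json(message)
            return True
        except Exception:
            # The socket died mid-send: clean up and report the user as offline.
            await self.disconnect(recipient_id)
            return False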

Cross-server delivery: When users A (on Server 1) and B (on Server 2) chat:

  1. A sends message to Server 1
  2. Server 1 writes to Kafka topic messages
  3. Server 2 consumes the topic and picks out the messages addressed to its connected users, including B
  4. Server 2 delivers to B via B's WebSocket connection

Alternatively, use Redis Pub/Sub: each server subscribes to channels for its connected users.
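
The Redis Pub/Sub variant can be sketched without Redis at all. In the toy model below, InMemoryPubSub stands in for the Redis channel layer and ChatServer for one connection server; both classes, the user:{id} channel naming, and the delivered dict (which replaces a real WebSocket write) are assumptions made for illustration.

```python
from collections import defaultdict


class InMemoryPubSub:
    """Stand-in for Redis Pub/Sub: channel name -> subscribed callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def unsubscribe(self, channel, callback):
        self._subscribers[channel].remove(callback)

    def publish(self, channel, message):
        """Deliver to every subscriber and return how many received it."""
        receivers = list(self._subscribers.get(channel, []))
        for callback in receivers:
            callback(message)
        return len(receivers)


class ChatServer:
    """One connection server; subscribes to a channel per connected user."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.delivered = defaultdict(list)   # user_id -> messages "pushed to the socket"
        self._handlers = {}                  # user_id -> callback registered on the bus

    def connect(self, user_id):
        def handler(message):
            self.delivered[user_id].append(message)

        self._handlers[user_id] = handler
        self.bus.subscribe(f"user:{user_id}", handler)

    def disconnect(self, user_id):
        handler = self._handlers.pop(user_id, None)
        if handler is not None:
            self.bus.unsubscribe(f"user:{user_id}", handler)

    def send(self, recipient_id, message):
        """Publish to the recipient's channel; False means nobody is listening."""
        return self.bus.publish(f"user:{recipient_id}", message) > 0


bus = InMemoryPubSub()
server_1, server_2 = ChatServer("server-1", bus), ChatServer("server-2", bus)
server_1.connect("A")
server_2.connect("B")
was_delivered = server_1.send("B", {"sender_id": "A", "content": "hi"})
```

Server 1 never needs to know which server holds B; a False return from send is the signal to fall back to the offline path.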


Step 5: Message Flow

One-on-one message flow:
1. Sender (A) sends message over WebSocket
2. Server assigns message_id (Snowflake), timestamp
3. Server writes to Kafka (async, fast)
4. Server ACKs to A: "message sent" (single tick)
5. Kafka consumer writes to Cassandra
6. Server finds B's connection server via user_server_map (Redis)
7. Server delivers to B's server via internal gRPC
8. B's server delivers to B over WebSocket
9. B ACKs: "delivered" (double tick)
10. A receives delivered receipt

Read receipt:
  B opens the message -> B sends READ event to server
  Server notifies A -> A sees double blue ticks
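
Step 2 of the flow assigns a Snowflake-style message_id. A minimal sketch of such a generator follows, assuming the commonly described 64-bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit per-millisecond sequence). The epoch value and class name are arbitrary choices for illustration, and instead of waiting for the next millisecond when the sequence overflows, this version simply advances its logical timestamp.

```python
import threading
import time


class SnowflakeGenerator:
    """Time-sortable 64-bit IDs: timestamp | worker_id | sequence."""

    EPOCH_MS = 1_700_000_000_000     # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id, clock_ms=None):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock_ms = clock_ms or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                # Clock went backwards: reuse the last timestamp to stay monotonic.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs used in this millisecond: move to the next one.
                    now = self._last_ms + 1
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self.worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )
```

IDs from one generator are strictly increasing, and IDs from different workers sort by time first, which is what lets a client page through history by message_id.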

Step 6: Database Schema

-- Cassandra schema (wide column)

-- Messages table
CREATE TABLE messages (
    conversation_id UUID,
    message_id      TIMEUUID,          -- time-sortable UUID
    sender_id       UUID,
    content         TEXT,
    media_url       TEXT,
    message_type    TEXT,              -- text, image, video, file
    status          TEXT,              -- sent, delivered, read
    PRIMARY KEY (conversation_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);  -- newest first

-- Conversation membership (partitioned by conversation so fan-out can list members)
CREATE TABLE conversation_members (
    conversation_id UUID,
    user_id         UUID,
    joined_at       TIMESTAMP,
    last_read_id    TIMEUUID,
    PRIMARY KEY (conversation_id, user_id)
);

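-- User conversations index (a user's chat list, most recent first)
CREATE TABLE user_conversations (
    user_id             UUID,
    conversation_id     UUID,
    last_message_time   TIMESTAMP,
    last_message        TEXT,
    unread_count        INT,
    -- conversation_id in the key keeps rows unique when two chats share a timestamp;
    -- bumping a chat to the top means deleting its old row and inserting a new one
    PRIMARY KEY (user_id, last_message_time, conversation_id)
) WITH CLUSTERING ORDER BY (last_message_time DESC, conversation_id ASC);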

Why Cassandra?

  • Write-heavy workload (100B writes/day)
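  • Partition key conversation_id with clustering key message_id: every write lands in one partition, and loading recent history is a single-partition range scan in clustering order
  • Linear horizontal scaling
  • Tunable consistency per query: QUORUM writes plus QUORUM reads give read-your-writes (R + W > replication factor); writes at ONE are faster but, paired with QUORUM reads at replication factor 3, can return stale data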
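
To make the access pattern concrete, here is a toy in-memory model of the messages table: one sorted list of message IDs per conversation_id, read newest-first with a before cursor, mirroring GET /v1/conversations/{id}/messages?before=...&limit=50. ConversationLog is a hypothetical name and integers stand in for TIMEUUIDs; the equivalent CQL read would be roughly SELECT ... WHERE conversation_id = ? AND message_id < ? LIMIT 50.

```python
import bisect
from collections import defaultdict


class ConversationLog:
    """Toy model of a table partitioned by conversation, clustered by message_id."""

    def __init__(self):
        self._ids = defaultdict(list)   # conversation_id -> ascending message_ids
        self._rows = {}                 # (conversation_id, message_id) -> content

    def insert(self, conversation_id, message_id, content):
        key = (conversation_id, message_id)
        if key not in self._rows:
            bisect.insort(self._ids[conversation_id], message_id)
        self._rows[key] = content       # same key again overwrites (upsert)

    def page(self, conversation_id, before=None, limit=50):
        """Return up to `limit` messages older than `before`, newest first."""
        ids = self._ids.get(conversation_id, [])
        end = len(ids) if before is None else bisect.bisect_left(ids, before)
        start = max(0, end - limit)
        return [
            (message_id, self._rows[(conversation_id, message_id)])
            for message_id in reversed(ids[start:end])
        ]


log = ConversationLog()
for message_id in range(1, 8):
    log.insert("c1", message_id, f"msg {message_id}")

newest_page = log.page("c1", limit=3)
older_page = log.page("c1", before=newest_page[-1][0], limit=3)
```

Each page touches a single conversation's data and uses the last message_id of the previous page as the cursor, so no offset scan is needed.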

Step 7: Presence Service

User online status:
  Login: SET user:{id}:status "online" EX 300 (Redis with 5-min TTL)
  Heartbeat: client sends ping every 30 seconds, extends TTL
  Logout: DEL user:{id}:status
  Timeout: TTL expires -> user appears offline

"Last seen" timestamp:
  On disconnect: SET user:{id}:last_seen {timestamp}

Scaling presence:
  Single Redis handles ~10M keys easily
  Only online users hold a status key: ~100M keys at peak (20% of 500M) -> Redis Cluster with ~10 shards
  Read: O(1) GET per user
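
The TTL-and-heartbeat logic above can be modelled in memory to check the timing rules. PresenceTracker is a hypothetical class standing in for the Redis keys; the injectable clock exists only so the expiry behaviour can be exercised without waiting five minutes.

```python
import time


class PresenceTracker:
    """In-memory model of user:{id}:status (with TTL) and user:{id}:last_seen."""

    TTL_SECONDS = 300   # 5-minute TTL, refreshed by each heartbeat

    def __init__(self, clock=time.time):
        self._clock = clock
        self._expires_at = {}   # user_id -> time at which the status key expires
        self._last_seen = {}    # user_id -> timestamp recorded on disconnect

    def heartbeat(self, user_id):
        """Login and every later ping both (re)set the key with a fresh TTL."""
        self._expires_at[user_id] = self._clock() + self.TTL_SECONDS

    def logout(self, user_id):
        """Explicit disconnect: delete the status key and record last_seen."""
        self._expires_at.pop(user_id, None)
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        expires_at = self._expires_at.get(user_id)
        return expires_at is not None and self._clock() < expires_at

    def last_seen(self, user_id):
        return self._last_seen.get(user_id)
```

With a 30-second heartbeat and a 300-second TTL, a client has to miss about ten consecutive pings before it shows as offline, which is the "minor stale presence" tradeoff: a crashed client can appear online for up to five minutes.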

Step 8: Push Notifications for Offline Users

Message arrives for offline user B:
1. WebSocket delivery fails (B is not connected)
2. Message Service publishes to Notification Queue (Kafka)
3. Notification Worker reads event
4. Check B's notification preferences (DB lookup)
5. Route to APNs (iOS) or FCM (Android) or Web Push
6. Push "You have a new message" with payload
7. B opens app -> fetches missed messages via REST API

Rate limiting notifications:
  Batch rapid messages: send one notification for 5+ messages in 30 seconds
  Respect "Do Not Disturb" preferences in user settings
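
The batching rule can be read in more than one way. The sketch below takes one interpretation, stated as an assumption: messages notify individually until the fifth arrives within 30 seconds, that fifth triggers a single summary push, and further messages in the same burst stay silent. NotificationThrottle and its notification strings are hypothetical.

```python
from collections import defaultdict, deque


class NotificationThrottle:
    """Collapses a burst of messages into one summary notification."""

    WINDOW_SECONDS = 30
    BURST_THRESHOLD = 5

    def __init__(self):
        self._recent = defaultdict(deque)   # recipient_id -> recent message times

    def on_message(self, recipient_id, sender_name, now):
        """Return the notification text to push, or None to stay silent."""
        recent = self._recent[recipient_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.WINDOW_SECONDS:
            recent.popleft()
        recent.append(now)
        count = len(recent)
        if count < self.BURST_THRESHOLD:
            return f"New message from {sender_name}"
        if count == self.BURST_THRESHOLD:
            return f"{count}+ new messages"
        return None   # burst already summarised; do not notify again
```

A Do Not Disturb check would sit in front of on_message; it is left out here to keep the windowing logic visible.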

Step 9: Media Handling

Sending an image:
1. Client uploads image directly to S3 (pre-signed URL from server)
2. Client gets S3 URL
3. Client sends message with media_url = S3 URL
4. Recipients download media from S3 (CDN-cached)

Pre-signed URL flow:
  POST /v1/media/upload-url
  -> Server generates S3 pre-signed URL (valid 15 min)
  -> Client uploads directly to S3 (bypasses app servers)
  -> Client sends the message with client_msg_id + media_url (the S3 URL)
  -> No media goes through chat server (saves bandwidth)
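
The idea behind a pre-signed URL is that the server signs "this key may be uploaded until this time" and the storage layer verifies the signature without calling back to the app. The sketch below illustrates that idea with a plain HMAC. It is not S3's actual signing scheme (in practice an AWS SDK call produces the URL), and SECRET_KEY, UPLOAD_HOST, and all function names are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET_KEY = b"demo-secret"                  # hypothetical shared signing key
UPLOAD_HOST = "https://media.example.com"    # hypothetical storage endpoint


def sign(object_key, expires_at):
    """HMAC over the method, object key, and expiry time."""
    payload = f"PUT\n{object_key}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def create_upload_url(object_key, now, ttl_seconds=900):
    """Server side: issue a URL valid for 15 minutes by default."""
    expires_at = int(now) + ttl_seconds
    query = urlencode({"expires": expires_at, "signature": sign(object_key, expires_at)})
    return f"{UPLOAD_HOST}/{object_key}?{query}"


def verify_upload(object_key, expires_at, signature, now):
    """Storage side: accept only unexpired URLs with a matching signature."""
    if now > expires_at:
        return False
    return hmac.compare_digest(sign(object_key, expires_at), signature)
```

Because the object key is part of the signed payload, a client cannot reuse a URL issued for one file to overwrite another.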

Step 10: Scaling and Tradeoffs

Component        Scaling approach                                   Tradeoff
Chat servers     Horizontal, stateless (Redis for connection map)   Redis becomes hotspot at extreme scale
Message storage  Cassandra cluster, partition by conversation       Cross-conversation queries are expensive
Presence         Redis cluster, sharded by user_id                  Minor stale presence under failure
Message queue    Kafka, partitioned by conversation_id              Ordering guaranteed per partition
Media            S3 + CloudFront CDN                                Storage cost scales linearly

Group Chat Specifics

Group chat (up to 500 members):

Message fanout problem:
  When A sends to a 500-member group,
  the server must deliver to 500 connections.

Option 1: Push model
  Server writes message to each member's inbox.
  500 writes per message. Simple, but expensive for large groups.

Option 2: Pull model
  Server writes to single group inbox.
  Each member polls for new messages.
  Lower write amplification, higher read load.

Option 3: Hybrid (approach often attributed to WhatsApp)
  Small groups (< 100): push to each member's server
  Large groups (100-500): members pull from group timeline
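
The hybrid option reduces to a single branch on group size. In this sketch, fan_out is a hypothetical function, plain dicts and lists stand in for per-member inboxes and the group timeline, and the threshold of 100 comes from the option described above.

```python
PUSH_THRESHOLD = 100   # groups smaller than this use the push model


def fan_out(message, member_ids, sender_id, inboxes, group_timeline):
    """Deliver one group message; return the strategy used and the write count."""
    if len(member_ids) < PUSH_THRESHOLD:
        # Push model: one write per recipient (the sender already has the message).
        recipients = [member for member in member_ids if member != sender_id]
        for member in recipients:
            inboxes.setdefault(member, []).append(message)
        return "push", len(recipients)
    # Pull model: a single write to the shared timeline; members read from it.
    group_timeline.append(message)
    return "pull", 1


inboxes, timeline = {}, []
small_group = ["A", "B", "C"]
large_group = [f"user-{i}" for i in range(200)]

small_result = fan_out({"content": "hi"}, small_group, "A", inboxes, timeline)
large_result = fan_out({"content": "all hands"}, large_group, "user-0", inboxes, timeline)
```

The write count in the return value is the write amplification the three options trade against read load.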

The Hard Problems in Chat at Scale

Message ordering is the most underestimated challenge. When two users send messages simultaneously in a group chat, the server must linearize them. The standard approach is to assign a monotonically increasing sequence number at the conversation level using a distributed counter (Redis INCR or database sequence). Clients display messages sorted by sequence number, not by local clock, because client clocks drift.
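
A minimal sketch of per-conversation sequencing, with a locked in-process counter standing in for Redis INCR. ConversationSequencer and display_order are hypothetical names; the client_ts field is there only to show that a skewed device clock does not affect the final order.

```python
import threading
from collections import defaultdict


class ConversationSequencer:
    """Hands out 1, 2, 3, ... independently for each conversation."""

    def __init__(self):
        self._counters = defaultdict(int)
        self._lock = threading.Lock()

    def next_seq(self, conversation_id):
        with self._lock:   # the lock models the atomicity of INCR
            self._counters[conversation_id] += 1
            return self._counters[conversation_id]


def display_order(messages):
    """Clients sort by server-assigned seq, never by their own clocks."""
    return sorted(messages, key=lambda message: message["seq"])


sequencer = ConversationSequencer()
# Bob's device clock is behind, so his later message carries an earlier timestamp.
first = {"sender": "alice", "client_ts": 1000, "seq": sequencer.next_seq("group-1")}
second = {"sender": "bob", "client_ts": 990, "seq": sequencer.next_seq("group-1")}
ordered = display_order([second, first])
```

Sorting by client_ts would have shown Bob's message first; sorting by seq reflects the order in which the server accepted them.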

The "last seen" problem is another commonly asked follow-up. Updating last_seen in the database on every read event would create enormous write traffic for highly active users. A common solution is to batch updates: update last_seen in Redis on every read event, and flush to the database every 30 seconds per user. For a user generating about one event per second, this cuts database writes by roughly 30x.
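
The batching idea is a write-behind buffer: every event updates the fast store, and only users whose value changed since the last flush reach the database. In the sketch below, LastSeenBuffer is a hypothetical class, two dicts stand in for Redis and the durable store, and flush is assumed to be called by a timer every 30 seconds.

```python
class LastSeenBuffer:
    """Write-behind buffer for last_seen timestamps."""

    def __init__(self):
        self.cache = {}        # Redis stand-in: user_id -> latest timestamp
        self.database = {}     # durable store stand-in
        self.db_writes = 0     # counts writes that reach the durable store
        self._dirty = set()    # users updated since the last flush

    def record(self, user_id, timestamp):
        """Called on every read event; touches only the cache."""
        self.cache[user_id] = timestamp
        self._dirty.add(user_id)

    def flush(self):
        """Called periodically; one database write per changed user."""
        for user_id in self._dirty:
            self.database[user_id] = self.cache[user_id]
            self.db_writes += 1
        self._dirty.clear()


buffer = LastSeenBuffer()
for second in range(30):          # one read event per second for 30 seconds
    buffer.record("u1", second)
buffer.flush()
```

Thirty events collapse into one database write, and idle users cost nothing at flush time because they are never marked dirty.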

Message search is a separate system entirely. Full-text search over message history requires Elasticsearch or a similar inverted-index structure. Most chat applications offer search only for recent messages (last 90 days) to bound the index size. The interview scoping question you should ask is whether search is in-scope before designing it.

Why WebSocket Over HTTP Polling

HTTP long-polling was the predecessor to WebSocket. The client makes a request, the server holds it open until a message arrives, then the client immediately makes another request. This costs roughly one HTTP request per message, each carrying full headers, plus a new TCP handshake whenever the connection is not kept alive. WebSocket is a single persistent TCP connection upgraded from HTTP, reducing per-message overhead to a few bytes of framing. For a chat application with millions of concurrent users, the difference in server load is significant.

Server-Sent Events (SSE) are a middle ground: unidirectional push from server to client over plain HTTP, simpler than WebSocket but insufficient for two-way communication without a separate request channel for sending.


Failure Handling and Delivery Guarantees

Delivery guarantees are where this design earns or loses senior-level credit. The system targets at-least-once delivery with client-side deduplication, not exactly-once, because exactly-once across a network is impractical.

Sender's network drops after SEND but before ACK:
  Client retries SEND with the same client_msg_id.
  Server deduplicates on client_msg_id and returns the original
  message_id. No duplicate is persisted or delivered.

Recipient is offline when the message is written:
  Message is persisted to Cassandra regardless of delivery.
  On reconnect, the client pulls all messages after its
  last_synced message_id. Persistence is the source of truth;
  WebSocket delivery is best-effort on top of it.

Chat server crashes with live connections:
  Connections drop; clients auto-reconnect to another server
  (load balancer reassigns). The connection map in Redis is
  updated on reconnect. Undelivered messages are pulled on sync.

Kafka consumer lag during a spike:
  Writes to Cassandra fall behind but never drop, because Kafka
  buffers. Real-time delivery degrades to "pull on next sync"
  during the spike, which is acceptable.

The principle to articulate: persist first, deliver second. Because every message is durably written before delivery is attempted, no message is ever lost even if every real-time delivery path fails. The client reconciles by syncing from its last known message_id.
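
The client half of "persist first, deliver second" can be sketched as follows. ChatClient is a hypothetical class and server_log stands in for the persisted history returned by the REST API. One design choice here is an assumption rather than something the article specifies: only sync advances last_synced_id, so a real-time push that arrives ahead of a missed message cannot cause the gap to be skipped.

```python
class ChatClient:
    """Merges best-effort real-time pushes with pull-based sync."""

    def __init__(self):
        self.last_synced_id = 0   # highest message_id confirmed by a sync
        self.seen_ids = set()     # dedup across push and pull paths
        self.timeline = []        # messages kept in message_id order

    def receive(self, message):
        """Handle one message from either path; return True if it was new."""
        message_id = message["message_id"]
        if message_id in self.seen_ids:
            return False          # at-least-once delivery: drop the duplicate
        self.seen_ids.add(message_id)
        self.timeline.append(message)
        self.timeline.sort(key=lambda m: m["message_id"])
        return True

    def sync(self, server_log):
        """Pull everything after last_synced_id; return how many were new."""
        missed = [m for m in server_log if m["message_id"] > self.last_synced_id]
        new_count = sum(self.receive(m) for m in missed)
        if missed:
            self.last_synced_id = max(m["message_id"] for m in missed)
        return new_count


server_log = [{"message_id": i, "content": f"msg {i}"} for i in range(1, 5)]
client = ChatClient()
client.receive(server_log[2])            # message 3 arrives over WebSocket
recovered = client.sync(server_log)      # reconnect: pull from the durable log
```

Message 3 reaches the client twice (once pushed, once pulled) but appears once in the timeline, while messages 1, 2, and 4 are recovered purely from persistence.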


Follow-up Questions Interviewers Ask

How do you guarantee message ordering in a group chat? Assign a per-conversation monotonic sequence number (for example Redis INCR on a per-conversation key). A TIMEUUID is only an approximation here, because it orders by the generating server's clock rather than by a shared counter. Clients sort by sequence, never by local wall-clock, because device clocks drift. Within a Kafka partition keyed by conversation_id, ordering is preserved end to end.

How does a user with multiple devices stay in sync? Treat each device as a separate connection but share one server-side message store keyed by user. Each device tracks its own last_synced message_id and pulls the delta on connect. A read receipt from one device propagates to the others through the same RECEIPT frames.

How do you implement end-to-end encryption without breaking server features? With E2E encryption, the server stores ciphertext and cannot read content, so server-side search and rich notification previews are lost. The Signal protocol handles key exchange per conversation. The tradeoff to state: E2E gives privacy at the cost of server-side features like full-text search and content-rich push notifications.

How do you handle a group with the maximum 500 members all active at once? Use the hybrid fan-out: members on the same chat server share a single delivery, and cross-server delivery goes through Kafka or Redis pub/sub keyed by conversation_id. The 500-member cap exists precisely to bound the worst-case fan-out per message.

How do you store and retrieve "last seen" without hammering the database? Update last_seen in Redis on every read, and flush to the durable store every 30 seconds per user. For a user generating about one event per second, this means roughly 30x fewer database writes, while the displayed value stays fresh enough for users.


Methodology applied to this article · last verified 8 Jun 2026
Sources used
Public exam-pattern documents, official recruiter pages, and verified candidate reports on r/developersIndia and LinkedIn.
Verification window
Page last edited 8 Jun 2026 by Aditya Sharma. Numbers and patterns sanity-checked against the most recent 2026 cycle drives we tracked.
What we did NOT do
  • No fabricated salary numbers or success rates. If we quote a range, it's sourced.
  • No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
  • No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.
