System Design: TinyURL 2026 [Hash Collision, Vanity URLs, QR Codes]

What changed in 2026 drives
Mass-recruiter offer letters are flatter for the 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.
What I'd actually study for this
- Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
- One real project you can defend end-to-end - file paths, design decisions, and what you would change
- One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
Last Updated: June 2026
Why TinyURL Design Tests Different Skills Than a Generic URL Shortener
Candidates report TinyURL-specific system design in roughly 10-15% of FAANG rounds, typically at the L4-L5 level. Based on public preparation resources and candidate-reported interview threads, interviewers use TinyURL to probe random ID generation with collision handling (versus sequential IDs), vanity URL reservations, QR code pipeline design, and bulk API patterns -- none of which appear in a generic URL shortener design. If you default to "base62 of auto-increment ID," you will miss the core TinyURL-specific design challenge.
The essential distinction: TinyURL generates random codes from the start. This means no centralized counter, no predictable code space, but also no guaranteed uniqueness without a database existence check on every creation.
Step 1: Requirements
Functional requirements:
- Shorten a long URL to a random 6-character short code (tinyurl.com/abc123)
- Redirect short URL to original URL with low latency
- Custom vanity alias: user can request a specific short code (e.g., tinyurl.com/myshop)
- QR code generation for each short URL
- Bulk API: shorten up to 1000 URLs in a single batch request
- URL expiry: optional TTL on short URLs
- Click analytics: total clicks, country, referrer, device
Non-functional requirements:
- Scale: 50M URLs shortened per day, 5B redirects per day
- Redirect latency: under 10ms for cached, under 80ms for cold
- Availability: 99.99% (under 1 hour downtime/year)
- QR code delivery: under 200ms for single URL creation
Out of scope (for this design):
- Link-in-bio pages (a separate product feature)
- Fraud/phishing URL detection (separate SafeBrowsing integration layer)
Step 2: Capacity Estimation
URL creation QPS:
50M URLs/day / 86,400 = ~580 creates/sec
Redirect QPS:
5B redirects/day / 86,400 = ~57,870 reads/sec = ~58K/sec
Read-to-write ratio: 100:1
Collision analysis for random base62 codes:
6-char base62 = 62^6 = 56.8 billion possible codes
With 50M URLs/day, after 5 years = 91.25B URLs, which exceeds the entire 6-char space
Even at 50% fill (28.4B URLs), a random code collides with probability 0.5: ~500 per 1000 creates
Solution: use 7-char base62 = 62^7 = 3.52 trillion codes
At 5-year 91.25B URL mark: fill ratio is ~2.6%, so ~26 per 1000 creates hit an existing code
Acceptable with retries. Use 7 chars.
Storage per URL record:
short_code: 7 bytes
long_url: ~200 bytes avg
qr_s3_key: ~100 bytes
metadata (user, timestamps, ttl, is_vanity): ~80 bytes
Total: ~387 bytes per record
Storage for 5 years:
91.25B * 387 bytes = ~35.3TB (sharded across 4+ MySQL shards)
Cache (Redis):
Hot 20% of URLs serve 80% of traffic
Cache 20% of daily URLs: 10M records/day * 387 bytes = ~3.9GB/day
Redis cluster with 50GB capacity covers ~12 days of hot URLs
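The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated in this section; the variable names are illustrative.

```python
# Back-of-the-envelope capacity math, using only the figures stated above.
SECONDS_PER_DAY = 86_400
URLS_PER_DAY = 50_000_000
REDIRECTS_PER_DAY = 5_000_000_000
BYTES_PER_RECORD = 7 + 200 + 100 + 80  # short_code + long_url + qr_s3_key + metadata
YEARS = 5

create_qps = URLS_PER_DAY / SECONDS_PER_DAY
redirect_qps = REDIRECTS_PER_DAY / SECONDS_PER_DAY
total_urls = URLS_PER_DAY * 365 * YEARS
storage_tb = total_urls * BYTES_PER_RECORD / 1e12
hot_gb_per_day = 0.20 * URLS_PER_DAY * BYTES_PER_RECORD / 1e9

print(f"creates/sec:        {create_qps:,.0f}")
print(f"redirects/sec:      {redirect_qps:,.0f}")
print(f"URLs after 5 years: {total_urls:,}")
print(f"storage (TB):       {storage_tb:.1f}")
print(f"hot set (GB/day):   {hot_gb_per_day:.2f}")
```

Keeping the inputs as named constants makes it easy to rerun the numbers if the interviewer changes the daily volume or the retention period.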
Step 3: Random Code Generation with Collision Handling
This is the defining challenge of TinyURL's architecture. A random 7-char base62 code is generated per URL creation request. Unlike sequential ID approaches, you cannot know in advance whether the code exists.
Three-Layer Collision Defense
Layer 1: Bloom filter (precheck before DB lookup)
import secrets
import uuid

from pybloom_live import ScalableBloomFilter

CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CODE_LENGTH = 7

# Bloom filter sizing: roughly 18GB for 10B elements at 0.1% false positive rate
# (about 14.4 bits per element), not a few hundred MB.
# Stored in Redis (as a bitmap split across keys, since one string value is
# capped at 512MB) or local to each app server with periodic sync
bloom = ScalableBloomFilter(mode=ScalableBloomFilter.LARGE_SET_GROWTH,
                            error_rate=0.001)

def generate_short_code():
    """
    Generate a random 7-char base62 code.
    Returns (code, attempts). The code is absent from the Bloom filter unless
    all 5 attempts hit; the fallback code is unchecked, so the DB constraint
    is its only guard.
    """
    for attempt in range(5):
        code = ''.join(secrets.choice(CHARS) for _ in range(CODE_LENGTH))
        if code not in bloom:
            return code, attempt
    # Fallback: entropy-seeded code from UUID (hex digits, a subset of CHARS)
    seed = uuid.uuid4().hex[:14]
    return seed[:CODE_LENGTH], 5
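To sanity-check the memory budget for the filter, the standard sizing formulas for a classic fixed-size Bloom filter can be evaluated directly. This is a sketch under that assumption: a scalable filter such as the one used above chains several filters and will use somewhat more, and the function names are illustrative.

```python
import math

def bloom_size_bits(n_items: int, false_positive_rate: float) -> int:
    """Optimal bit count for a classic Bloom filter: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))

def bloom_hash_count(n_items: int, size_bits: int) -> int:
    """Optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(size_bits / n_items * math.log(2)))

n = 10_000_000_000   # 10B codes
p = 0.001            # 0.1% false positive rate
bits = bloom_size_bits(n, p)
size_gb = bits / 8 / 1e9
hashes = bloom_hash_count(n, bits)
print(f"{bits / n:.1f} bits per element, {size_gb:.1f} GB, {hashes} hash functions")
```

At roughly 14.4 bits per element, the filter is an order of magnitude smaller than the URL table itself but still far too large to treat as free, which is why its placement (shared store vs per-server copy) is a real design decision.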
Layer 2: DB uniqueness constraint (catch what the Bloom filter misses)
CREATE TABLE url_mappings (
    -- ascii_bin keeps codes case-sensitive: under a case-insensitive collation,
    -- 'abc1234' and 'ABC1234' would count as the same key
    short_code  CHAR(7) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
    long_url    TEXT NOT NULL,
    user_id     BIGINT,
    is_vanity   BOOLEAN DEFAULT FALSE,
    qr_s3_key   VARCHAR(256),
    created_at  TIMESTAMP DEFAULT NOW(),
    expires_at  TIMESTAMP NULL,  -- NULL means no expiry
    is_active   BOOLEAN DEFAULT TRUE,
    shard_key   TINYINT GENERATED ALWAYS AS (CRC32(short_code) % 4) STORED,
    PRIMARY KEY (shard_key, short_code),
    INDEX (user_id, created_at DESC),
    INDEX (expires_at)
);
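PRIMARY KEY (shard_key, short_code) enforces uniqueness at the DB level. A Bloom filter false positive only wastes one candidate: the generator discards a code that was actually free and draws another. The constraint covers the opposite case, where a code passes the filter but already exists in the table, for example because another server inserted it after this server's filter was last synced or because two requests drew the same code concurrently. In that case the INSERT fails with a duplicate key error, and the application retries.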
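The insert-and-catch pattern can be tried locally without a MySQL server. The sketch below uses Python's built-in sqlite3 as a stand-in (an assumption for illustration; the article's design uses MySQL), with a reduced two-column table and a hypothetical try_insert helper.

```python
import sqlite3

# In-memory SQLite stands in for one MySQL shard; only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_mappings (short_code TEXT PRIMARY KEY, long_url TEXT NOT NULL)"
)

def try_insert(code: str, long_url: str) -> bool:
    """Return True if the row was inserted, False if short_code already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO url_mappings (short_code, long_url) VALUES (?, ?)",
                (code, long_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert("abc1234", "https://example.com/a")
second = try_insert("abc1234", "https://example.com/b")   # same code: rejected
third = try_insert("ABC1234", "https://example.com/c")    # differs only by case
print(first, second, third)
```

The database, not the application, decides who wins a race for a code, so the check stays correct no matter how stale any in-memory filter is.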
Layer 3: Retry with a fresh random code
# db, _expires, _warm_cache, _dispatch_qr_generation, DuplicateKeyError and
# MaxCollisionError are application helpers that are not shown here.
def create_short_url(long_url, user_id=None, vanity_alias=None, ttl_days=None):
    """
    Full URL creation flow with collision handling.
    Returns: short_code or raises MaxCollisionError
    """
    if vanity_alias:
        return _create_vanity(vanity_alias, long_url, user_id, ttl_days)
    for retry in range(3):
        code, bloom_attempts = generate_short_code()
        try:
            db.execute("""
                INSERT INTO url_mappings (short_code, long_url, user_id, expires_at)
                VALUES (%s, %s, %s, %s)
            """, (code, long_url, user_id,
                  _expires(ttl_days) if ttl_days else None))
            bloom.add(code)  # update bloom filter after successful insert
            _warm_cache(code, long_url)
            _dispatch_qr_generation(code)
            return code
        except DuplicateKeyError:
            # True collision: the code is in the DB but was missing from this filter
            continue
    raise MaxCollisionError(f"Failed after 3 collision retries for {long_url}")
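For uniformly random codes, the chance that a single attempt collides equals the fill ratio of the code space: about 2.6% at the 5-year, 91.25B-URL mark and far lower in the early years. Three independent attempts all collide with probability of roughly 0.026^3, or about 1.7 in 100,000, so three retries give a success rate near 99.998% even at that fill level, before counting the Bloom filter precheck.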
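The retry budget can be checked numerically. This sketch assumes uniformly random 7-char base62 codes and independent attempts, and it ignores the Bloom filter precheck, so it is a pessimistic bound; the function names are illustrative.

```python
CODE_SPACE = 62 ** 7  # number of distinct 7-char base62 codes

def fill_ratio(existing_codes: int) -> float:
    """Probability that one uniformly random code is already taken."""
    return existing_codes / CODE_SPACE

def p_all_attempts_collide(existing_codes: int, attempts: int) -> float:
    """Probability that every one of `attempts` independent draws is taken."""
    return fill_ratio(existing_codes) ** attempts

five_year_urls = 50_000_000 * 365 * 5
per_attempt = fill_ratio(five_year_urls)
all_three = p_all_attempts_collide(five_year_urls, 3)
print(f"per-attempt collision: {per_attempt:.4%}")
print(f"three attempts all collide: {all_three:.6%}")
```

Because the failure probability shrinks geometrically with each extra attempt, raising the retry limit from 3 to 4 is a cheap way to buy almost two more orders of magnitude if the stricter target matters.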
Step 4: Vanity URL Reservations
Vanity URLs require a different flow because the user specifies the code. Two problems arise:
- The requested alias may already be taken (by another user or a reserved word)
- Reserved words (api, admin, login, static, qr, bulk, health) must be blocked
import re

RESERVED_ALIASES = frozenset({
    "api", "admin", "login", "logout", "static", "qr",
    "bulk", "health", "metrics", "dashboard", "settings",
    "help", "support", "about", "terms", "privacy"
})
ALIAS_PATTERN = re.compile(r'[a-z0-9_-]{3,7}')

def _create_vanity(alias, long_url, user_id, ttl_days):
    """
    Create a custom vanity short URL.
    Raises ReservedAliasError, InvalidAliasError, or VanityTakenError.
    """
    # Normalize case only. Over-long aliases are rejected below rather than
    # silently truncated into a different alias than the user asked for.
    alias = alias.lower()
    if alias in RESERVED_ALIASES:
        raise ReservedAliasError(f"'{alias}' is a reserved system alias")
    if not ALIAS_PATTERN.fullmatch(alias):
        raise InvalidAliasError("Alias must be 3-7 alphanumeric/hyphen/underscore chars")
    try:
        db.execute("""
            INSERT INTO url_mappings (short_code, long_url, user_id, is_vanity, expires_at)
            VALUES (%s, %s, %s, TRUE, %s)
        """, (alias, long_url, user_id, _expires(ttl_days) if ttl_days else None))
        bloom.add(alias)
        _warm_cache(alias, long_url)
        _dispatch_qr_generation(alias)
        return alias
    except DuplicateKeyError:
        raise VanityTakenError(f"Alias '{alias}' is already taken")
For pro/enterprise users, offer an alias availability check endpoint before creation:
GET /api/check-alias?alias=myshop
Response: {"available": true} or {"available": false, "reason": "taken"}
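A minimal sketch of the handler logic behind such an endpoint follows. The in-memory `taken` set stands in for a database lookup, the reserved list is a subset of the one above, and the "invalid" and "reserved" reason strings are assumptions (the response format above only shows "taken").

```python
import re

ALIAS_PATTERN = re.compile(r"[a-z0-9_-]{3,7}")
RESERVED = frozenset({"api", "admin", "login", "qr", "bulk", "health"})  # subset, for illustration

def check_alias(alias: str, taken: set) -> dict:
    """Return the JSON-style body for GET /api/check-alias?alias=..."""
    alias = alias.lower()
    if alias in RESERVED:
        return {"available": False, "reason": "reserved"}   # assumed reason string
    if not ALIAS_PATTERN.fullmatch(alias):
        return {"available": False, "reason": "invalid"}    # assumed reason string
    if alias in taken:
        return {"available": False, "reason": "taken"}
    return {"available": True}

taken_aliases = {"myshop"}
print(check_alias("newshop", taken_aliases))
print(check_alias("MyShop", taken_aliases))
print(check_alias("admin", taken_aliases))
```

The check is advisory only: another user can claim the alias between the check and the create call, so the INSERT with its unique constraint remains the source of truth.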
Step 5: QR Code Generation Pipeline
QR code generation is CPU-bound but fast for a single URL (under 50ms). The design decision is synchronous vs asynchronous generation.
Single URL creation (sync generation):
1. Create URL record in DB
2. Generate QR code PNG in memory (python-qrcode library: ~30-50ms)
3. Upload PNG to S3: s3://tinyurl-qr/qr/{shard}/{short_code}.png
4. Store S3 key in url_mappings.qr_s3_key
5. Return to user with both short_url and qr_code_url
Bulk creation (async generation):
1. Create all URL records in DB (batch INSERT)
2. Dispatch Kafka event per URL: topic "qr-generate"
3. Return job_id for polling
4. QR worker consumes events, generates + uploads to S3
5. Client polls GET /api/v1/bulk/{job_id}/status
QR images are served through the CloudFront CDN directly from S3, so no application server is involved after the initial generation.
import zlib
from io import BytesIO

import boto3
import qrcode

s3_client = boto3.client('s3')
QR_BUCKET = 'tinyurl-qr'

def generate_and_store_qr(short_code):
    """
    Generate QR code for a short URL and store in S3.
    Returns the CDN URL of the stored PNG.
    """
    short_url = f"https://tinyurl.com/{short_code}"
    qr = qrcode.QRCode(
        version=None,  # let make(fit=True) pick the smallest version that fits;
                       # a 27-byte URL exceeds version 2 (25x25) at level M in byte mode
        error_correction=qrcode.constants.ERROR_CORRECT_M,  # 15% damage tolerance
        box_size=10,
        border=4,
    )
    qr.add_data(short_url)
    qr.make(fit=True)
    img = qr.make_image(fill_color="black", back_color="white")
    buffer = BytesIO()
    img.save(buffer, format='PNG')
    buffer.seek(0)
    # Hash the whole code: vanity aliases may start with '-' or '_',
    # which int(short_code[0], 36) would reject with ValueError
    shard = zlib.crc32(short_code.encode()) % 16  # distribute across 16 S3 prefixes
    s3_key = f"qr/{shard:02d}/{short_code}.png"
    s3_client.upload_fileobj(
        buffer, QR_BUCKET, s3_key,
        ExtraArgs={'ContentType': 'image/png', 'CacheControl': 'max-age=31536000'}
    )
    return f"https://cdn.tinyurl.com/{s3_key}"
Step 6: Bulk URL Shortening API
Marketing and affiliate teams create hundreds to thousands of short URLs at once. A bulk API prevents repeated single-URL HTTP overhead.
POST /api/v1/bulk-shorten
Authorization: Bearer {api_key}
Content-Type: application/json
{
  "urls": [
    {"long_url": "https://example.com/product/1", "custom_alias": "prod1"},
    {"long_url": "https://example.com/product/2"},
    ...
  ],
  "generate_qr": true,
  "expires_at": "2027-01-01"
}
Response (synchronous for <= 100 URLs, async for > 100):
{
  "job_id": "bulk_abc123",
  "status": "processing",   // or "complete" for small batches
  "results": [...]          // populated when status = "complete"
}
Polling:
GET /api/v1/bulk/{job_id}/status
Response:
{
  "job_id": "bulk_abc123",
  "status": "complete",
  "total": 1000,
  "completed": 998,
  "failed": 2,
  "results": [
    {"input_url": "...", "short_url": "https://tinyurl.com/abc1234", "qr_url": "..."},
    ...
  ]
}
For batches over 100 URLs, the application server inserts all records via a single batch INSERT, dispatches one Kafka event per URL to the qr-generate topic, and returns the job_id immediately. Workers process QR generation in parallel (fan-out by short_code).
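The sync/async split can be sketched as a small planning function. The thresholds come from the text (100 for synchronous handling, 1000 as the batch cap); the function name, the returned "mode" field, and the job_id format beyond the "bulk_" prefix are assumptions for illustration.

```python
import uuid

SYNC_LIMIT = 100    # batches up to this size are processed inline
MAX_BATCH = 1000    # hard cap from the functional requirements

def plan_bulk_request(urls: list) -> dict:
    """Decide how a bulk-shorten request is handled and build the initial response."""
    if not urls:
        raise ValueError("batch is empty")
    if len(urls) > MAX_BATCH:
        raise ValueError(f"batch exceeds {MAX_BATCH} URLs")
    job_id = f"bulk_{uuid.uuid4().hex[:12]}"
    if len(urls) <= SYNC_LIMIT:
        # Small batch: the caller shortens inline and fills in results before replying
        return {"job_id": job_id, "mode": "sync", "status": "complete"}
    # Large batch: the caller batch-inserts, enqueues QR events, and replies at once
    return {"job_id": job_id, "mode": "async", "status": "processing"}

small = plan_bulk_request(["https://example.com/1"] * 100)
large = plan_bulk_request(["https://example.com/1"] * 101)
print(small["status"], large["status"])
```

Validating the batch size before any database work keeps an oversized request from consuming a connection or a Kafka partition it will never be allowed to use.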
Step 7: System Architecture
[Client Browser / API Consumer]
        |
        |  GET /abc1234 (redirect)
        |  POST /api/v1/shorten (creation)
        v
[CloudFront CDN]
  (caches 301 redirects for static/non-expiring URLs)
  (serves QR PNGs from S3 origin)
        |
[Application Load Balancer]
        |
[URL Service (stateless, N instances)]
   |          |                 |
   |          v                 v
   |   [Redis Cluster]   [Bloom Filter (Redis BITFIELD)]
   |   (short->long cache, TTL 24h)
   |
   v
[MySQL Cluster (4 shards)]
  Shard N holds rows where CRC32(short_code) % 4 = N (N = 0..3)
  (same rule as shard_key in the schema; vanity aliases hash like any other code)
  (each shard: 1 primary + 2 read replicas)
   |
   v
[Kafka]
  topic: qr-generate (QR worker pool)
  topic: click-events (analytics consumer)
   |
   v
[QR Worker Pool] -> [S3] -> [CloudFront QR CDN]
[Analytics Consumer] -> [ClickHouse]
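Shard routing by CRC32(short_code) % 4, as defined in the schema's shard_key column, can be sketched in the application layer as follows. The helper name is hypothetical, and the sketch assumes that Python's zlib.crc32 and MySQL's CRC32() produce the same standard CRC-32 value for ASCII input, which is worth verifying against the actual database before relying on it.

```python
import zlib
from collections import Counter

NUM_SHARDS = 4

def shard_for(short_code: str) -> int:
    """Map a short code to a shard index in [0, NUM_SHARDS)."""
    return zlib.crc32(short_code.encode("ascii")) % NUM_SHARDS

# Rough look at how evenly sample codes spread across the shards.
sample_codes = [f"c{i:06d}" for i in range(1000)]
distribution = Counter(shard_for(code) for code in sample_codes)
print(sorted(distribution.items()))
```

Routing depends only on the code itself, so any stateless URL Service instance can find the right shard without a lookup table; the cost is that changing NUM_SHARDS moves most keys to a different shard.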
Step 8: Redirect Flow
The redirect path must be the fastest path in the system. Every millisecond of latency affects all redirect traffic.
GET /abc1234
1. CloudFront edge cache check
   Hit (cached 301): immediate redirect, 0 server involvement
   Miss: forward to origin
2. URL Service: Redis GET "url:abc1234"
   Hit: HTTP 302, Location: {long_url}, <5ms total
   Miss: continue
3. MySQL read replica (shard = CRC32(short_code) % 4):
   SELECT long_url, expires_at, is_active
   FROM url_mappings
   WHERE short_code = 'abc1234'
   Not found: HTTP 404
   Found, expired: HTTP 410 Gone
   Found, inactive: HTTP 410 Gone
   Found, active: cache in Redis (TTL 24h), HTTP 302
4. Async (non-blocking): publish to Kafka click-events
   {short_code, timestamp, ip_country, referrer, user_agent}
   Do NOT block redirect for analytics write
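Steps 2 and 3 of the flow above follow the cache-aside pattern, which can be sketched with plain dictionaries standing in for Redis and the MySQL replica. The function name and record layout are assumptions for illustration; the status codes are the ones listed in the flow.

```python
from datetime import datetime, timedelta, timezone

def resolve(short_code, cache, db, now=None):
    """Return (http_status, location) for a redirect request."""
    now = now or datetime.now(timezone.utc)
    long_url = cache.get(short_code)
    if long_url is not None:
        return 302, long_url                      # cache hit
    row = db.get(short_code)                      # cache miss: fall back to the DB
    if row is None:
        return 404, None
    expired = row["expires_at"] is not None and row["expires_at"] <= now
    if expired or not row["is_active"]:
        return 410, None
    cache[short_code] = row["long_url"]           # populate the cache for later requests
    return 302, row["long_url"]

now = datetime(2026, 6, 1, tzinfo=timezone.utc)
db = {
    "abc1234": {"long_url": "https://example.com/a", "expires_at": None, "is_active": True},
    "old0001": {"long_url": "https://example.com/b",
                "expires_at": now - timedelta(days=1), "is_active": True},
}
cache = {}
print(resolve("abc1234", cache, db, now))
print(resolve("old0001", cache, db, now))
print(resolve("missing", cache, db, now))
```

One detail the dictionary version hides: with a fixed 24h cache TTL, a link that expires sooner could keep redirecting from cache, so a real implementation would likely cap the cache TTL at the link's remaining lifetime.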
301 vs 302 decision for TinyURL:
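TinyURL historically used 301 (permanent) redirects for caching efficiency. Browsers and CDNs cache a 301, so repeat clicks never reach the server and cannot be counted. If click analytics are required -- which TinyURL Pro offers -- the service switches to 302 for that short URL. The application layer tracks this per URL in a redirect_type column on url_mappings (302 for pro users, 301 for free/anonymous); that column would need to be added to the Step 3 schema, which does not include it.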
Step 9: Multi-Region Architecture
TinyURL serves global traffic. Single-region architecture introduces latency for users far from the origin.
Active-Active Multi-Region:
US-EAST (primary writes)      EU-WEST (primary writes)      AP-SOUTH (primary writes)
MySQL primary                 MySQL primary                 MySQL primary
Redis cluster                 Redis cluster                 Redis cluster
      |                             |                             |
      |<--- async replication ----->|<--- async replication ----->|
GeoDNS: routes users to nearest region's ALB
  US users -> US-EAST
  EU users -> EU-WEST
  APAC users -> AP-SOUTH
Write consistency:
  Short URL creation goes to the user's nearest region.
  MySQL async replication propagates to other regions in < 500ms.
  During that 500ms window, a code created in US-EAST is not yet visible in EU-WEST.
Collision prevention: partition short codes by first character per region.
  US codes: start with [a-t] (20/62 chars, about 1/3 of space)
  EU codes: start with [u-z][A-N] (20/62 chars)
  AP codes: start with [O-Z][0-9] (remaining 22/62 chars)
This partitions the code space, eliminating cross-region collisions for randomly generated codes. Vanity aliases are chosen by the user and fall outside the partition, so they still need a single authoritative region or a cross-region conflict check.
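The first-character partition can be sketched as a small change to the random generator: the first character is drawn from the region's slice of the alphabet and the remaining six from the full alphabet. The region keys and function name are assumptions for illustration; the character ranges are the ones listed above.

```python
import secrets

CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CODE_LENGTH = 7

# Disjoint first-character ranges per region (region names are illustrative).
REGION_FIRST_CHARS = {
    "us-east": CHARS[0:20],    # a-t
    "eu-west": CHARS[20:40],   # u-z, A-N
    "ap-south": CHARS[40:62],  # O-Z, 0-9
}

def generate_regional_code(region: str) -> str:
    """Random 7-char code whose first character identifies the creating region."""
    first = secrets.choice(REGION_FIRST_CHARS[region])
    rest = "".join(secrets.choice(CHARS) for _ in range(CODE_LENGTH - 1))
    return first + rest

for region in REGION_FIRST_CHARS:
    print(region, generate_regional_code(region))
```

Each region keeps 20 or 22 of the 62 first characters, so its usable space is about a third of 62^7; collisions within a region are still possible and still go through the Bloom filter, unique constraint, and retry path.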
Tradeoffs Summary
| Decision | Choice | Tradeoff |
|---|---|---|
| Code generation | Random base62, 7 chars | Simple distribution, requires collision handling |
| Collision defense | Bloom filter + DB unique constraint + retry | 3-layer defense, minimal overhead at scale |
| Vanity URLs | DB unique constraint + reserved word list | Extra DB check on creation, O(1) lookup |
| QR generation | Sync (single) / async (bulk) | Sync adds 50ms to single creation; async decouples bulk |
| Redirect type | 302 default (pro), 301 (free/anon) | 302 enables analytics; 301 reduces server load |
| Multi-region | Code space partitioning by region | Eliminates cross-region collisions, slightly reduces per-region code space |
| Sharding | CRC32(short_code) % 4 | Predictable shard routing, rebalancing requires rehashing |
TinyURL vs bit.ly: Architecture Comparison
Both are URL shorteners. Their architectural choices diverge at the ID generation layer and cascade into different tradeoffs throughout the system.
TinyURL chooses random codes because they are easier to distribute without a centralized counter service. Any server in any region can generate a code without coordinating with other servers. The cost is collision handling, which requires a Bloom filter and retry logic.
bit.ly chooses sequential IDs encoded as base62. This eliminates collisions entirely: a central counter (or distributed Snowflake ID generator) issues unique IDs, and base62 encoding converts the integer to a short string. The cost is that all creation requests must contact the counter service, which becomes a coordination bottleneck at very high write rates.
For most production scales, both approaches work. At extreme write rates (above 100K creates/sec), bit.ly's approach needs a distributed counter (Snowflake or ticket server) to avoid the counter becoming a single point of failure. TinyURL's approach scales horizontally without coordination at the cost of the Bloom filter memory footprint.
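For contrast with the random approach, the sequential scheme described above reduces to base62-encoding a counter value. This is a generic sketch, not bit.ly's actual implementation: the alphabet order is the CHARS string used earlier in this article, and the counter is simply passed in as an integer.

```python
CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def base62_encode(n: int) -> str:
    """Encode a non-negative integer ID as a base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return CHARS[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(CHARS[rem])
    return "".join(reversed(digits))

def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + CHARS.index(ch)
    return n

for counter_value in (0, 61, 62, 62 ** 6):
    print(counter_value, base62_encode(counter_value))
```

Distinct counter values always map to distinct codes, which is why this scheme needs no collision handling; the price is that codes are guessable in sequence and that every creation depends on the counter being available.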
Methodology applied to this article · last verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.