Design a URL Shortener (TinyURL)
Step-by-step guide to the TinyURL system design interview question: requirements, capacity estimation, base62 encoding, database schema, caching, and the trade-offs interviewers probe. Includes an interactive encoding simulator.
Why interviewers ask this question
A URL shortener takes a long URL and returns a short alias (like bit.ly/3xYzAbc) that redirects to the original. It looks trivial — and that is exactly why interviewers love it. In 45 minutes it tests whether you can clarify requirements, estimate scale, design a clean API, pick a database with a reason, and handle a read-heavy workload with caching.
It is the most common warm-up question at every level. Junior candidates are expected to get a working design; senior candidates are expected to nail the trade-offs: encoding strategy, collision handling, and what happens at 100M redirects per day.
The 30-second answer
Step 1 — Requirements
Functional requirements
- Given a long URL, generate a short, unique alias.
- Visiting the short URL redirects to the original URL.
- Links can have an expiration time (default + user-specified).
- (Clarify) Custom aliases — nice-to-have, easy to support.
- (Clarify) Click analytics — call it out, then explicitly de-scope it.
Non-functional requirements
- High availability. If redirection is down, every shared link on the internet breaks.
- Low latency on redirects — this is on the critical path of a user click.
- Short codes should not be predictable (guessing sequential codes leaks private links).
- Read-heavy: assume reads outnumber writes by 20:1 to 100:1 (the Step 2 estimates of 100M redirects and 5M new URLs per day work out to 20:1).
Step 2 — Capacity estimation
Assume a bit.ly-scale service:
- Writes: 5M new URLs/day ≈ 5M / 86,400s ≈ ~60 writes/sec (peak ~300/sec).
- Reads: 100M redirects/day ≈ ~1,200 reads/sec (peak ~6,000/sec).
- Storage: 5M/day × 365 × 5 years ≈ 9B records. At ~500 bytes each ≈ ~4.5 TB — small enough that storage is not the hard part.
- Cache: follow the 80/20 rule — 20% of URLs drive 80% of traffic. Caching one day of hot redirects ≈ 20M × 500B ≈ ~10 GB. That fits on a single Redis node.
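The arithmetic above is easy to reproduce in a few lines. This is a minimal sketch using only the figures already stated; the variable names are illustrative, and the 500-byte record size and 20% hot fraction are the same assumptions as in the bullets.

```python
SECONDS_PER_DAY = 86_400

# Inputs taken from the estimates above (all of them assumptions).
writes_per_day = 5_000_000      # new URLs per day
reads_per_day = 100_000_000     # redirects per day
record_bytes = 500              # assumed average size of one mapping
retention_years = 5
hot_fraction = 0.20             # 80/20 rule: cache 20% of a day's requests

writes_per_sec = writes_per_day / SECONDS_PER_DAY
reads_per_sec = reads_per_day / SECONDS_PER_DAY
total_records = writes_per_day * 365 * retention_years
storage_tb = total_records * record_bytes / 1e12
cache_gb = reads_per_day * hot_fraction * record_bytes / 1e9

print(f"writes/sec: {writes_per_sec:.0f}")
print(f"reads/sec:  {reads_per_sec:.0f}")
print(f"records:    {total_records / 1e9:.1f}B")
print(f"storage:    {storage_tb:.1f} TB")
print(f"cache:      {cache_gb:.0f} GB")
```

The exact values (about 58 writes/sec and 1,157 reads/sec) round to the ~60 and ~1,200 quoted above; in an interview the rounded figures are what matter.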
Why bother with numbers?
Step 3 — API design
Two endpoints cover the core product: create a short URL, and redirect.
POST /api/v1/urls
{
  "longUrl": "https://example.com/very/long/path?with=params",
  "customAlias": "launch2026",  // optional
  "expiresAt": "2027-01-01"     // optional
}

201 Created
{ "shortUrl": "https://tny.app/launch2026" }

GET /{shortCode}

302 Found
Location: https://example.com/very/long/path?with=params

301 vs 302: a classic follow-up
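To make the two endpoints and the redirect status choice concrete, here is a minimal sketch written as plain functions over an in-memory dict rather than a real web framework. The `Response` type, the function names, and the 409 and 404 error responses are assumptions for illustration; only the 201, 301, and 302 behaviour comes from the API described here.

```python
from dataclasses import dataclass, field

BASE_URL = "https://tny.app"  # host from the example response


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: dict = field(default_factory=dict)


def create_url(store: dict, long_url: str, short_code: str) -> Response:
    """POST /api/v1/urls: save the mapping without overwriting an existing code."""
    if short_code in store:
        # Assumed behaviour: reject a taken alias with 409 Conflict.
        return Response(409, body={"error": "alias already taken"})
    store[short_code] = long_url
    return Response(201, body={"shortUrl": f"{BASE_URL}/{short_code}"})


def redirect(store: dict, short_code: str, permanent: bool = False) -> Response:
    """GET /{shortCode}: answer with a redirect to the stored long URL."""
    long_url = store.get(short_code)
    if long_url is None:
        # Assumed behaviour: unknown codes return 404 Not Found.
        return Response(404, body={"error": "unknown short code"})
    # 302 keeps every click flowing through the service; 301 lets browsers
    # cache the redirect and skip the service on later visits.
    status = 301 if permanent else 302
    return Response(status, headers={"Location": long_url})
```

Defaulting to 302 here reflects the analytics argument: once a browser has cached a 301, the service no longer sees that user's clicks.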
Step 4 — Generating the short code (the core of the interview)
This is where the interview is won or lost. You need a function that maps each long URL to a short, unique, hard-to-guess code.
How long should the code be? With base62 (a-z, A-Z, 0-9), a 7-character code gives 62^7 ≈ 3.5 trillion combinations — enough for ~9B URLs over 5 years with lots of headroom.
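The code-length reasoning can be checked directly. This is a small sketch; `min_code_length` is a hypothetical helper name, and the 9B figure is the five-year record count from the capacity estimate.

```python
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9


def min_code_length(n_urls: int, base: int = ALPHABET_SIZE) -> int:
    """Smallest code length whose keyspace holds at least n_urls codes."""
    length = 1
    while base ** length < n_urls:
        length += 1
    return length


for length in (6, 7):
    print(f"{length} chars -> {ALPHABET_SIZE ** length:,} codes")

print("minimum length for 9B URLs:", min_code_length(9_000_000_000))
```

Six characters are technically enough for 9B records, so the seventh character is headroom: it keeps the keyspace sparse, which matters for both collision rates and guessability.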
There are three standard approaches:
| Approach | How it works | Pros | Cons |
|---|---|---|---|
| Hash + truncate | MD5/SHA the long URL, take the first 7 base62 chars | Stateless, same URL → same code | Collisions need retry loops; birthday paradox bites at scale |
| Auto-increment counter + base62 | Global counter; encode the ID in base62 | Zero collisions, trivially simple | Sequential codes are guessable; counter is a single point of contention |
| Key Generation Service (KGS) | Pre-generate random unique keys offline; hand them out from a pool | No collisions at request time, not guessable, fast | One more service to run; keys must be marked used atomically |
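To show what the collision story for the first row might look like, here is a minimal sketch of hash + truncate with a retry loop. The salt-based retry, the function names, and the dict standing in for the database are assumptions; the sketch also keeps the low-order seven base62 digits of the hash rather than the first seven characters, since those are uniformly distributed.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE_LENGTH = 7


def base62(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def hash_code(long_url: str, salt: int = 0) -> str:
    """Hash the URL (plus a retry salt) and truncate to CODE_LENGTH characters."""
    digest = hashlib.sha256(f"{salt}:{long_url}".encode()).digest()
    truncated = int.from_bytes(digest, "big") % (62 ** CODE_LENGTH)
    return base62(truncated).rjust(CODE_LENGTH, ALPHABET[0])


def shorten(store: dict, long_url: str, max_attempts: int = 5) -> str:
    """Return a code for long_url, re-hashing with a new salt on collision."""
    for salt in range(max_attempts):
        code = hash_code(long_url, salt)
        # setdefault inserts only if the code is free and returns the stored URL.
        if store.setdefault(code, long_url) == long_url:
            return code  # either newly inserted or the same URL seen before
    raise RuntimeError("too many collisions; widen the code or change strategy")
```

In a real database the `setdefault` step would be a conditional insert against the primary key, so that two servers racing on the same code cannot both win.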
A strong answer: start with counter + base62 for simplicity, then address its two weaknesses. To reduce guessability, pass the ID through a reversible shuffle (or add a random offset) before encoding. To remove the contention point, shard the counter so that each server leases a range (e.g., 1–1,000,000, then 1,000,001–2,000,000) from ZooKeeper or a ticket table. Mentioning the KGS pattern as the "productionized" version signals depth.
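The counter-based approach can be sketched as follows. Everything here is illustrative: `RangeAllocator` is an in-memory stand-in for ZooKeeper or a ticket table, the multiplier and offset are arbitrary constants, and the multiplicative shuffle only scatters consecutive IDs. It is reversible obfuscation, not cryptographic protection.

```python
import math
import threading

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE_LENGTH = 7
CODE_SPACE = 62 ** CODE_LENGTH

# Hypothetical constants: any multiplier coprime with CODE_SPACE is a bijection.
MULTIPLIER = 2_654_435_761
OFFSET = 1_234_567_891
assert math.gcd(MULTIPLIER, CODE_SPACE) == 1


def base62(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def obfuscate(counter_id: int) -> int:
    """Map sequential IDs to scattered values within the 7-character keyspace."""
    return (counter_id * MULTIPLIER + OFFSET) % CODE_SPACE


def deobfuscate(value: int) -> int:
    """Invert obfuscate() using the modular inverse of MULTIPLIER."""
    return ((value - OFFSET) * pow(MULTIPLIER, -1, CODE_SPACE)) % CODE_SPACE


def short_code(counter_id: int) -> str:
    """Turn a counter value into a fixed-length, non-sequential code."""
    return base62(obfuscate(counter_id)).rjust(CODE_LENGTH, ALPHABET[0])


class RangeAllocator:
    """In-memory stand-in for the shared store that hands out ID ranges."""

    def __init__(self, range_size: int = 1_000_000):
        self._next_start = 1
        self._range_size = range_size
        self._lock = threading.Lock()

    def lease(self) -> range:
        with self._lock:
            start = self._next_start
            self._next_start += self._range_size
        return range(start, start + self._range_size)


class IdGenerator:
    """Per-server counter that touches the allocator once per leased range."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._ids = iter(())
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            value = next(self._ids, None)
            if value is None:  # current range exhausted: lease a new one
                self._ids = iter(self._allocator.lease())
                value = next(self._ids)
            return value
```

With a range size of one million, each server contacts the shared allocator once per million URLs, which is how sharding removes the contention point. A server that crashes simply abandons the rest of its range; with 3.5 trillion codes available, that waste is affordable.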
Step 5 — Database and schema
The data model is a single mapping table — this workload is a textbook key-value lookup:
CREATE TABLE urls (
  short_code VARCHAR(10) PRIMARY KEY,  -- the base62 code
  long_url   TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT now(),
  expires_at TIMESTAMP,
  user_id    BIGINT                    -- for quotas / analytics
);

SQL or NoSQL? Either works, and saying so earns points; what matters is the reasoning. There are no joins, no transactions across rows, and simple access patterns, so a key-value or wide-column store (DynamoDB, Cassandra) scales horizontally with no ceremony. A single Postgres instance also comfortably handles 60 writes/sec and, with read replicas plus the cache absorbing most reads, this never becomes the bottleneck. Pick one, justify it, and move on; spending ten minutes agonizing here is a common mistake.
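Whichever store you pick, the access pattern is one insert keyed by the code and one point lookup. Here is a minimal sketch using SQLite from the Python standard library as a stand-in; the column types are adapted to SQLite (TEXT and CURRENT_TIMESTAMP instead of VARCHAR, TIMESTAMP, and now()), and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE urls (
        short_code TEXT PRIMARY KEY,
        long_url   TEXT NOT NULL,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        expires_at TEXT,
        user_id    INTEGER
    )
    """
)


def insert_url(conn, short_code, long_url, expires_at=None, user_id=None) -> bool:
    """Insert a mapping; return False if the short code is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls (short_code, long_url, expires_at, user_id) "
                "VALUES (?, ?, ?, ?)",
                (short_code, long_url, expires_at, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key enforces uniqueness, so a duplicate code lands here.
        return False


def lookup(conn, short_code):
    """Point lookup by primary key; returns the long URL or None."""
    row = conn.execute(
        "SELECT long_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

Letting the primary key reject duplicates is what makes custom aliases and collision retries safe under concurrency: the database, not the application, decides who got the code.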
Step 6 — The redirect path: caching for 100M reads/day
The redirect path is: GET /{code} → load balancer → app server → cache → database (on miss).
- Use cache-aside: check Redis for the code; on miss, read the DB and populate the cache with a TTL.
- ~10 GB of hot mappings means the vast majority of redirects never touch the database.
- URL mappings are immutable (a code never re-points to a different URL), which makes caching simple: there is nothing to invalidate, so the hardest problem in caching does not exist here.
- For expired links, check expires_at before redirecting and return 410 Gone.
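The read path above can be sketched as a single function. This is a minimal illustration under stated assumptions: `TTLCache` is a tiny in-process stand-in for Redis, the database is a plain dict, the one-hour TTL is arbitrary, and returning 404 for unknown codes is assumed rather than taken from the design.

```python
import time


class TTLCache:
    """In-process stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self):
        self._data = {}  # key -> (value, cache_expiry_time)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            self._data.pop(key, None)  # drop stale entries on access
            return None
        return entry[0]

    def set(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)


def resolve(code, cache, db, now=None, ttl=3600):
    """Cache-aside lookup. db maps code -> (long_url, expires_at or None).

    Returns (status, location): 302 with the URL, 410 if the link has
    expired, or 404 (an assumed choice) if the code is unknown.
    """
    now = time.time() if now is None else now
    record = cache.get(code, now)
    if record is None:              # cache miss: fall through to the database
        record = db.get(code)
        if record is None:
            return 404, None
        cache.set(code, record, ttl, now)
    long_url, expires_at = record
    if expires_at is not None and expires_at <= now:
        return 410, None            # expired link: 410 Gone
    return 302, long_url
```

Caching the expiry timestamp alongside the URL lets the expiry check run on cache hits too, so a link stops redirecting at its deadline even while its entry is still cached.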
Tradeoff: Adding a cache layer
Pros:
- Sub-millisecond redirects for hot links
- Database load drops by ~90%+ for a read-heavy workload
- Immutable data means no cache invalidation complexity
Cons:
- One more component to operate and monitor
- Cold start / cache failure sends a thundering herd to the DB (mention request coalescing)
- Slightly stale reads on expiry (acceptable: seconds of staleness on a dead link)
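Request coalescing is worth being able to sketch if the interviewer pushes on the thundering-herd point. The idea is that when many requests miss the cache for the same code at once, only one of them queries the database and the rest wait for its result. Below is a minimal single-process sketch using threads; the `SingleFlight` name and structure are illustrative, and a multi-server deployment would need a different mechanism.

```python
import threading


class _Call:
    """Bookkeeping for one in-flight load shared by all waiters."""

    def __init__(self):
        self.done = threading.Event()
        self.value = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent loads of the same key into a single call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> _Call

    def do(self, key, load):
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:  # first caller for this key performs the load
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.value = load(key)
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()  # wake every waiting follower
        else:
            call.done.wait()     # followers block until the leader finishes
        if call.error is not None:
            raise call.error
        return call.value
```

On the redirect path, `load` would be the database read that runs on a cache miss, so a hot link that falls out of the cache costs one query instead of one per waiting request.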
Common mistakes that cost offers
- Jumping straight to architecture without stating the read:write ratio — the whole design follows from it.
- Hand-waving the encoding: "I'll just hash it" with no collision story is the most common failure on this question.
- Answering 301 without mentioning analytics — it signals you have not thought about why the product exists.
- Over-engineering: proposing Kafka, microservices, and multi-region for 60 writes/sec. Scale the design to the numbers you estimated.
- Forgetting expiration cleanup — a lazy check at read time plus a periodic batch delete is a perfectly good answer.
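The periodic batch delete from the last point might look like the sketch below. It assumes a `urls` table with a `short_code` primary key and an `expires_at` column stored as an integer epoch timestamp; SQLite is again a stand-in, and the function name and batch size are illustrative.

```python
import sqlite3


def purge_expired(conn: sqlite3.Connection, now: int, batch_size: int = 1000) -> int:
    """Delete expired rows in small batches; return how many were removed.

    Small batches keep each transaction short, so the cleanup job does not
    hold locks for long while redirects are being served.
    """
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cursor = conn.execute(
                """
                DELETE FROM urls
                WHERE short_code IN (
                    SELECT short_code FROM urls
                    WHERE expires_at IS NOT NULL AND expires_at <= ?
                    LIMIT ?
                )
                """,
                (now, batch_size),
            )
        if cursor.rowcount == 0:
            return total
        total += cursor.rowcount
```

Because the read path already rejects expired links lazily, this job only reclaims storage. It can run infrequently and off-peak without affecting correctness.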
Frequently asked questions
Is the TinyURL question easy?
It is rated Easy because a working design is straightforward, but it has depth: encoding strategy, collision handling, 301 vs 302, and cache design separate junior from senior answers. It is the most frequently asked system design question, so it is worth mastering completely.
How long should a short URL code be?
Seven base62 characters give about 3.5 trillion unique combinations, which covers roughly 9 billion URLs over five years with large headroom. Six characters (57 billion combinations) also works for smaller scale assumptions.
Should I use a 301 or 302 redirect for a URL shortener?
Use a 302 (temporary) redirect if you need click analytics or the ability to expire and change links, because every click passes through your servers. A 301 (permanent) lets browsers cache the redirect, which reduces server load but permanently gives up analytics and control.
Should I use SQL or NoSQL for a URL shortener?
Both are defensible. The workload is a simple key-value lookup with no joins or multi-row transactions, so a key-value store like DynamoDB scales naturally; a single relational database with read replicas and a cache in front also easily handles the write volume. Interviewers grade the justification, not the brand name.