Why UUIDs Beat Integers as Primary Keys (And Why Performance Isn’t the Issue)
by darren horrocksOne of the first schema decisions you face when designing a database table is:
Should I use an INT or a UUID as the primary key?
Most developers default to an auto-incrementing integer. It’s simple, compact, and familiar. But UUIDs (a.k.a. GUIDs) are increasingly popular — and many of the old performance objections don’t hold up with modern hardware.
Let’s break this down like developers, not DB theorists.
What’s a UUID, Really?
At the end of the day, a UUID is just a number. Specifically: a 128-bit number.
Compare that to the types you already know:
| Type | Size (bits) | Size (bytes) | Max Values |
|---|---|---|---|
INT |
32 | 4 | ~4.3B |
BIGINT |
64 | 8 | ~9.2 quintillion |
UUID |
128 | 16 | ~3.4 × 10³⁸ |
That’s it. No magic. Just more space.
Why Developers Like UUIDs
1. Generate Anywhere
With integers, the database usually assigns the ID:
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50)
);
You don’t know the ID until after the row is inserted. With UUIDs, you can generate the ID in your application:
CREATE TABLE users (
id UUID PRIMARY KEY,
name VARCHAR(50)
);
Then in code:
import uuid
user_id = uuid.uuid4()
No database round trip just to get a key.
2. Merge Without Collisions
If you’ve ever tried merging data from two databases that both used AUTO_INCREMENT, you know the pain of ID collisions.
-- From DB A
id | name
---+-------
1 | Alice
2 | Bob
-- From DB B
id | name
---+-------
1 | Charlie
2 | Dana
Good luck merging those.
With UUIDs:
id | name
-------------------------------------+-------
550e8400-e29b-41d4-a716-446655440000 | Alice
ddeb27fb-d9a0-4624-be4d-4615062daed4 | Bob
d8f1f8d4-4b0f-11ee-be56-0242ac120002 | Charlie
42c9aeea-4b0f-11ee-be56-0242ac120002 | Dana
No collisions. Merge away.
3. Safer URLs
Ever seen a URL like /user/12345?
Anyone can guess /user/12346.
UUIDs make this harder:
/user/550e8400-e29b-41d4-a716-446655440000
Much less guessable.
The Performance Myth
Here’s the common objection:
“But UUIDs are so much bigger than integers. Won’t that kill performance?”
Not really.
INT= 4 bytesBIGINT= 8 bytesUUID= 16 bytes
That’s 12 extra bytes per row compared to an INT. For perspective, a single VARCHAR(255) column can take 255 bytes — orders of magnitude more than the difference between keys.
Modern CPUs handle 128-bit comparisons natively. Databases already juggle kilobytes of row data per query. The cost of comparing two UUIDs instead of two integers is microscopic compared to disk I/O, joins, or network latency.
Real-world bottlenecks are almost never the 16-byte key.
Things to Watch Out For
That said, UUIDs aren’t free candy:
- Bigger indexes → Indexes on UUID columns take more space. If you’re storing billions of rows, that matters.
- Fragmentation → Random UUIDs don’t insert sequentially. This can cause page splits in clustered indexes. Solutions: use UUIDv1 or database-specific functions like
UUID_TO_BIN()(MySQL) that produce more sequential values. - Readability → Debugging
12345is easier than550e8400-e29b-41d4-a716-446655440000. Most teams solve this with tooling/log formatting.
Rule of Thumb for Developers
-
Use
INTorBIGINTif:- Your app is small and won’t need cross-database merges.
- You want compact storage and don’t care about guessable IDs.
-
Use
UUIDif:- You’re building distributed or microservice systems.
- You need to merge data from multiple sources.
- You want to generate keys outside the database.
- You want harder-to-guess identifiers.
Benchmark: INT vs UUID in PostgreSQL
Let’s test with two identical tables:
-- INT-based table
CREATE TABLE users_int (
id SERIAL PRIMARY KEY,
name TEXT
);
-- UUID-based table
CREATE TABLE users_uuid (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT
);
We’ll insert 1 million rows into each:
-- Insert into INT table
INSERT INTO users_int (name)
SELECT 'user_' || g
FROM generate_series(1, 1000000) g;
-- Insert into UUID table
INSERT INTO users_uuid (name)
SELECT 'user_' || g
FROM generate_series(1, 1000000) g;
Query Performance
Point lookup by primary key:
EXPLAIN ANALYZE
SELECT * FROM users_int WHERE id = 500000;
Output (abridged):
Index Scan using users_int_pkey on users_int
(cost=0.42..8.44 rows=1 width=36)
(actual time=0.018 ms)
Now with UUID:
EXPLAIN ANALYZE
SELECT * FROM users_uuid
WHERE id = '550e8400-e29b-41d4-a716-446655440000';
Output (abridged):
Index Scan using users_uuid_pkey on users_uuid
(cost=0.42..8.44 rows=1 width=52)
(actual time=0.019 ms)
The difference? Basically zero. Both queries run in ~0.02ms.
Index Size
Let’s compare index sizes:
SELECT
relname AS index_name,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_all_indexes
WHERE relname LIKE 'users_%_pkey';
Sample output:
| index_name | index_size |
|---|---|
| users_int_pkey | 35 MB |
| users_uuid_pkey | 70 MB |
Yes, the UUID index is bigger (about double, which makes sense: 16 bytes vs 8). But even at 1 million rows, we’re only talking tens of MB — trivial on modern systems.
Takeaway for Developers
- Lookup speed: Identical.
- Insert speed: Nearly identical (UUIDs may fragment clustered indexes if totally random, but UUIDv1/sequential UUIDs solve this).
- Storage: UUIDs take more index space, but unless you’re working with billions of rows, it’s not the bottleneck.
The actual performance bottlenecks will almost always be query design, joins, and I/O — not whether your primary key is 4 bytes or 16.
So when someone says “UUIDs are slow”, you can point at query plans and actual numbers: they’re not.
TL;DR
- A UUID is just a 128-bit number, the same way an
INTis a 32-bit number. - The performance hit is negligible on modern hardware.
- UUIDs unlock distributed key generation, collision-free merges, and safer URLs.
- Integers are fine for simple, single-database apps — but UUIDs future-proof your schema.