25 May 2025

Debunking the Myth: Yes, MySQL Can Handle 20 Million+ Records in a Single Table

by kan01234

💡 Debunking the Myth: Yes, MySQL Can Handle 20 Million+ Records in a Single Table

🚀 Introduction

One of the most common concerns I hear when building data-intensive applications is:

"Can MySQL really handle tens of millions of rows in a single table?"

Spoiler: Yes, it can. In fact, MySQL with the InnoDB storage engine can handle hundreds of millions of records — and beyond — without breaking a sweat.

Let’s dig into why this works under the hood, and why you shouldn’t worry about 20M records in a well-designed MySQL table.

🧱 Understanding InnoDB and B+Tree Storage

MySQL’s InnoDB engine stores table data in a structure called a B+Tree, especially when you’re using a primary key or clustered index.

Each leaf node in the B+Tree stores actual row data, while internal nodes store index entries — just enough info to guide the search down to the right leaf page.

Here’s how data is structured with default settings:

Page size: 16KB
Row size: ~1KB (assumed for this example)
Each leaf page: can store ~16 records
Each internal node entry: 8-byte key + 6-byte pointer = 14 bytes
Each internal node: fits ~1170 child pointers

📊 Visualization of a B+Tree (Height = 3)

                          [ Root Page ]
                          +-----------+
                          | key | ptr |
                          | ...       |  ➜ Up to ~1170 entries
                          +-----------+
                               |
      -------------------------------------------------------
      |                 |                 |                 |
[ Internal Page ]  [ Internal Page ]  [ Internal Page ]  ...
+-----------+      +-----------+      +-----------+
| key | ptr |      | key | ptr |      | key | ptr |
| ...       |      | ...       |      | ...       |
+-----------+      +-----------+      +-----------+
      |                 |                 |
  ------------      ------------      ------------
  |    |    |        |    |    |      ... (Leaf Pages)
[Leaf][Leaf][Leaf]  [Leaf][Leaf][Leaf]

Each Leaf Page:
+--------------------+
| Row #1 (1KB)       |
| Row #2 (1KB)       |
| ... (up to 16 rows)|
+--------------------+

🌱 Scaling Capacity with Height

Let’s break it down:

Leaf node stores: 16 rows/page
Internal node fans out to: 1170 children

Tree Height	Leaf Pages	Max Records (1KB each)
2	1170	18,720
3	1170² = 1.37M	21.9M
4	1170³ = 1.6B	~25.6B

🔍 Note: All rows are accessed through the leaf layer. The more records you store, the taller your tree — but tree height increases slowly due to exponential fan-out.

🌳 How Tree Height Affects Capacity

B+Trees scale exponentially with depth. Here’s what that looks like in practice:

Tree Height	Internal Fan-out	Leaf Pages	Records per Page	Max Rows
2	1,170	1,170	16	18,720
3	1,170²	1.36M	16	21.9M
4	1,170³	1.6B	16	~25.6B

✅ 20 million records? That’s just a height-3 B+Tree — easily manageable with only 3 page reads per lookup (if no caching is used).

📉 Disk I/O vs Performance

Tree height directly affects I/O, as each level requires reading one page from disk:

Read a record = Traverse root → internal → leaf
Insert a record = Same path, plus potential page split

Even with 20M+ rows, your B+Tree may only be 3 levels tall. That’s 3 page reads per lookup — and in real systems, these pages are often cached in memory via the InnoDB Buffer Pool.

So performance can remain snappy — especially with a good primary key design and indexes.

🧪 Real-World Experience

I’ve personally worked on MySQL tables with:

50M+ rows
Sub-second query performance (with the right indexes)
High insert/update throughput

There was no need for sharding or exotic databases — just solid schema design and InnoDB’s defaults.

🔍 Summary: No, There’s No 20M Row Limit

Let’s bust the myth for good.

❌ "MySQL can't handle 20 million rows in a single table."

✅ Truth: MySQL (with InnoDB) can handle hundreds of millions of rows, thanks to B+Tree indexing, 16KB pages, and efficient I/O design.

If you’ve hit a performance bottleneck, it’s more likely due to:

Missing indexes
Unoptimized queries
Poor schema design
Insufficient memory for buffer pool

Not because MySQL has some arbitrary row limit — it doesn’t.

tags: database - mysql - innodb