kan01234 - Software Engineer Notes

Logo

A backend engineer's journey of learning and growth.

View the Project on GitHub kan01234/post

25 May 2025

Debunking the Myth: Yes, MySQL Can Handle 20 Million+ Records in a Single Table

by kan01234

đź’ˇ Debunking the Myth: Yes, MySQL Can Handle 20 Million+ Records in a Single Table

🚀 Introduction

One of the most common concerns I hear when building data-intensive applications is:

"Can MySQL really handle tens of millions of rows in a single table?"

Spoiler: Yes, it can. In fact, MySQL with the InnoDB storage engine can handle hundreds of millions of records — and beyond — without breaking a sweat.

Let’s dig into why this works under the hood, and why you shouldn’t worry about 20M records in a well-designed MySQL table.

đź§± Understanding InnoDB and B+Tree Storage

MySQL’s InnoDB engine stores table data in a structure called a B+Tree, especially when you’re using a primary key or clustered index.

Each leaf node in the B+Tree stores actual row data, while internal nodes store index entries — just enough info to guide the search down to the right leaf page.

Here’s how data is structured with default settings:

📊 Visualization of a B+Tree (Height = 3)

                          [ Root Page ]
                          +-----------+
                          | key | ptr |
                          | ...       |  âžś Up to ~1170 entries
                          +-----------+
                               |
      -------------------------------------------------------
      |                 |                 |                 |
[ Internal Page ]  [ Internal Page ]  [ Internal Page ]  ...
+-----------+      +-----------+      +-----------+
| key | ptr |      | key | ptr |      | key | ptr |
| ...       |      | ...       |      | ...       |
+-----------+      +-----------+      +-----------+
      |                 |                 |
  ------------      ------------      ------------
  |    |    |        |    |    |      ... (Leaf Pages)
[Leaf][Leaf][Leaf]  [Leaf][Leaf][Leaf]

Each Leaf Page:
+--------------------+
| Row #1 (1KB)       |
| Row #2 (1KB)       |
| ... (up to 16 rows)|
+--------------------+

🌱 Scaling Capacity with Height

Let’s break it down:

Tree Height Leaf Pages Max Records (1KB each)
2 1170 18,720
3 1170² = 1.37M 21.9M
4 1170Âł = 1.6B ~25.6B

🔍 Note: All rows are accessed through the leaf layer. The more records you store, the taller your tree — but tree height increases slowly due to exponential fan-out.

🌳 How Tree Height Affects Capacity

B+Trees scale exponentially with depth. Here’s what that looks like in practice:

Tree Height Internal Fan-out Leaf Pages Records per Page Max Rows
2 1,170 1,170 16 18,720
3 1,170² 1.36M 16 21.9M
4 1,170Âł 1.6B 16 ~25.6B

✅ 20 million records? That’s just a height-3 B+Tree — easily manageable with only 3 page reads per lookup (if no caching is used).

📉 Disk I/O vs Performance

Tree height directly affects I/O, as each level requires reading one page from disk:

Even with 20M+ rows, your B+Tree may only be 3 levels tall. That’s 3 page reads per lookup — and in real systems, these pages are often cached in memory via the InnoDB Buffer Pool.

So performance can remain snappy — especially with a good primary key design and indexes.

đź§Ş Real-World Experience

I’ve personally worked on MySQL tables with:

There was no need for sharding or exotic databases — just solid schema design and InnoDB’s defaults.

🔍 Summary: No, There’s No 20M Row Limit

Let’s bust the myth for good.

❌ "MySQL can't handle 20 million rows in a single table."

âś… Truth: MySQL (with InnoDB) can handle hundreds of millions of rows, thanks to B+Tree indexing, 16KB pages, and efficient I/O design.

If you’ve hit a performance bottleneck, it’s more likely due to:

Not because MySQL has some arbitrary row limit — it doesn’t.

tags: database - mysql - innodb