A backend engineer's journey of learning and growth.
by kan01234
When we write a query like:
SELECT * FROM users WHERE id = 123;
MySQL (with the InnoDB storage engine) doesn’t just “magically” find the row. Under the hood, it performs a well-optimized journey: navigating a B+Tree index, diving into a file page, using binary search with a page directory, and finally walking a linked list of records to reach the exact row.
In this post, we’ll peel back the layers of how InnoDB index lookups really work. By the end, you’ll understand not just that indexes speed up queries, but how MySQL finds your data so efficiently.
InnoDB stores indexes as a B+Tree. Each node is a 16KB page, and the tree ensures that:
Think of the B+Tree as a map that narrows down your search step by step.
[50 | 100 | 150] <-- Root (internal node)
/ | \
[10..49] [51..99] [101..199] <-- Child nodes (leaf pages)
If you search for id=123, the traversal goes:
Each node in the B+Tree corresponds to a 16KB file page inside InnoDB’s tablespace. When InnoDB traverses the B+Tree:
This is why index access patterns are crucial: well-designed indexes reduce unnecessary random I/O.
Once InnoDB lands on the correct leaf page, it doesn’t immediately know where your key is. Each page maintains its entries in a sorted linked list by key.
Page [101..199]
┌─────────────────────────────────┐
│ 101 -> 110 -> 123 -> 150 -> 180 │
└─────────────────────────────────┘
To optimize, InnoDB uses a page directory (slots at the end of the page) to speed up binary search instead of scanning the list one by one.
So the search process inside a page is:
Finally, when the matching key is found:
For a clustered index (PRIMARY KEY): The record itself (all columns) is stored in the leaf page entry. The search stops here.
For a secondary index: The leaf entry stores the secondary key + a pointer (the PRIMARY KEY). InnoDB must then do another lookup in the clustered index to fetch the full row.
So, fetching a record from a secondary index is effectively:
Secondary B+Tree lookup → Clustered B+Tree lookup → Row data.
To tie it together, here’s how a single leaf page is structured:
Leaf Page (16KB)
+-------------------------------+
| Infimum record |
| Record: id=101 |
| Record: id=110 |
| Record: id=123 | <-- linked list of user records
| Record: id=150 |
| Record: id=180 |
| Supremum record |
+-------------------------------+
| Free space (for inserts) |
+-------------------------------+
| Page Directory (array of slots)
| slot0 -> points to 101
| slot1 -> points to 123
| slot2 -> points to 180
+-------------------------------+
Lookup process inside the page:
Here’s the whole journey put together:
Query: SELECT * FROM users WHERE id = 123;
[B+Tree Root Page]
│
▼
[Leaf Page containing 123]
│
▼
[Page Directory → Linked List]
│
▼
[Record: id=123, name="Alice", age=30]
For a secondary index:
[B+Tree (Secondary Index)] → find key "email=alice@example.com"
│
▼
[Pointer to Primary Key (id=123)]
│
▼
[Clustered Index B+Tree (PRIMARY KEY)]
│
▼
[Record: id=123, name="Alice", age=30]
To recap, here’s how InnoDB uses indexes to fetch an exact record:
✅ In short: Binary search (B+Tree) → Binary search (Page Directory) → Linked List iteration → Record
Key Takeaways
Understanding this flow explains why good index design is critical for query performance.
tags: mysql, - performance, - data-structures, - innodb