The SQL Server “Key Lookup” operation is something you may see when looking at the execution plan of a query. Maybe you’re trying to optimize the query to run a bit faster and noticing that a lookup operation is taking a LARGE amount of the query processing time. Or maybe you’re simply trying to understand what’s happening “under the hood” when your query is executing.
In this brief tutorial, we’ll discuss what the SQL Server Key Lookup operation is and how it can give us a clue about what’s happening when our query is executing and returning data.
We’ll discuss these topics:
- What is a Key Lookup?
- A helpful analogy
- Demonstrating a Key Lookup operation
- “Including” columns in our nonclustered index will eliminate the Key Lookup operation
- Links
Let’s get into it:
1. What is a Key Lookup?
To understand what a key lookup is, we need to understand a-thing-or-two about nonclustered indexes in SQL Server. Key lookups are directly tied to nonclustered indexes.
A key lookup occurs when SQL Server is gathering data for a query via a nonclustered index. Sometimes, the nonclustered index doesn’t contain absolutely everything the query is asking for, so SQL Server needs to walk away from the nonclustered index data structure to the disk to retrieve the rest of the data.
This “walking away” from the nonclustered index data structure to disk is called a lookup.
2. A helpful analogy
There is a great analogy I always think of when it comes to understanding nonclustered indexes and lookups in SQL Server.
Imagine you walk into a bookstore and you’re looking for a specific book with the title “Great Expectations“.
Let’s say you don’t know who wrote the book. If you’ve ever been in a bookstore or a library, you know books are on the shelf in alphabetical order by author last name.
(Which is a good analogy for a clustered index)
But if we don’t know who wrote the book, having books on the shelf in order by author really doesn’t help us. We’re reduced to looking through all the books on the shelf, trying to find one that has the title “Great Expectations“.
What if this particular bookstore understands this problem and has a very helpful notebook at the front desk that contains a complete list of books in their inventory in alphabetical order by title. Next to each title is the author who wrote the book.
Using this notebook, we can easily locate our “Great Expectations” book, then see who wrote it. Once we know who wrote it, we can find the book on the shelf easily since books are ordered there by author!
The notebook represents the nonclustered index
Folks, this notebook represents the idea of a nonclustered index. Information is laid out in a data structure in a different order that may be useful to us.
Again, in order to properly understand what a lookup is, you need to know how a nonclustered index works. Check out the full tutorial if you want to learn more:
SQL Server Nonclustered Indexes: Complete guide for beginners
Okay, so how does a “lookup” fit into this analogy?
What if you weren’t interested in buying or reading our “Great Expectations” book, but simply wanted to know who wrote it and when it was published.
So you go to the notebook, find “Great Expectations” in the list (probably near the top, since the notebook contains titles in alphabetical order), and see that it was written by Dickens. Great, that satisfies the first item we wanted to gather: Who wrote it.
The other information we want is when it was published. Unfortunately, that information is not in the notebook. But since we now know it was written by Dickens, we can locate the book easily on the shelf (since books are ordered there by author, remember) and see when it was published.
We go over to our bookshelf, find Dickens, then see his “Great Expectations” book. We open it up and see that was published in 1861. Fantastic! We’ve gathered everything we wanted to know.
Folks, the idea of walking away from the notebook to the bookshelf is called a “lookup“. Not all the information we wanted was in our notebook, unfortunately, so we had to leave that and walk over to the shelf to get the rest of the information.
This is a similar process that SQL Server needs to do if it sees that not all the information a query is asking for is part of the nonclustered index. When that happens, SQL Server needs to leave the nonclustered index and go to disk to get the rest of the information.
3. Demonstrating a Key Lookup operation
Let’s actually create some data and show you what a lookup looks like. Here’s the code to create a Books table and add data to it:
CREATE TABLE Books ( BookID INT IDENTITY(10,1), Author VARCHAR(30), Title VARCHAR(50), PublicationYear INT ) INSERT INTO Books (Author, Title, PublicationYear) VALUES ('Fitzgerald', 'The Great Gatsby', 1925), ('Dickens', 'Great Expectations', 1861), ('Hemingway', 'The Old Man and the Sea', 1952), ('Morrison', 'Song of Solomon', 1977), ('Collins', 'The Hunger Games', 2008), ('Austen', 'Sense and Sensibility', 1811), ('Hugo', 'Les Misérables', 1862), ('Bradbury', 'Fahrenheit 451', 1953)
Let’s create two indexes on this table. The first is a clustered index:
CREATE CLUSTERED INDEX idx_Author ON Books(Author)
This clustered index is like books on the bookshelf. After this index is created, books will be in order on disk in alphabetical order by Author, just like how books are in alphabetical order by author on the shelves in a bookstore.
The next index will be a nonclustered index:
CREATE NONCLUSTERED INDEX idx_Title ON Books(Title)
This nonclustered index is like our notebook. It contains a list of books in alphabetical order by title. Each reference in the nonclustered index data structure tells us not only the Title of the book, but also the Author of the book.
Let’s run a query and force it to use this idx_Title index:
SELECT Author, PublicationYear FROM Books WITH (INDEX (idx_Title)) WHERE Title = 'Great Expectations'
Here’s what the output looks like:
To see the execution plan for this query, we need to enable an option is SSMS called “Include Actual Execution Plan“. It’s located here:
Once that is enabled, we can re-run our query and see the execution plan that SQL Server used to gather the data for the query:
We read the execution plan from right to left.
So SQL Server starts with an “Index Seek” operation, where it uses the idx_Title index we created to attempt to gather all the data that the query is asking for.
The nonclustered index data structure contains details about the Title of each book and the Author who wrote the book, like in our notebook analogy. But the query is asking for one more piece of information: The PublicationYear.
The PublicationYear is not presently part of the nonclustered index, so SQL Server needs to walk away from that index over to the clustered index to get that information. That process can be summarized in the “Key Lookup” operation we see in the execution plan.
(The “Nested Loops” operation is just SQL Server’s way of collecting all the data it has found and getting ready to present it in the output)
4. “Including” columns in our nonclustered index will eliminate the Key Lookup operation
Okay, now let’s talk about how to INCLUDE columns in a nonclustered index, which in this case will eliminate the Key Lookup operation.
What we need to do is drop our current index and re-create it with the new INCLUDE details. Here’s the DROP INDEX statement:
DROP INDEX idx_Title ON Books
Here’s the code to re-create the index and INCLUDE the PublicationYear column:
CREATE NONCLUSTERED INDEX idx_Title_New ON Books(Title) INCLUDE (PublicationYear)
Notice all that is different from the first time we created this index is that we added the INCLUDE keyword, followed by the column we want to include contained in parentheses.
Understand that you can INCLUDE more than one column if you want to. The list of columns in parentheses would just need to be separated by commas.
After we run that code to re-create our index, we can re-run our query. In the execution plan, we see that we have eliminated the Key Lookup operation!:
All we see is the “Index Seek” operation against our idx_Title index. This is basically telling us that SQL Server could gather everything the query is asking for from this index.
This is like if we started writing the publication year of each book in our notebook. If all we wanted to know was the Author and PublicationYear of a book, we can gather all that information from our notebook! No more “walking away” for us!
5. Links
You’ll probably find the following FREE Ebook helpful:
FREE Ebook on Sql Server Indexes!
This FREE Ebook contains absolutely everything you need to know about indexes in Microsoft SQL Server, including:
- Clustered Indexes
- Nonclustered Indexes
- Unique Indexes
- Understanding the B-tree
This Ebook will definitely be a great resource for you to keep and reference throughout your career as a data professional.
In it, we talk in great detail about nonclustered indexes and how to INCLUDE columns. It also discusses a downside to INCLUDE columns that you need to consider before adding them to your nonclustered indexes.
Download the Ebook today!
Next Steps:
Leave a comment if you found this tutorial helpful!
As mentioned earlier, you really need to understand how nonclustered indexes work in order to understand the idea of INCLUDE columns and the “Key Lookup” operation. Check out the full beginner-friendly tutorial to learn more:
Nonclustered Index in SQL Server: A Beginner’s Guide
Also, while we’re at it, you should understand how clustered indexes work, too. Check it out:
SQL Server Clustered Index: The Ultimate Guide for the Complete Beginner
Indexes are monumental towards writing efficient queries against our databases. Make sure you know how they work and how to create them!
Thank you very much for reading!
Make sure you subscribe to my newsletter to receive special offers and notifications anytime a new tutorial is released!
If you have any questions, or if you are struggling with a different topic related to SQL Server, I’d be happy to discuss it. Leave a comment or visit my contact page and send me an email!
Great explanation of Key Lookup! Thanks for this!
Thanks Mac! Glad you found it helpful.