Don’t make this OUTER JOIN mistake!

SQL Server LEFT JOIN mistake featured image
Reading Time: 4 minutes

The OUTER JOIN operations, such as LEFT JOIN and RIGHT JOIN, are very important to know when it comes to querying Microsoft SQL Server databases. There is one important factoid you should know if you are using an OUTER JOIN in combination with the IS NULL predicate.



Let’s create a bit of data to work with. We’ll start with a simple Books table:

CREATE TABLE Books
(
BookID INT,
Title VARCHAR(50),
Author VARCHAR(30),
Pages SMALLINT
)

INSERT INTO Books (BookID, Title, Author, Pages)
VALUES (100, 'Brain On Fire', 'Cahalan', 288),
(110, 'Fight Club', 'Palahniuk', 224),
(120, 'The Road', 'McCarthy', 287),
(130, 'The Hobbit', 'Tolkien', 304),
(140, 'Deep Work', 'Newport', 283),
(150, 'Stop Acting Rich', 'Stanley', 288)

Then we’ll create a Sales table:

CREATE TABLE Sales
(
SaleID INT IDENTITY NOT NULL,
BookID INT NOT NULL,
OrderDate DATETIME NULL,
CustomerID VARCHAR(3) NULL
)

INSERT INTO Sales (BookID, OrderDate, CustomerID)
VALUES (110, '2023-11-03', 'GH1'),
(100, '2023-11-12', 'KR1'),
(100, '2023-11-19', 'BM1'),
(140, NULL, 'HW1'),
(130, NULL, 'LP1'),
(140, '2023-12-05', 'AN1')

Here’s what the table content should look like:

SQL Server LEFT JOIN mistake the data

The idea of the Sales table is to store details about each book sale we make. Notice it has a BookID column, which is meant to be a foreign link back to the Books table to tell us what book was purchased.

What if we wanted to find out what books have never been sold? We would basically want to know what books don’t have a reference in the Sales table.

We can start with a LEFT JOIN query, like the following:

SELECT B.BookID, B.Title, B.Author, S.SaleID, S.OrderDate
FROM Books AS B
LEFT JOIN Sales AS S
ON B.BookID = S.BookID

This gives us the following result set:

SQL Server LEFT JOIN mistake left join query

Notice the two columns we’re pulling from the right table: SaleID and OrderDate. If we wanted to isolate which books simply don’t have a reference in the Sales table, we have the choice of using one of these columns in an IS NULL predicate.

If a book doesn’t have any sales, it won’t have a SaleID nor will it have an OrderDate. For those books, the column values will be NULL in our query. So, let’s use the SaleID column in an IS NULL predicate to isolate those outer rows:

sql server left join mistake IS NULL with SaleID

So it looks like our books “The Road” and “Stop Acting Rich” simply don’t have any sales!

There is one important point you need to understand about this query….

When you use IS NULL to isolate outer rows, you need to make sure you outline a column that is non-nullable

Earlier, I said we have the choice of using either SaleID or OrderDate in our IS NULL predicate, but truthfully, only one of them is the correct column to use.

We want to make sure we use SaleID because it is non-nullable. We need to think about our earlier result set (which doesn’t use IS NULL yet):

SQL Server LEFT JOIN mistake left join query

We understand that the SaleID column in our Sales table is non-nullable, yet NULL appears twice in that column in this result set.

In this case, NULL is a placeholder since there is simply nothing to put in that column for the books “The Road” and “Stop Acting Rich” because nobody has bought those two books.

But consider the OrderDate column, which is also from the Sales table. For our books “The Road” and “Stop Acting Rich“, we see the placeholder NULLs also.

But also for the OrderDate column, there are instances where NULL is not a placeholder, but the actual value in the table. This appears to be the case for Sale # 4 and # 5.

Using OrderDate in our IS NULL predicate, therefore, would be the WRONG thing to do!:

SQL Server LEFT JOIN mistake using nullable column in query

If we were trying to gather what books haven’t been sold, this list of books is an incorrect representation.

You want to pick a column that is non-nullable to use in your IS NULL predicate because then you know if NULL appears in the column, it represents a placeholder and not the actual value in the column!



Next steps:

Leave a comment if you found this tutorial helpful!

Check out the full beginner-friendly tutorial on NULL to learn more about the IS NULL predicate and other factoids you should know:

SQL Server NULL: Are you making these 7 mistakes?

Also, learn why we need to use IS NULL and not simply ” = NULL” in the following tutorial:

SQL Server IS NULL: Why can’t I just use “= NULL”?



Thank you very much for reading!

Make sure you subscribe to my newsletter to receive special offers and notifications anytime a new tutorial is released!

If you have any questions, or if you are struggling with a different topic related to SQL Server, I’d be happy to discuss it. Leave a comment or visit my contact page and send me an email!

Related Post

Leave a Reply

Your email address will not be published. Required fields are marked *