A proper understanding of NULL is absolutely essential for writing accurate queries that return accurate results.
When you are first starting out with SQL Server, you might have these assumptions about NULL that are not quite right, or maybe flat out WRONG. If you misunderstand how SQL treats NULL, you might see results in your result set you didn’t expect.
You should always be aware of how your queries are going to treat NULLs, so that your queries return accurate data.
In this tutorial, we will list the 7 common misunderstandings people have regarding NULL, and give you the knowledge and tools you need to write accurate, meaningful queries.
Here’s the list of topics we will cover:
- Thinking NULL is the same as zero, or “nothing”
- Not understanding the result of comparison operators with NULL
- If you negate NULL, you still get NULL
- Not specifying nullability in your column definitions
- You should be accounting for NULL in your query results
- Set Operators are distinctness based, which treat NULL differently
- All aggregate functions ignore NULLs (except one…)
Don’t forget to download your FREE 1-PAGE Simple SQL Cheat Sheet on SQL Server NULL!
This guide contains all the key points you need to know about NULL condensed into a simple 1-page guide. It will be an excellent resource for you to reference during your career as a database professional. Make sure you download it today!
Starting from the top, the first (and most important) mistake is:
1. Thinking NULL is the same as zero, or “nothing”
Folks, NULL is not the same thing as zero, and it’s not “nothing” either.
It actually means “unknown“.Â
Let me ask you this question: What is Charlize Theron’s middle name? DON’T GOOGLE IT EITHER, cheater.Â
What do you mean “you don’t know”?
What if I asked you this question: Is Charlize Theron’s middle name Josephine?
You wouldn’t say “Yes” because you actually don’t know. And you wouldn’t say “No” either, because, I mean, it could be. But again, the appropriate answer is you simply don’t know.
If you say “I don’t know“, you’re also not saying she doesn’t have one. Of course she has a middle name……
But then again, some people don’t have a middle name. Maybe Charlize Theron is one of those people?
Now you’re starting to understand NULL.
Going back to SQL for a minute, let’s take the following query:
When I inserted these rows, I entered the information I knew at the time. I knew Elton John’s middle name is Hercules (seriously), and I knew Prince doesn’t have a middle or last name.
But Bruce Willis, Anne Hathaway, and Charlize Theron? I don’t know their middle names, and that’s ok. That is exactly what NULL is used for.
2. Not understanding the result of comparison operators with NULL
Let’s add an ‘Age‘ column to our Celebrities table, then add some data to it:
(By the way, I’m writing this on April 21, 2021. Ages are probably different if you are reading this in the future).
Knowing this data, let’s think about the following SELECT statement, where we’re filtering on a person’s age:
Of course, we see Elton and Charlize.
And we know why Anne wasn’t returned. Her age is less than 40.
But we need to understand why we didn’t see Bruce Willis or Prince.
It comes down to three valued logic: True, False, or “Unknown“
When SQL is looking at the age of Bruce Willis, it’s basically asking the question, “Is NULL greater than 40?“.
We know from our previous point that NULL is the absence of information, so the answer to our question is “unknown” again.
Folks, comparison operators only return a row if the expression evaluates to TRUE. If an expression evaluates to anything else, the row is NOT returned. To say it plainly, if an expression evaluates to FALSE (like in the case of Anne Hathaway) or “unknown” (like in the case of Bruce Willis and Prince), it ain’t returned.
This is true even in the following comparison: NULL = NULL
To be clear, when we say “NULL = NULL“, we’re asking the question “Is NULL equivalent to NULL?“.
Any guesses to the answer?
It’s still NULL, a.k.a. “unknown”
If you didn’t know that, you might be confused by the following query:
You might be saying “Wait, we just went over how Bruce Willis and Price have NULL as their Age. So why didn’t they get returned?“.
Because folks, when you compare NULL to anything, the result is NULL. This is true even if you compare NULL to another NULL.
Rows get returned only if the expression evaluates to TRUE.
So then the question becomes, “If we wanted to return rows where the value was specifically NULL, how would we do it?”
Answer: Use the IS NULL predicate:
Nice.
3. If you negate NULL, you still get NULL
Let’s talk about the opposite end of that coin. Are you surprised this query doesn’t return anything either?
Again, there is no concept of “not unknown“. The opposite of “unknown” is still “unknown” in the eyes of SQL Server.
So if we wanted to see rows that specifically were not NULL, we would need to use the IS NOT NULL predicate:
Very cool.
4. Not specifying nullability in you column definitions
When you define columns in a table, you should be explicit about whether or not you want the column to allow NULL. If you don’t specify the nullability of a column, the nullability defaults to allow NULL.
When you explicitly outline the nullability of your columns, it really forces you to think about whether or not you want the column to allow NULLs from a business perspective.
Let’s take a look the CREATE TABLE statement for our ‘Celebrity‘ table (from before adding the Age column):
So we were explicit, outlining whether or not we want our columns to allow NULL.
Think about the nullability of your columns when you create them.
The CelebID and FirstName columns are NOT NULL. From a business perspective, we want all our celebrities to have an ID and a first name.
The ID is the Primary Key, meaning there needs to be a value here (in fact, when you make a column the primary key, it’s nullability is set to NOT NULL automatically (but again, I like to be explicit, so I outlined NOT NULL anyway).
Also, notice we used the IDENTITY property to automatically generate key values for us. Not sure how the IDENTITY property works? Check out the full tutorial:
The IDENTITY Property: Everything you need to know
And everyone has a first name. Even people like Prince, who’s first name is his entire name.
But as we discussed, not everyone has a middle name. And again like Prince, not everyone has a last name. So from a business perspective, we say it’s okay if these columns contain NULL.
This is the kind of deliberate thought you need to have when defining properties of your tables and columns. If we didn’t put this kind of thought in place, and plainly left out the nullability of our columns, they all would have defaulted to allow NULLÂ (except the primary key column, which defaults to NOT NULL as I’ve said).
Making theses decisions early will help you avoid changing things later, when it will likely be more difficult!
5. You should be accounting for NULL in your query results
Let’s go back to this query, where we return all employees whose Age is not NULL:
Notice for Anne and Charlize, their MiddleName is NULL.
Maybe the end user of this query doesn’t want to see “NULL”. Maybe “NULL” is ugly, or maybe the end user isn’t even sure what “NULL” really means.
What if instead of NULL, we replaced it with the phrase ‘N/A‘? This is the universal acronym for ‘Not Applicable‘. Maybe that would be much cleaner for the end user. What do you think?:
That’s better! Or maybe saying the phrase ‘unknown‘?:
That’s good, too!
“Ok, I get it, but how are you doing this?”
Answer: Using the ISNULL() system function.
The ISNULL() System Function
ISNULL() is used when you want to replace a NULL in a column with a different value.
The syntax is this:Â ISNULL(<expression>, <replacer_value>)
Here is ISNULL() in action:
Here’s how ISNULL works: If the value in the column outlined in <expression> is NULL, the value get’s replaced with <replacer_value>.
Otherwise, if the value in the column outlined in <expression> is not NULL, the value itself gets returned.
So for Elton John, the value in the MiddleName column is not NULL. The value is ‘Hercules‘. So we return that value, ‘Hercules‘.
However, for Anne Hathaway and Charlize Theron, the MiddleName is NULL. So for them, we return our replacer value which is just a question mark, ‘?‘.
As you have seen, our replacer value can be whatever we want. But of course, you want it to make sense. Understand also, sometimes you’ll want your replacer value to align with certain business requirements.
For example, say you have a flag column called ‘IsFeatureEnabled‘. This flag controls whether or not a certain feature can be used within an application. Normally, valid valued for this column are only ‘Y‘ or ‘N‘.
However, if (somehow) the column got set to NULL, maybe you could use ISNULL to return ‘N‘ by default. In other words, if the state of this column is unknown, assume the feature is disabled and therefore cannot be used.
6. Set Operators are distinctness based, which treat NULL differently
For this point, I went ahead and made two separate tables called ‘Actors‘ and ‘Singers‘, and split our celebrities up according to which they are most known for. Those tables look like this:
Notice I added Jared Leto to each table. He is a very talented actor, and he is the singer in a very popular band. He gets to be part of both tables.
What if in a single query, you wanted to see the people who intersect both tables? Just by looking, it appears Jared Leto would be the only one. Let’s see what the query looks like:
By the way, need a rundown on the INTERSECT set operator, as well as other set operators available to us in SQL Server? I got you:
SQL Server Set Operators: The Ultimate Guide
So let’s think about why this row was returned. SQL Server is asking the following question when looking at Jared Leto in both tables:
“Is Jared NULL Leto in the Actors table equivalent to Jared NULL Leto in the Singers table?”
As pointed out in grey, how does SQL Server compare the two NULL middle names?
SQL Server uses two valued logic when it comes to Set Operators: True or False.
If you were thinking in terms of three-valued logic (TRUE, FALSE, or “unknown“), you might think the answer to the question of “is NULL equivalent to NULL” would yield a value of “unknown” because that’s what we talked about earlier.
But that’s if you’re thinking in terms of three-valued logic.
When it comes to set operators, SQL Server does not think in terms of three-valued logic. It thinks in terms of distinctness.
With distinctness based comparisons, we go back to using the more traditional two valued logic (TRUE or FALSE).
When it comes the distinctness based comparisons, we need to throw the “unknown” thing out the window. With distinctness based comparisons, you only have two results of a comparison: TRUE or FALSE.
So with distinctness based comparisons, if you ask the question “Is NULL equivalent to NULL?“, the answer is TRUE!
This is why we see Jared Leto in the final result set. Both tables contain the name Jared NULL Leto.
7. All aggregate functions ignore NULLs (except one…)
For this point, I added a ‘KnownFor‘ column to our ‘Celebrities‘ table:
(I left out Jared Leto, we don’t need to worry about him anymore)
Let’s do a quick GROUP BY query to see how many Singers and Actors we have in our table:
Ok, that was easy enough. But what if we changed the query from using COUNT(*) to using COUNT(MiddleName):
Hmmm, there is a difference in the number of Actors. But why?
COUNT(*) is different from COUNT(<column name>)
COUNT(*) will count rows that contain NULL in any of the columns. It doesn’t look at a specific column to count, it just counts the number of rows for the item we are grouping by.
COUNT(<column name>), on the other hand, won’t count a row if the column name we are counting contains NULL for that row. In other words, it ignores NULL.Â
Since all our actors have NULLÂ as their middle name, they are NOT counted, leaving us with a final count of 0.
I wrote an entire tutorial on The difference between COUNT(*) and COUNT(column). Make sure you check it out.
We still see a count of 2 for singers. Elton John has a middle name, and Prince has a middle name that isn’t NULL. So they both get counted.
If we wanted to see a list of celebrities who very plainly have a middle name, we could write a query like this:
You’re not confused why we don’t have an Actor group anymore, are you?
When we added the filter for MiddleName != ”, we immediately excluded any rows that have NULL as the MiddleName (and of course any middle names that are blank).
Remember, comparing NULL to anything yields a result of “unknown”, even if we compare NULL to blank! “Unknown” rows get discarded, remember?
Maybe this will make it clear:
NULL middle names get discarded, as well as blank middle names, leaving us with just 1 row to perform the aggregation on.
But back to the point…
The only aggregate function that doesn’t ignore NULL is COUNT(*). The rest of them all ignore NULL when doing their aggregation. If you think about it, this makes sense. COUNT(*) doesn’t really have a particular column to check for NULLs in. It just strictly counts rows, regardless of column values.
All the other aggregates need a particular column passed in. So if that particular column contains NULL for a row, that row is ignored!
Next Steps:
Leave a comment if you found this tutorial helpful!
Don’t forget to download your FREE 1-PAGE Simple SQL Cheat Sheet on SQL Server NULL!
This guide contains all the key points you need to know about NULL condensed into a simple 1-page guide. It will be an excellent resource for you to reference during your career as a database professional. Make sure you download it today!
There is also a great book called T-SQL Fundamentals written by Itzik Ben-Gan that goes over several core concepts you should know about SQL Server, including everything you need to know about NULL. You won’t regret owning this book, trust me. Definitely get it today!
Also, be sure to check out the full tutorial on the GROUP BY clause!
This tutorial gives you a full rundown on how the GROUP BY clause works, and discusses some of the most common aggregate functions you should know.
Made it this far into the post? Good. You get the privilege of knowing this factoid: Charlize Theron doesn’t have a middle name 🙂
Thank you very much for reading!
Make sure you subscribe to my newsletter to receive special offers and notifications anytime a new tutorial is released!
If you have any questions, please leave a comment. Or better yet, send me an email!