Now I’ll admit that I didn’t have any plan to write a blog post before I got the challenge as I’ve been steadily getting busier with work/study/other commitments, so apologies if it’s a bit long, rambling, and not thought through fully.
Anyway, happy birthday Arun, here’s your blog post:
Understanding “Scan Counts 0, Logical Reads N”
Have you ever run
SET STATISTICS IO ON; ?
I’ll confess that I do it a lot, especially when I am performance tuning. Yet, like most things in SQL Server, I don’t fully understand it…yet!
Now don’t get me wrong, the little that I understand is extremely helpful, but recently I had a case where I didn’t understand the output of
STATISTICS IO, and asking my Senior DBA got me the look from him that I’ve come to think of as ‘The You_Are_Either_Joking_Or_You_Are_Being_Stupid_Again Look’.
So to document my stupidity, here’s the question for that look.
How come the Logical Reads from STATISTICS IO are so high when it says Scan count is 0?
tl;dr – They are related, but not in an exact 1:1 relationship. Plus, a scan count of 0 does not mean that the object wasn’t used at all.
Let us just get this definition out of the way as it’s very short, sweet, and to the point. Logical Reads are the…
Number of pages read from the data cache.
Right, great, got’cha. Logical reads, 8 KB pages, read from the data cache. If your
STATISTICS IO reports logical reads 112 then you’ve read 112 pages from the cache. Simples!
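If you’d rather think in sizes than in page counts, the arithmetic is just pages times 8 KB. Here’s a minimal sketch of that conversion; the variable name and column aliases are made up for illustration, and 112 is simply the figure from the example above.

```sql
-- Convert a logical reads figure (a count of 8 KB pages) into KB and MB.
-- @logical_reads is a hypothetical variable; plug in your own number.
DECLARE @logical_reads INT = 112;

SELECT @logical_reads              AS pages_read_from_cache,
       @logical_reads * 8          AS kb_read_from_cache,   -- 112 * 8 = 896 KB
       @logical_reads * 8 / 1024.0 AS mb_read_from_cache;   -- 896 / 1024 = 0.875 MB
```

So 112 logical reads means roughly 0.875 MB worth of pages touched in the cache, whether or not all of that data ends up in your result set.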
Scan count is the blighter that got me the look…well more like ‘my misunderstanding of what Scan Count means’ got me the look but it still holds my contempt at this moment in time.
My previous intuition about this guy was…
“Scan count is the number of hits the table/index received”
(THIS IS NOT CORRECT! …and why it isn’t down as a full quote).
Let’s check out the definition again and see what it has to say for itself.
Scan count is the…
Number of seeks/scans started after reaching the leaf level in any direction to retrieve all the values to construct the final dataset for the output.
That’s a very specific definition isn’t it? It’s not all of the definition though, there’s more! And it is this “more” that I want to focus on.
Testing The Definitions
First things first, let us set up our query environment.
USE [tempdb];
GO
SET STATISTICS IO ON;
GO
Scan count is 0 if the index used is a unique index or clustered index on a primary key and you are seeking for only one value. For example
WHERE Primary_Key_Column = <value>.
Well let’s see about that!
CREATE TABLE dbo.Unique_DefinedUnique
(
    col1 INT NOT NULL PRIMARY KEY
);
INSERT INTO dbo.Unique_DefinedUnique (col1)
SELECT x.n
FROM (VALUES (1), (2), (3), (4), (5)) AS x(n);
CREATE UNIQUE NONCLUSTERED INDEX uci_Unique_DefinedUnique_col1
    ON dbo.Unique_DefinedUnique (col1);
GO
SELECT col1
FROM dbo.Unique_DefinedUnique
WHERE col1 = 1;
Never mind, the next paragraph please!
Scan count is 1 when you are searching for one value using a non-unique clustered index which is defined on a non-primary key column. This is done to check for duplicate values for the key value that you are searching for. For example
WHERE Clustered_Index_Key_Column = <value>.
I don’t believe you!
CREATE TABLE dbo.Unique_NotDefinedUnique
(
    col2 INT NOT NULL
);
GO
INSERT INTO dbo.Unique_NotDefinedUnique (col2)
VALUES (1), (2), (3), (4), (5);
CREATE NONCLUSTERED INDEX nci_Unique_NotDefinedUnique_col2
    ON dbo.Unique_NotDefinedUnique (col2);
GO
SELECT col2
FROM dbo.Unique_NotDefinedUnique
WHERE col2 = 1;
Scan count is N when N is the number of different seek/scan started towards the left or right side at the leaf level after locating a key value using the index key.
Hmmm, so if we have duplicate values, then this will happen?…
CREATE TABLE dbo.NotUnique
(
    col3 CHAR(1) NOT NULL
);
GO
-- The "GO 2" below runs this INSERT batch twice, so each letter ends up in the table 4 times.
INSERT INTO dbo.NotUnique (col3)
VALUES ('A'), ('A'), ('B'), ('B'), ('C'), ('C'), ('D'), ('D'), ('E'), ('E'),
       ('F'), ('F'), ('G'), ('G'), ('H'), ('H'), ('I'), ('I'), ('J'), ('J');
GO 2
CREATE NONCLUSTERED INDEX nci_NotUnique_col3
    ON dbo.NotUnique (col3);
GO
-- Let's try it with 2 and then 3!!!
SELECT col3 FROM dbo.NotUnique WHERE col3 = 'A' OR col3 = 'B';
SELECT col3 FROM dbo.NotUnique WHERE col3 = 'A' OR col3 = 'B' OR col3 = 'C';
Putting away my toys…
DROP TABLE dbo.Unique_DefinedUnique, dbo.Unique_NotDefinedUnique, dbo.NotUnique;
I was confused about the Scan count being 0 but logical reads not being 0. How can the scan count not actually scan/seek anything?
But it is scanning/seeking!
Read the Scan Count definition again…I’ll capitalize the words that I glossed over:
Number of seeks/scans STARTED AFTER REACHING THE LEAF LEVEL in any direction to retrieve all the values to construct the final dataset for the output.
Scan count of 0 occurs when there is a unique index or clustered index on a primary key and you are seeking for only one value. The word of the day is… “Unique”.
So because the unique index is looking for a unique value in a column guaranteed to be unique, it’s not so much that the query isn’t looking for a value, it’s more that once the query reaches the leaf level it already knows that it’s on the value it needs!
Since it doesn’t look any more “after reaching the leaf level”, the scan count is allowed to be 0.
This explains why, if the value is unique but not guaranteed to be so (either the index, value, or column is not guaranteed unique), the query has to do 1 scan/seek to check that the next value isn’t what it wants.
Therefore, Scan Count will be 1…
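If you want to poke at the “only one value” part yourself, here’s a minimal sketch that runs an equality seek and a range seek against the same unique key. The table name dbo.ScanCount_Demo is made up for this sketch and isn’t one of the tables from the tests above. Going by the definitions, I’d expect Scan count 0 for the first query and Scan count 1 for the second, but check the Messages tab on your own instance rather than taking my word for it.

```sql
USE [tempdb];
GO
SET STATISTICS IO ON;
GO
-- Hypothetical demo table; the name is an assumption for this sketch.
CREATE TABLE dbo.ScanCount_Demo
(
    col1 INT NOT NULL PRIMARY KEY
);
INSERT INTO dbo.ScanCount_Demo (col1)
VALUES (1), (2), (3), (4), (5);
GO
-- Equality seek on the unique key: at most one row can match,
-- so nothing more needs to be started once the leaf level is reached.
SELECT col1 FROM dbo.ScanCount_Demo WHERE col1 = 3;

-- Range seek on the same unique key: after locating the first value,
-- the engine has to keep moving along the leaf level to collect the rest.
SELECT col1 FROM dbo.ScanCount_Demo WHERE col1 BETWEEN 2 AND 4;
GO
-- Putting away the toys again.
DROP TABLE dbo.ScanCount_Demo;
GO
```

Both queries use the same unique index, so any difference in Scan count comes from what is being asked for, not from how the index is defined.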
And I’ll leave it as an exercise to figure out why Scan Count N is Scan Count N.
(Hint: it’s because Scan Count N)
I sometimes find the documentation dry and not terribly interesting, but I dislike the “You_Are_Either_Joking_Or_You_Are_Being_Stupid_Again” look more.
So read the documentation, study what you are doing, know why things do what they do…
It’ll save you some funny looks off of people. 🙂