Taking advantage of Sargability

Facebooktwitterredditpinterestlinkedinmail

You’ve read about Scalar-Valued functions and their RBAR(row by agonizing row) cost. You’ve also read about how combining recordsets can be costly. Now, we are going to discuss SARGability and how avoid statements that take the fun out of the seek.

Searching! Seek and Destroy!

Ok, so we’re not going to seek and destroy, but I am a HUGE Metallica fan. Every time I think of SARGability when writing or correcting a query, this song pops into my head.

Let’s define the term SARGable. Being SARGable means that the query is able to take advantage of the RDBMS indexes. As DBAs, we do our best to write the queries to take advantage of the indexes as best we can. In this article, we are going to explore some of the malpractices that mitigate the proper indexes.

Like Totally!

In SQL Server, the like determines whether a specific character string matches a specified pattern. A pattern can include regular characters and wildcard characters. What does this mean? This means that you do not have to know the entire string to find the result. For example, I can get a result from the following query without the entire name.

select name from band where name like '%talli%'

So, why would this be a bad thing? When running this query, SQL Server is forced to scan the index. Let’s use an AdventureWorks2012 example.

SELECT sod.SalesOrderid, sod.orderqty, sod.unitprice
FROM Sales.SalesOrderDetailEnlarged sod
WHERE sod.productid like '747'

The query seems pretty simple. Why are we using like? We are looking for a specific productID with no wild cards. Surprisingly, this is a common occurrence. Let’s take a look at the execution plan and the statistics IO.

like_nonsargable

/************************************
** BEGIN "Like"
************************************/

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'SalesOrderDetailEnlarged'. Scan count 4, logical reads 50271, physical reads 0, read-ahead reads 50092, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 2418 ms, elapsed time = 9358 ms.
/************************************
** END "Like"
************************************/

In the execution plan, we see that the index is being scanned. For a small index, this may not be much of an issue, but this will not scale well. Notice the high about of logical reacs and read-ahead reads. Both are around 50000. An elapsed time of almost 10 seconds. For 35000 records, this is quite extreme.

Now lets review the proper query.

SELECT sod.SalesOrderid, sod.orderqty, sod.unitprice
FROM Sales.SalesOrderDetailEnlarged sod
WHERE sod.productid = 747

This is how the query would be written properly. We could also put the value into single quotes for an implicit conversion. This will not alter how SQL Server executes the query. Now, let’s review the improvement.

nolike_sargable

/************************************
** BEGIN No "Like"
************************************/
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'SalesOrderDetailEnlarged'. Scan count 1, logical reads 124, physical reads 2, read-ahead reads 120, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 367 ms.
/************************************
** END No "Like"
************************************/

We have gone from roughly 100,000 reads(logical and read-ahead) to 246 total reads(124 logical, 2 physical, 120 read-ahead). The execution plan does what we would like to see, and index seek.

The House that Jack Built

Next, we are going to take a look at built in functions. Many times, we like to use SQL Server’s built in functions to limit our results, such as looking for the year of a date.

SQL Server is a little bit tricky when analyzing built-in functions. The execution plan does not reflect them, and they do not show on a trace. See the execution plan below, on the surface we do not see a reference to the function.

built_in_function_plan

What we do see is 91% of the cost is for an index scan. This still doesn’t show the whole story. Let’s review the stats IO/time.

/************************************
** BEGIN Built-in function
************************************/

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'SalesOrderHeaderEnlarged'. Scan count 4, logical reads 14067, physical reads 0, read-ahead reads 13976, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 1373 ms, elapsed time = 5556 ms.
/************************************
** END Built-in function
************************************/

14067 logical reads, 13976 read-ahead reads and an elapsed time of 5.5 seconds. This is certainly not what we are looking for. Remember the RBAR mentioned earlier, we are beginning to see it defined.

Scalar-Valued Function as a search

As you recall, the first part of this series described the drawbacks of using Scalar-Valued functions as part of a select statement. Another common practice is to use Scalar-Valued Functions as part of a from/where clause to limit results. In this section, we will show the “agony” of RBAR and Scalar-Valued Functions in the where clause.

Let’s examine the query below. We have the Scalar-Valued Funtion in both the SELECT clause and the WHERE clause. We’re saving code, right?

select s.OrderDate, s.SalesOrderNumber, s.PurchaseOrderNumber, s.SalesOrderID, dbo.ufnSalesOrderQuantity(s.SalesOrderID) as SalesOrderQuantity
from Sales.SalesOrderHeaderEnlarged s
Where TerritoryID = 6
and dbo.ufnSalesOrderQuantity(s.SalesOrderID) > 12

Now, if you remember from the first blog of this series, the cost of the scalar valued function is not reflected in the execution plan of this query. So, for this, I have ran a trace while running the query. Shown below are the time/IO statistics for the query above as well as the execution plan.

/************************************
** BEGIN Function
************************************/

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

(476 row(s) affected)
Table 'SalesOrderHeaderEnlarged'. Scan count 1, logical reads 2925, physical reads 2, read-ahead reads 2923, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 50950 ms, elapsed time = 118950 ms.

/************************************
** END Function
************************************/

scv_nonsargable_left

In the execution plan, you will see an index seek. This is what you would think you would want. The execution plan does not show the dark side of the story. To show this, I have ran a trace. You will notice that to obtain 476 records, the function was called 483973 times and ran for nearly 2 minutes. Total reads against the database, 1482949!

scv_nonsargable_left_tr

scv_sargable_ep

Conclusion

SARGability is a term that essentially is referring to the database getting results as efficiently as possible. What we have shown throughout this post is how we make the database work much harder than it needs to. Use “LIKE”, built-in functions and scalar-valued functions sparingly. Fortunately by simply NOT implementing non-sargable code practices unnecessarily will save us many headaches in the future.

Coming up: The CTE problem

Facebooktwitterredditpinterestlinkedinmail

Leave a Reply

Your email address will not be published. Required fields are marked *