SQL Saturday #517 Philadelphia

I am honored to have been accepted to speak and to return to Philadelphia, PA on Saturday, June 04, 2016.  I am looking forward to seeing great #sqlfamily and to sharing what I can from a hopefully unique perspective.

I will be discussing Common T-SQL Coding mistakes and referencing SARGability and RBAR. As database code tuners, we may get frustrated with some programmatic techniques from application developers.

It hurts me to say, as a Pittsburgh Pirates fan, that I will even work the Phillies into this. I hope to see you there!

If you have not registered, do so NOW!


SQL Saturday #521 Atlanta

I am honored to have been accepted to speak and to return to Atlanta, GA on Saturday, May 21, 2016.  I am looking forward to seeing great #sqlfamily and to sharing what I can from a hopefully unique perspective.

Atlanta will have a plethora of great minds to learn from. This is possibly the biggest SQL Saturday on the east coast.

I will be discussing Locks, Blocks and Deadlocks from the perspective of a father and family man. The SQL Server engine acts like a family manager, a parent if you will. Hopefully relating locks, blocks and deadlocks in this fashion will help you understand this concept. I hope to see you there!

If you have not registered, do so NOW!


SQL Saturday #486 Richmond

I am honored to have been accepted to speak and to return to Richmond, VA on Saturday, March 19, 2016.  I am looking forward to seeing great #sqlfamily and to sharing what I can from a hopefully unique perspective.

I will be discussing Service Broker from the perspective of coaching/playing football.  Let’s see how we can relate Service Broker to Washington Redskins Football and see how we can handle them Cowboys.  I hope to see you there!

If you have not registered, do so NOW!


SQL Saturday #461 Austin

I am honored to be accepted to speak in Austin, TX for the first time on Saturday, January 30, 2016.  Along with seeing great #sqlfamily and being able to share what I can from a hopefully unique perspective, I’m excited to see Austin as a hub of technology!

I will be discussing Service Broker from the perspective of coaching/playing football.  Let’s see how we can relate Service Broker to Texas Longhorn Football and see how we can handle them Sooners.  I hope to see you there!

If you have not registered, do so NOW!


SQL Saturday #480 Nashville

I am honored to be accepted to speak in Nashville, TN for the first time on Saturday, January 16, 2016.  Along with seeing great #sqlfamily and being able to share what I can from a hopefully unique perspective, I’m excited to see Nashville as a country music fan.

I will be discussing Locks, Blocks and Deadlocks from the perspective of a father and family man.  The SQL Server engine acts like a family manager, a parent if you will.  Hopefully relating locks, blocks and deadlocks in this fashion will help you understand this concept.  I hope to see you there!

If you have not registered, do so NOW!


Deadlocking – Deciding which child gives up what they want

As we have spoken about before, parents of siblings or the siblings themselves understand what it is like for a parent to have to resolve a squabble.  Recently, my daughters were having burgers for dinner.  My youngest was sitting at the table with the ketchup and my eldest had both hers and her sister’s burger.  The youngest said she wanted her burger, the eldest wanted the ketchup.  Neither would give the other what they wanted.

Of course, being the Dad that I am, I attempted to teach a little DBA deadlocking to them.  They both told me that I was speaking Martian.

This is essentially a deadlock in real life.  In SQL Server, or any RDBMS, deadlocks are NOT normal and should not be accepted as such.

Deadlock Defined

A deadlock occurs when there are two separate processes with separate transactions.  Each transaction is holding a resource the other needs.  Let's look at this scenario.  Sister_One has a lock on Tablet_One and has requested a lock on Headphone_Two.  Sister_Two has a lock on Headphone_Two and has requested a lock on Tablet_One.  Both processes are waiting for the other one.  The SQL Server Database Engine (a.k.a. the parent) must choose a process to be the deadlock victim, as neither task can continue.

The Database Engine checks for deadlocks at intervals of 5 seconds by default.  The victim is chosen first by deadlock priority.  The DEADLOCK_PRIORITY may be set explicitly on the session, and the Database Engine will choose the session with the lowest DEADLOCK_PRIORITY as the victim.  If the priority was not set, or both sessions have the same priority, the victim is the transaction that SQL Server determines is the least expensive to roll back.
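For example, a session can volunteer itself as the preferred victim:

-- This session will be chosen as the deadlock victim over sessions at NORMAL priority
SET DEADLOCK_PRIORITY LOW;

-- A numeric priority may also be used; -10 is the lowest, 10 the highest
SET DEADLOCK_PRIORITY -5;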

Reporting on Deadlocks

There are various ways to report on deadlocks.  The simplest is using trace flag 1222.  This records the deadlock in the SQL Server Error Log in XML format.  Trace flag 1222 is a global-only flag; to turn it on or off, use the following.
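A minimal example; the -1 argument applies the flag globally:

-- Enable deadlock output to the SQL Server Error Log
DBCC TRACEON (1222, -1);

-- Turn it back off
DBCC TRACEOFF (1222, -1);

-- Check which trace flags are currently enabled
DBCC TRACESTATUS (-1);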

Once the trace flag is set, when a deadlock occurs, you can read from the SQL Server Error Log using the following:
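For example, a quick search of the current error log (the first parameter is the log number, 0 being the current log; the second is 1 for the SQL Server error log; the third is a search string):

EXEC sp_readerrorlog 0, 1, N'deadlock';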

Another way to report on deadlocks is to use the deadlock event captured by the default system_health extended event session in its ring buffer target.  I tend to avoid the ring buffer as I find it less than dependable.
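For reference, a commonly used pattern for pulling the deadlock graph out of the system_health session's ring buffer target, shown here as a sketch:

SELECT XEvent.query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM
(
    SELECT XEvent.query('.') AS XEvent
    FROM
    (
        SELECT CAST(st.target_data AS xml) AS TargetData
        FROM sys.dm_xe_session_targets AS st
        JOIN sys.dm_xe_sessions AS s
            ON s.address = st.event_session_address
        WHERE s.name = 'system_health'
          AND st.target_name = 'ring_buffer'
    ) AS RingBufferData
    CROSS APPLY TargetData.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData (XEvent)
) AS DeadlockEvents;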

@SQLWAREWOLF Deadlock Reporting

I have created a full deadlock reporting structure that I would like to share.  This page and script have had some feedback from Mr. Wayne Sheffield(b|t).  Thank you so much, my friend.

This process will use a defined extended event to record deadlocks to a configurable disk location.  First of all, the script uses a database called DBA.  In any instance, the DBA should have a database for which they can store data/objects specifically for database administration.

Three tables are created: general_configuration, deadlock_report_main and deadlock_report_detail.  A function is created to check for the existence of the extended event files that are configured in the general_configuration table.

An explicit extended event session is created to track and store deadlock information in an XML file structure.  The location of the files is based on the value configured in the general_configuration table.
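A sketch of what such a session can look like; the session name and file path below are placeholders, since the real location comes from the general_configuration table:

CREATE EVENT SESSION [deadlock_capture] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file
(
    SET filename = N'D:\XEvents\deadlock_capture.xel',  -- placeholder path
        max_file_size = (50),        -- MB per file
        max_rollover_files = (4)
)
WITH (STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [deadlock_capture] ON SERVER STATE = START;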

There are a few stored procedures created:

  • uspRecordDeadlocks – Reads and parses the xml into Deadlock_Report_Main and Deadlock_Report_Detail.  This will record the deadlock winner and victim.  This will also record the procedure, query, line numbers and various other information.
  • uspAnalyzeDeadlocks – This reports on the recorded data, pairing the winner and loser combinations.
  • uspAnalyzeDeadlocks_Obfuscate – Within uspAnalyzeDeadlocks, there is an optional parameter to obfuscate parameters and values within where clauses, join statements, or parameters passed to a stored procedure.
  • uspCleanDeadlockHistory – This is used to clean the reporting tables so that the data is kept trim.

The Record_Deadlocks job will read the events and parse the XML into a reporting structure by calling uspRecordDeadlocks.  This will also trim the history by calling uspCleanDeadlockHistory.

The final report example is below:

 

Analyze_Deadlocks_Results

THIS HAS NOW BEEN UPDATED TO VERSION 1.1, BUT STILL MAY BE BUGGY! To be clear, these scripts and procedures are written by myself and may be shared with that copyright knowledge.  They are AS IS and no warranty or support is included.  I am NOT liable for ANY damages using these scripts may cause.  Use at your own risk and always test THOROUGHLY on a development system before implementing in production.

Resolving Deadlocks

What do you do now that you’ve found your deadlock pairings?  You resolve them.  The best way to make deadlocks “not happen” is to make the query faster.  In my experience (this may not always be true), most deadlocks occur when there is inefficient indexing, if any at all, or poorly written queries.

  • Make sure that your queries are properly tuned.
  • Use the appropriate indexes as necessary.
  • Use efficient code(See my common coding mistakes series).
  • Run transactions in small batches.
  • Use a lower isolation level such as Read_Committed which holds locks for a shorter duration.
  • Use Read Committed Snapshot Isolation for row versioning, as sketched after this list.
    • Remember that this comes at the cost of high tempdb usage.
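A minimal sketch of enabling Read Committed Snapshot Isolation, assuming a database named YourDatabase:

-- ROLLBACK IMMEDIATE kicks out open transactions so the option can be applied
ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;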

 

Conclusion

Applying an efficient thought process to SQL Server and processing data quickly will hopefully avoid deadlocks.  Unfortunately, deadlocks do occur.  Now we understand what deadlocks are and how to report on them.  This is the first step in resolving deadlocks.

As a parent of siblings or as a sibling, you are more experienced than you may realize in resolving deadlocks.  You’ve had much practice.

Resources:

https://technet.microsoft.com/en-us/library/ms178104(v=sql.105).aspx


Locking and Blocking, Family Management.

For any parents out there of multiple children, you are acutely aware of locking and blocking in the real world.  Who gets to use the bathroom first?  Hearing the phrases “I saw that first” and “I want that.”  Fighting over a toy.  Each parent has their own methods of how to resolve these disputes.

Just as in a home with siblings, locking and blocking are absolutely a part of life in SQL Server and most RDBMS. Understanding how locking and blocking occur and then minimizing the effects on accessing our precious information is a high priority.

Database Engines treat the data as children in this respect.  Keeping the data as orderly as possible, without too much fighting.

First let’s take a look at some basic principles.

ACID

ACID is an acronym that stands for Atomicity, Consistency, Isolation, Durability. This is a principle that database professionals will look to when evaluating the basics of a database. It basically means that the data can be trusted.

  • Atomicity – A transaction is an entire unit of work. All or NONE of the transaction is committed.
  • Consistency – All of the data is in a state of equilibrium. Indexes, materialized views and anything else affected by the transaction is consistent and in a zen state.
  • Isolation – The idea that the data modifications are left alone until they are finished. Transactions are kept away from each other. Like two children being sent to their rooms to clean.
  • Durability – The RDBMS keeps track of these changes so that, after the transaction is complete, the data is recorded in the database permanently and can be recovered in the event of a system failure.

 

Isolation Levels

In SQL Server, and other RDBMS, the engine implements the configured isolation level which will affect the ACIDity when reading data. The isolation level will isolate the process and protect it from other transactions. To do this, locking is required. The locking is determined by the isolation level. Low levels of isolation will give high levels of concurrency and shared use of the data, but can result in lower levels of consistency with dirty or phantom reads.

SQL Server will use isolation levels to command the read behavior of the database/transaction.

  • Read Uncommitted
    • Reads data from other transactions that have not yet completed
    • Dirty Reads
    • Uncommitted updates
  • Read Committed
    • SQL Server Default
    • Prevents reading Data from uncommitted transactions
    • Can result in Phantom Reads/Non-Repeatable Reads
  • Repeatable Read
    • This will ensure that if data is reread within a transaction, that the data does not change
    • No transaction can modify the data until the transaction reading the data is complete
    • This comes at the cost that all shared locks in the read will hold until the transaction is completed
    • This will eliminate Dirty and Non-Repeatable reads, but Phantom reads may still occur.
  • Serializable
    • This is putting the transactions in a single line.
    • This is the most consistent and best way to ensure ACIDity.
    • A read/write lock will be issued for all of the records affected by the transaction
    • This includes Repeatable Read isolation and will also eliminate Phantom reads.
  • Snapshot
    • This will read the data as it is at the beginning of the transaction.
    • This will come at a high cost to tempdb as the data for the snapshot is stored in tempdb.
    • This will eliminate Dirty, Phantom, Non-Repeatable Read and Lost updates
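Isolation levels are set per session. A minimal example, assuming the AdventureWorks2012 database used elsewhere in this series:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;

    -- Shared locks taken here are held until the COMMIT under REPEATABLE READ
    SELECT TOP (10) BusinessEntityID, FirstName, LastName
    FROM Person.Person;

COMMIT TRANSACTION;

-- Return the session to the default
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;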

Lower isolation levels offer greater concurrency, meaning more connections may view the data, but the following concurrency effects may occur:

  • Lost Updates – Transaction A reads a value and updates the data based on what it read. Transaction B does the same at roughly the same time and commits after Transaction A, overwriting its change. Transaction A’s update is lost.
  • Dirty Reads – Reading uncommitted data while another process is in the middle of changing it.
  • Non-repeatable Reads – This occurs when a process reads the same data multiple times within a session. Between reads of the data, another process may change the data and therefore different values are returned.
  • Phantom Reads – This is similar to Non-Repeatable reads, but instead of changing(updating) the data, another process adds or removes records between the reads of the same data during a process.

Locking Explained

Locking is a mechanism used by the SQL Server Database Engine to synchronize access by multiple users to the same piece of data at the same time.  Locks are managed internally by a part of the Database Engine called the lock manager.

Each transaction will lock levels of resources such as rows, pages or tables for which the transaction is dependent.  The transaction will release the lock when it either commits or rolls back.  The amount of time the lock is held is dependent on the isolation level.

The two basic types of locks are read locks and write locks.

  • Read Locks
    • Shared – While a shared lock is held, other transactions can read but cannot modify the locked data.  The lock is released after the locked data has been read, unless the isolation level is at or higher than Repeatable Read or a locking hint such as READCOMMITTED or READCOMMITTEDLOCK is used.
    • Concurrent – This is when you read data held by an exclusive lock, either by specifying a NOLOCK hint or by using the Read Uncommitted isolation level.
  • Write Locks
    • Update – Update locks are a mix of shared and exclusive locks. When an update statement is executed, SQL Server has to find the data it wants to modify first, so to avoid lock conversion deadlocks an update lock is used. Only one update lock can be held on the data at one time, similar to an exclusive lock. The update lock itself can’t modify the underlying data; when the data is modified, the lock is converted to an exclusive lock.
    • Exclusive – Exclusive locks are used to lock data being modified by one transaction thus preventing modifications by other concurrent transactions.  A transaction always gets an exclusive lock on any data it modifies, and holds that lock until the transaction completes, regardless of the isolation level set for that transaction.

 

Lock Granularity/Escalation

Lock Escalations are an optimization technique used by SQL Server to control the amount of locks that are held within the Lock Manager of SQL Server.

Lock granularity consists of DATABASE, TABLE, PAGE and ROW locks.  As soon as you connect to a SQL Server database, a shared lock is issued on the database level itself. This prevents another user from performing an action such as “DROP DATABASE” while there is a connection to it.

SQL Server applies its lock granularity from the top down. First, it will check whether a table lock is necessary, then a page lock, then a row lock. An intent lock (shared or exclusive, depending on whether the operation is a read or a write) is issued on the table and the page, and a shared or exclusive lock on the row. Once SQL Server needs more than 5,000 row locks, it will escalate (by default) to a table lock.

You can alter these defaults by either using query hints such as ROWLOCK, PAGLOCK or TABLOCK. You can also alter the lock escalation of each table by using the following:
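A sketch, using a placeholder table name:

-- TABLE is the default; AUTO allows partition-level escalation; DISABLE prevents escalation in most cases
ALTER TABLE dbo.YourTable SET (LOCK_ESCALATION = DISABLE);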


This is generally not recommended as it is best to allow the Database Engine to escalate locks accordingly. Lower level locks increase concurrency, but consume more memory.

 

Fun Scripts

So, we now know that locking and blocking are part of the natural occurrence of processing transactions.  When it comes to blocking, the objective is to lessen the effects and hold blocks for as little time as possible.  Below, you will find two scripts.  The first will list your current processes, showing the blocking ones first.  The second will list what locks are currently held within your database.  These may help in ascertaining what issues need to be addressed regarding blocking.

Current processes, showing blocking first
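A sketch of this kind of query, built on sys.dm_exec_requests and sys.dm_exec_sql_text; it is not the original script from this post:

SELECT  r.session_id,
        r.blocking_session_id,
        r.status,
        r.wait_type,
        r.wait_time,
        DB_NAME(r.database_id) AS database_name,
        t.text                 AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
ORDER BY CASE WHEN r.blocking_session_id <> 0 THEN 0 ELSE 1 END,  -- blocked/blocking pairs first
         r.wait_time DESC;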

 

 Locks currently held within the database
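A sketch built on sys.dm_tran_locks, again not the original script:

SELECT  tl.request_session_id,
        tl.resource_type,
        tl.request_mode,
        tl.request_status,
        CASE tl.resource_type
            WHEN 'OBJECT' THEN OBJECT_NAME(tl.resource_associated_entity_id)
            ELSE OBJECT_NAME(p.object_id)
        END AS object_name
FROM sys.dm_tran_locks AS tl
LEFT JOIN sys.partitions AS p
    ON p.hobt_id = tl.resource_associated_entity_id
WHERE tl.resource_database_id = DB_ID();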

 

Coming up: Deadlocks – Who gets the toy?

References:

http://www.sqlteam.com/article/introduction-to-locking-in-sql-server#sthash.ujDqLUPK.dpuf

http://blogs.msdn.com/b/sqlcat/archive/2011/02/20/concurrency-series-basics-of-transaction-isolation-levels.aspx

https://technet.microsoft.com/en-us/library/jj856598(v=sql.110).aspx


Overuse of Common Table Expressions


Let’s explore common table expressions.  Like we have shown with scalar-valued functions, the CTE may be useful, but can quickly hinder performance if not used responsibly.

What is a CTE?

Microsoft’s definition of a CTE: Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement. This clause can also be used in a CREATE VIEW statement as part of its defining SELECT statement. A common table expression can include references to itself. This is referred to as a recursive common table expression.

What does that mean?

A CTE is essentially a named derived table.  Derived tables are tables which are created on the fly with the help of a SELECT statement.  When we define the derived table as a CTE, it certainly makes the final query more readable.

Defining this as a CTE has one strong advantage.  The CTE has the ability to reference itself, also known as recursion.  This is certainly a powerful tool when attempting to find the results of a parent-child relationship.

For the purposes of this post, I will acknowledge that recursion is an advantage, but the performance issues we will address stem from the common practice of using a CTE without the need for recursion.

Comparing CTE to Derived Table

Using the AdventureWorks2012 database, let’s take a look at the execution plan for a simple CTE querying the SalesOrderHeaderEnlarged table.
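The query is along these lines (a sketch; the exact columns and filter in the original are not reproduced here):

;WITH CTESales AS
(
    SELECT  SalesOrderID,
            OrderDate,
            CustomerID,
            TotalDue
    FROM Sales.SalesOrderHeaderEnlarged
    WHERE OrderDate >= '20080101'
)
SELECT  CustomerID,
        SUM(TotalDue) AS TotalSales
FROM CTESales
GROUP BY CustomerID;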

simple_cte_execution_plan

Now, let’s take a look at the execution plan for the same query written as a derived table.
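A sketch of the equivalent derived table form:

SELECT  dt.CustomerID,
        SUM(dt.TotalDue) AS TotalSales
FROM
(
    SELECT  SalesOrderID,
            OrderDate,
            CustomerID,
            TotalDue
    FROM Sales.SalesOrderHeaderEnlarged
    WHERE OrderDate >= '20080101'
) AS dt
GROUP BY dt.CustomerID;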

 

simple_dt_execution_plan

Notice how both queries give you the same execution plan.  A CTE really is a definable derived table that may be used multiple times throughout the query.  This is the double edged sword.  Although the query may be more readable, and you can reference the CTE repeatedly, the end result is referencing the same derived table multiple times.

CTE Example

Above I have shown a simple CTE.  The SalesOrderHeaderEnlarged table is called once.  Defined like that, with smaller recordsets, a CTE is harmless.  In a situation like this, if necessary, the CTE can be used to query itself.  What happens if the CTE is referenced multiple times?  Let’s take a look at a more complex CTE.  In this query, we are looking for the net sales, order quantity, sales quota, quota differential and average quantity per sale.  We are using three CTEs.  The second and third reference the previous ones.  The final query looks like a relatively simple join with four objects in the from clause.
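The original script is not shown here, but the shape of the query is roughly the following sketch. CTETotalSales is the CTE name referenced later in this post; the other CTE and column names are illustrative assumptions:

;WITH CTETotalSales AS
(
    SELECT  soh.SalesPersonID,
            SUM(sod.LineTotal)               AS NetSales,
            SUM(sod.OrderQty)                AS OrderQuantity,
            COUNT(DISTINCT soh.SalesOrderID) AS OrderCount
    FROM Sales.SalesOrderHeaderEnlarged AS soh
    JOIN Sales.SalesOrderDetailEnlarged AS sod
        ON sod.SalesOrderID = soh.SalesOrderID
    WHERE soh.SalesPersonID IS NOT NULL
    GROUP BY soh.SalesPersonID
),
CTETargetDiff AS
(
    SELECT  ts.SalesPersonID,
            sp.SalesQuota,
            ts.NetSales - sp.SalesQuota AS QuotaDifferential
    FROM CTETotalSales AS ts
    JOIN Sales.SalesPerson AS sp
        ON sp.BusinessEntityID = ts.SalesPersonID
),
CTEPerUnit AS
(
    SELECT  ts.SalesPersonID,
            ts.OrderQuantity * 1.0 / ts.OrderCount AS AvgQtyPerSale
    FROM CTETotalSales AS ts
    JOIN CTETargetDiff AS td
        ON td.SalesPersonID = ts.SalesPersonID
)
SELECT  p.FirstName,
        p.LastName,
        ts.NetSales,
        ts.OrderQuantity,
        td.SalesQuota,
        td.QuotaDifferential,
        pu.AvgQtyPerSale
FROM CTETotalSales AS ts
JOIN CTETargetDiff AS td ON td.SalesPersonID = ts.SalesPersonID
JOIN CTEPerUnit    AS pu ON pu.SalesPersonID = ts.SalesPersonID
JOIN Person.Person AS p  ON p.BusinessEntityID = ts.SalesPersonID;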

See the statistics below.  To get 17 records in results, the duration was nearly 4 seconds with 72194 logical reads.

Text Data        Duration        CPU     Logical Reads
CTE              3956.8800000    8880    72194

Let’s take a look at the statistics IO breakdown.  Notice how the majority of the 72194 reads come from the SalesOrderDetailEnlarged and SalesOrderHeaderEnlarged tables.

 

Below we are showing a cross section of the execution plan; along with other offenders, I wanted to show the biggest.  Notice how SalesOrderDetailEnlarged and SalesOrderHeaderEnlarged are being queried more than once in the same manner for the same data.  This is showing you the cause for the high reads.  Our goal as database programmers is to get the data with the database doing as little work as possible.  In this case, that goal is not being met.

CTE_CTE_Execution_Plan_001

 

Derived Table in place of the CTE

In this case, I have altered the CTE query to use the derived table directly rather than the CTE.   Let’s take a look at the code.  Certainly more lines of code.  We have gone from 43 to 52 lines of code.  I’m aware I use a lot of spacing, but it is consistent.  Even given that, are we really doing more?

The answer is no.  Let’s look at the basic statistics.  The duration is roughly the same, the CPU is roughly the same.  The logical reads are roughly the same and, as you see below, the execution plan is exactly the same.

Text Data        Duration        CPU     Logical Reads
CTE              3956.8800000    8880    72194
Derived Table    3953.0510000    8421    72198

Now, let’s look at the statistics IO output.  Again, the majority of the 72198 reads come from the SalesOrderDetailEnlarged and SalesOrderHeaderEnlarged tables.

 

Let’s take a look at the execution plan.  No, I did not simply copy and reuse the same image.  They are the same execution plan.  Again, we are seeing the worst offenders executed multiple times.  Notice the parallelism as well.  SQL Server has determined that it needs to use more than one core to complete this task.  For a well written query, parallelism is a welcomed advantage.  CTEs do not always translate to a well written query.

CTE_DT_Execution_Plan_001


By rewriting the code as derived tables, we can observe through the source code, the statistics and the execution plan that CTEs are simply derived tables put into a more readable code.

CTE Alternatives

Now that we see what CTEs are doing behind the scenes, what alternatives do we have?  Each database developer may have different answers, but the most common would be table variables or temporary tables. Of course, this is debatable. We will explore that debate in the future.

Table Variable CTE Alternative

Let’s take a look at the code re-written to use table variables.
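The original rewrite is not reproduced here, but a sketch of the pattern for the first table variable (names chosen to match the screenshots below) looks like this:

DECLARE @totalSales TABLE
(
    SalesPersonID  int            NOT NULL,
    NetSales       numeric(38, 6) NOT NULL,
    OrderQuantity  int            NOT NULL,
    OrderCount     int            NOT NULL
);

INSERT INTO @totalSales (SalesPersonID, NetSales, OrderQuantity, OrderCount)
SELECT  soh.SalesPersonID,
        SUM(sod.LineTotal),
        SUM(sod.OrderQty),
        COUNT(DISTINCT soh.SalesOrderID)
FROM Sales.SalesOrderHeaderEnlarged AS soh
JOIN Sales.SalesOrderDetailEnlarged AS sod
    ON sod.SalesOrderID = soh.SalesOrderID
WHERE soh.SalesPersonID IS NOT NULL
GROUP BY soh.SalesPersonID;

-- @targetDiff and @perUnit are declared and populated the same way,
-- and the final SELECT joins the three table variables to Person.Person.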

 

I’m all about pretty and readable code.  We’re defining the table variables, the columns and data types explicitly.  This allows the next person who sees this code to have a better understanding of what is going on.

Are there other benefits?  Yes there are.  Let’s take a look at the overall statistics.  Our duration isn’t much lower, at roughly 3.58 seconds, but both our CPU and Logical Reads are less than half of what the CTE/Derived Table versions are.

Text Data        Duration        CPU     Logical Reads
CTE              3956.8800000    8880    72194
Derived Table    3953.0510000    8421    72198
Table Variable   3579.8140000    3573    35972

We can see that as well from the statistics IO output.  You’ll notice two stark differences.  The first is the amount of logical reads on SalesOrderDetailEnlarged and SalesOrderHeaderEnlarged.  Both are roughly 50% of what they were in the CTE/Derived Table version of the query.

If you look closely, there is one more difference.  The Scan Count.  For our main tables, the scan count is one.  For the CTE/Derived Table, the Scan Count is EIGHT.  This scan count shows that parallel processing was determined to be necessary for SQL Server to complete its task.  Table variables do not allow parallel processing.

Now, let’s see the improvements from the execution plan below in several screenshots.

The most costly is the insert into @totalSales, replacing the CTETotalSales.  The rest also use the table variables to replace the CTE.   The end result is at least a 50% improvement in performance.

CTE_TV_Execution_Plan_001_Into_@totalSales

CTE_TV_Execution_Plan_001_Into_@targetDiff

CTE_TV_Execution_Plan_001_Into_@perUnit

CTE_TV_Execution_Plan_001_Into_FinalSelect

 

 

Temporary Table Alternative

Temporary tables are my go-to choice.  The biggest advantages are the ability to create non-unique clustered/non-clustered indexes and the ability to use statistics in the execution plan.  The biggest difference that you will see in the query and execution plan is that table variables do not use parallelism.  I’ve mentioned that parallel execution and query quality are not mutually exclusive.  A well written query can use parallel execution, thus taking advantage of the extra power.

Let’s take a look at the query rewritten using temporary tables.  The biggest difference, at least in the code, is the ability to add the non-unique clustered indexes to the table.
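A sketch of the first temporary table with a non-unique clustered index added before it is populated; the table and index names are illustrative:

CREATE TABLE #totalSales
(
    SalesPersonID  int            NOT NULL,
    NetSales       numeric(38, 6) NOT NULL,
    OrderQuantity  int            NOT NULL,
    OrderCount     int            NOT NULL
);

CREATE CLUSTERED INDEX cix_totalSales_SalesPersonID
    ON #totalSales (SalesPersonID);

INSERT INTO #totalSales (SalesPersonID, NetSales, OrderQuantity, OrderCount)
SELECT  soh.SalesPersonID,
        SUM(sod.LineTotal),
        SUM(sod.OrderQty),
        COUNT(DISTINCT soh.SalesOrderID)
FROM Sales.SalesOrderHeaderEnlarged AS soh
JOIN Sales.SalesOrderDetailEnlarged AS sod
    ON sod.SalesOrderID = soh.SalesOrderID
WHERE soh.SalesPersonID IS NOT NULL
GROUP BY soh.SalesPersonID;

-- #targetDiff and #perUnit are built the same way before the final SELECT.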

You will see that with the temporary table, the CPU and Logical Reads are a little higher than with Table Variables, but the duration is 50% of the CTE/Derived Table and 57% of the Table Variable.  The reason is the parallel execution.

Text Data        Duration        CPU     Logical Reads
CTE              3956.8800000    8880    72194
Derived Table    3953.0510000    8421    72198
Table Variable   3579.8140000    3573    35972
Temp Table       2042.6220000    4993    37112

Examining the statistics IO below, you will see the scan counts are higher for the temp tables than the table variables, but still less than the CTE/Derived Table.  The reads for our main tables are roughly the same, but slightly high.  The additional reads come from populating our temporary tables with the indexes.

 

Review the execution plan below.  The glaring difference is the use of parallelism coupled with the ability to use the clustered indexes within the temporary tables.

CTE_TT_Execution_Plan_001_Into_#totalSales

CTE_TT_Execution_Plan_001_Into_#targetDiff

CTE_TT_Execution_Plan_001_Into_#perUnit

CTE_TT_Execution_Plan_001_Into_FinalSelect

Conclusion

As I have demonstrated through this series, reuse of code may make life a little easier during development, but may have adverse effects on the performance of the queries.  In this post, we’ve seen that CTEs are nothing more than derived tables (with the exception of the recursive ability).  CTEs may make coding a bit easier for that final join, and the developer does not have to take the extra step to define the tables, but the end result is not optimal.

Play around with your code to see what will work best for your situation.  The end goal is to make the database do the least amount of work for the desired results.  In my experience, temporary tables provide the best benefit over table variables and certainly over CTEs.

More coming soon!


Taking advantage of Sargability

You’ve read about Scalar-Valued functions and their RBAR (row by agonizing row) cost. You’ve also read about how combining recordsets can be costly. Now, we are going to discuss SARGability and how to avoid statements that take the fun out of the seek.

Searching! Seek and Destroy!

Ok, so we’re not going to seek and destroy, but I am a HUGE Metallica fan. Every time I think of SARGability when writing or correcting a query, this song pops into my head.

Let’s define the term SARGable. Being SARGable means that the query is able to take advantage of the RDBMS indexes. As DBAs, we do our best to write queries that take advantage of the indexes as best we can. In this article, we are going to explore some of the malpractices that keep the proper indexes from being used.

Like Totally!

In SQL Server, the LIKE operator determines whether a specific character string matches a specified pattern. A pattern can include regular characters and wildcard characters. What does this mean? This means that you do not have to know the entire string to find the result. For example, I can get a result from the following query without the entire name.
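A sketch of such a query, assuming the AdventureWorks2012 Person.Person table:

-- Returns everyone whose last name starts with 'Ada'; no need to know the full name
SELECT BusinessEntityID, FirstName, LastName
FROM Person.Person
WHERE LastName LIKE 'Ada%';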

So, why would this be a bad thing? When running this query, SQL Server is forced to scan the index. Let’s use an AdventureWorks2012 example.
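The pattern looks like this (the specific ProductID value is illustrative; the original query is not reproduced here):

-- LIKE against an integer column, with no wildcard
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID
FROM Sales.SalesOrderDetailEnlarged
WHERE ProductID LIKE 870;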

The query seems pretty simple. Why are we using like? We are looking for a specific productID with no wild cards. Surprisingly, this is a common occurrence. Let’s take a look at the execution plan and the statistics IO.

like_nonsargable

In the execution plan, we see that the index is being scanned. For a small index, this may not be much of an issue, but this will not scale well. Notice the high amount of logical reads and read-ahead reads. Both are around 50,000, with an elapsed time of almost 10 seconds. For 35,000 records, this is quite extreme.

Now let’s review the proper query.
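Something along these lines (again, the specific ProductID value is illustrative):

SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID
FROM Sales.SalesOrderDetailEnlarged
WHERE ProductID = 870;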

This is how the query would be written properly. We could also put the value into single quotes for an implicit conversion. This will not alter how SQL Server executes the query. Now, let’s review the improvement.

nolike_sargable

We have gone from roughly 100,000 reads (logical and read-ahead) to 246 total reads (124 logical, 2 physical, 120 read-ahead). The execution plan does what we would like to see, an index seek.

The House that Jack Built

Next, we are going to take a look at built in functions. Many times, we like to use SQL Server’s built in functions to limit our results, such as looking for the year of a date.
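For example, a sketch against the enlarged sales header table:

-- Wrapping the column in a built-in function makes the predicate non-sargable
SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeaderEnlarged
WHERE YEAR(OrderDate) = 2008;

-- A sargable alternative bounds the column instead of wrapping it:
-- WHERE OrderDate >= '20080101' AND OrderDate < '20090101'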

SQL Server is a little bit tricky when analyzing built-in functions. The execution plan does not reflect them, and they do not show on a trace. See the execution plan below, on the surface we do not see a reference to the function.

built_in_function_plan

What we do see is 91% of the cost is for an index scan. This still doesn’t show the whole story. Let’s review the stats IO/time.

14067 logical reads, 13976 read-ahead reads and an elapsed time of 5.5 seconds. This is certainly not what we are looking for. Remember the RBAR mentioned earlier, we are beginning to see it defined.

Scalar-Valued Function as a search

As you recall, the first part of this series described the drawbacks of using Scalar-Valued functions as part of a select statement. Another common practice is to use Scalar-Valued Functions as part of a from/where clause to limit results. In this section, we will show the “agony” of RBAR and Scalar-Valued Functions in the where clause.

Let’s examine the query below. We have the Scalar-Valued Function in both the SELECT clause and the WHERE clause. We’re saving code, right?
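A sketch of the pattern; dbo.ufnSalesOrderQuantity is the function referenced later in this post, and the exact predicates of the original query are not reproduced here:

SELECT  sod.SalesOrderID,
        sod.ProductID,
        sod.OrderQty,
        dbo.ufnSalesOrderQuantity(sod.SalesOrderID) AS SalesOrderQuantity  -- function in the SELECT clause
FROM Sales.SalesOrderDetailEnlarged AS sod
WHERE sod.OrderQty = dbo.ufnSalesOrderQuantity(44774);                     -- function in the WHERE clause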

Now, if you remember from the first blog of this series, the cost of the scalar-valued function is not reflected in the execution plan of this query. So, for this, I ran a trace while running the query. Shown below are the time/IO statistics for the query above as well as the execution plan.

scv_nonsargable_left

In the execution plan, you will see an index seek. This is what you would think you would want. The execution plan does not show the dark side of the story. To show this, I ran a trace. You will notice that to obtain 476 records, the function was called 483973 times and ran for nearly 2 minutes. Total reads against the database: 1482949!

scv_nonsargable_left_tr

In this case, we are looking for OrderQty = dbo.ufnSalesOrderQuantity(44774). This is showing that we are looking for the OrderQty to match the same OrderQty as the SalesOrderID of 44774. This seems harmless. Let's see what happens with this in the time/IO statistics below.

scv_nonsargable_right

53.4 MINUTES, nearly an hour, to return 12495 records. Why is this happening? Notice in the execution plan, there is no index on the OrderQty column. This forces an index scan rather than a seek. When the scan occurs, SQL Server runs the function for every record in the scan. In this case, that means millions of function calls and an enormous number of total reads against the database. Fortunately, with an index, this does not occur. We still make an unnecessary function call, but only once, as in this case the execution plan will perform an index seek.

Let's take a look at how this should be written.

Using a function is not always horrible. Below is a case in which we can make it work. We still do not want to put it as part of the where clause. We may have an index on the OrderQty column, but a change in statistics could cause the execution plan to scan and cause severe scalability issues.

As you will see by the execution plan below(the function is still hidden), there are index seeks for both tables and the query runs in less than a second to return 12495 records.

scv_sargable_ep

Conclusion

SARGability is a term that essentially refers to the database getting results as efficiently as possible. What we have shown throughout this post is how we make the database work much harder than it needs to. Use “LIKE”, built-in functions and scalar-valued functions sparingly. Simply NOT implementing non-sargable code practices unnecessarily will save us many headaches in the future.

Coming up: The CTE problem


The State of the Union

After reading part 1 of the series, I hope that you have questioned the use of Scalar Valued Functions. Now, let’s pose some more questions. In this section, we are going to review the Union operator benefits and costs.

The Distinct State of the Union

As Microsoft puts it, a Union combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. The UNION operation is different from using joins that combine columns from two tables.

Union Operator

Union is a great way to give a unique results set from different queries. A Union specifies that multiple result sets are to be combined and returned as a single result set.

Let’s look at an example below.
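A sketch of the shape of the query. The two halves are identical except for which date column is filtered; the date ranges themselves are illustrative:

SELECT  soh.SalesOrderID,
        p.FirstName,
        p.LastName,
        st.Name AS TerritoryName,
        soh.TotalDue
FROM Sales.SalesOrderHeaderEnlarged AS soh
JOIN Sales.Customer       AS c  ON c.CustomerID = soh.CustomerID
JOIN Person.Person        AS p  ON p.BusinessEntityID = c.PersonID
JOIN Sales.SalesTerritory AS st ON st.TerritoryID = soh.TerritoryID
WHERE soh.OrderDate >= '20080101' AND soh.OrderDate < '20080701'

UNION

SELECT  soh.SalesOrderID,
        p.FirstName,
        p.LastName,
        st.Name,
        soh.TotalDue
FROM Sales.SalesOrderHeaderEnlarged AS soh
JOIN Sales.Customer       AS c  ON c.CustomerID = soh.CustomerID
JOIN Person.Person        AS p  ON p.BusinessEntityID = c.PersonID
JOIN Sales.SalesTerritory AS st ON st.TerritoryID = soh.TerritoryID
WHERE soh.ShipDate >= '20080101' AND soh.ShipDate < '20080701';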

This query gives us the distinct results of two separate queries. That’s nice in thought, but we’re making the database work too hard. The only difference between the two queries is that one query is searching a range on order date, and the other is ship date. Each table is called twice. The execution plan below shows the work this query is doing.

union_cost

Notice how there are two scans on customer, two scans on person, two scans on SalesTerritory. There are two separate seeks on the indexes that are used for the where clause predicates.

The majority of the cost is used in the Sort of the execution plan. This is where this query is doing the work, and doing so in tempdb. In order for SQL Server to deliver one distinct recordset, it must use TEMPDB to sort each recordset, make sure that it is unique and join the two together.

We’ve accomplished a goal for a distinct recordset from two queries, but scalability will come into question. Let’s take a look at the statistics IO for this query below.

All 4 tables in the query are scanned twice. This will increase with the number of unions used. The more this query is called, the more it will be noticed.

Union All

So, how do we mitigate the costs of a Union? Remember what the biggest cost of the execution plan is? The sort (distinct) and merge. One way to mitigate this is the UNION ALL. The benefit of the UNION ALL is that it does not create a distinct union of the recordsets. Let’s examine the query below.

The only difference between the two queries is the “ALL” keyword. Let’s take a look at the execution plan.

union_all_cost

Now there is no sort(distinct) in the execution plan. There is no merge. We still have the two sets of scans against all of the tables, as shown in the execution plan and statistics IO below. The query did run about 800ms faster. Focus on the CPU time. The CPU time of this query is about 12% of the UNION query.

There is a problem however, we have more records in our recordset. We will address that later.

Using “OR”

So, we’ve shown that the union all performs much better than the union, but we still have multiple scans. In the query example that we have shown, the only difference is in the where clause. We are searching against the same tables, but two different date ranges on two different columns.

I’ve seen this many times, this can be simply altered by using an “or”(sparingly). Let’s see the query below.
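A sketch of the same query collapsed into a single statement with an “or”:

SELECT  soh.SalesOrderID,
        p.FirstName,
        p.LastName,
        st.Name AS TerritoryName,
        soh.TotalDue
FROM Sales.SalesOrderHeaderEnlarged AS soh
JOIN Sales.Customer       AS c  ON c.CustomerID = soh.CustomerID
JOIN Person.Person        AS p  ON p.BusinessEntityID = c.PersonID
JOIN Sales.SalesTerritory AS st ON st.TerritoryID = soh.TerritoryID
WHERE (soh.OrderDate >= '20080101' AND soh.OrderDate < '20080701')
   OR (soh.ShipDate  >= '20080101' AND soh.ShipDate  < '20080701');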

We are querying each table only once. This is also shown in the execution plan as follows.

distinct_or_cost

The execution plan is much more simple. We are performing index scans on customer and person, and index seeks on indexes for SalesOrderHeaderEnlarged on the OrderDate and ShipDate columns. Along with the execution plan, let’s take a look at the statistics IO.

Person, Customer, and SalesTerritory all are scanned only once. Our total logical reads are going to go from about 2872 reads to 2431. We are saving 441 reads against the database. The CPU time of the “or” is roughly the same as the UNION, but the elapsed time is about 800ms less. Does this solve our problem? No. We still have too many records. Let’s see how we solve this.

The distinct “OR”

So, the “or” clause does decrease our overall reads and elapsed time, but we do still have too many records. In this case, we will need to make use of tempdb and use “distinct”. We need to use distinct sparingly, but in this case, it is necessary. The expected result count is 95667, but through our “or” and union all, we have seen repeated results in our recordset. As we show below, we will add distinct to the select clause. This will return a recordset with no repeating records.

Let’s examine execution plan for this change. We see now that 42% of the cost comes from the distinct sort. The plan is the same as “or” with the addition of the distinct sort.

distinct_or_cost

Let’s review the statistics IO. As shown below, we DO receive the desired amount of records. Unfortunately, our CPU cost has risen. Our reads are still lower than the union and our elapsed execution time is still 800ms less than the union all.

Conclusion

The union operator has its uses, but we can see that those uses come at a cost of reads and tempdb usage. At times, we may not have an alternative, but we need to be aware of the costs and use this functionality accordingly.

Up next: Taking advantage of Sargability
