Scalar Value and Built-In Functions

Facebooktwittergoogle_plusredditpinterestlinkedinmail

Throughout my career, it has become my strong opinion that the majority of performance problems are self-inflicted due to databases design or coding that is less than optimal.  This series will address common coding malpractice and the ways to mitigate them.

I once heard that there are 3 types of developers in the IT world; application developers, database developers and the hybrid of the two.  A successful hybrid between database developers and application developers is rare as usually the two components require different ways of thinking.  Application development is object oriented with the use of methods and functions.  Database development is primarily set based.

Many application developers will employ object oriented coding practices in to their environments, and use the same techniques T-SQL.

Scalar Valued Functions as part of the Select clause

The first concept that I will focus on is Scalar Valued Functions.  Scalar Valued Functions are “typical” functions that accept a value or set of values and return a value.  This sounds fantastic, right?  Reuse code to get my information. If your goal is to reuse code, then you can be successful using scalar valued functions.  If your goal is to allow your application and database to scale and grow with minimal performance degradation, then you may want to avoid scalar valued functions.

 

Create and use the Function

Let’s look at the AdventureWorks2012 example below.  We’ve created a function that returns the order quantity for a sales order ID as shown below.

 

Now that we’ve got the function, how do we use it?  See the screenshot below.  We simply call the function.  What’s great about this?  We can change the code in the function and we won’t have to change code anywhere else.  Less management, but what is the tradeoff?

soq_001

Use the function within a select statement.

Now we will use the function within a select statement.  We don’t have to include SalesOrderDetailEnlarged in the query as we already reference it.

 

I’ve even included the execution plan below.  Index SEEK.  Just what we want, right?
soq_003

Notice that we have included our statistics to show how many read and writes we have against the database as well as the time taken.  ONLY 628 reads against the database.  Fantastic!  Let’s look at the same query doing the traditional join.

 

Traditional Join

Now we will query the same results, but querying the SalesOrderDetailEnlarged table directly.  So much work.

 

Let’s take a look at the execution plan.  77% of the cost is from a clustered index scan, this can’t be what I want.

soq_004

Let’s now review the IO and time statistics for this query.  So many more reads, the function must be the way to go, right?

 

Comparing the results

Did you notice the difference between the two sets of statistics?  The statement calling the function has 10% of the reads of the traditional query and the execution plan is seek on only one index, or so it appears.  There is one other glaring difference.  Notice the CPU Time and Elapsed Time for both.  For the call with the function, the CPU time is 23904 milliseconds and a total elapsed time of 23934 milliseconds.  For the traditional call, CPU time is 281 milliseconds with a TOTAL elapsed time of 957 milliseconds.  How can this be?  The execution plan for the statement with the function call is much simpler and there are less reads.  Could there be more?

The problem here is that the execution of the function is not part of the IO statistics or the execution plan.  The results are misleading.  In this case, the best way to review the actual work is through either a server side trace or extended events.  I’ve taken the liberty of recording the results using a server side trace and putting the results to a table to analyze.

Below you will see the results show 100,000 results for this query call.  Notice how the query was a “top 100000”.  This function is being called 100,000 times, once for each result!

soq_005

Now let’s take a look at the total cost of the query.  307346 reads.  The duration, in this case was also 23 seconds(I ran the query multiple times).

soq_006

How much of this cost came from the function? 306716 reads came from the function.

soq_007

Conclusion

Scalar Value Functions are great way to reuse code, but as we can see here, there are hidden catastrophes caused when using scalar valued functions on larger data sets.  This forces SQL Server to treat each record individually(aka row-by-row processing).  We may have more code to maintain by not using the function, but a little work for organizing a release will save migraines when analyzing performance.

Up next: The cost of the union…

Facebooktwittergoogle_plusredditpinterestlinkedinmail

5 thoughts on “Scalar Value and Built-In Functions

Leave a Reply

Your email address will not be published. Required fields are marked *