Using computed columns to solve performance problems

Sometimes when you are working with third-party vendor code, or maintaining legacy code, things can get tricky, especially when a query doesn’t perform. Modifying the query by adding hints is one way to address the issue, but more often than not you will find your hands are tied and you are unable to change a single line of code. I had to deal with something like that once and was fortunate enough to have computed columns at my disposal.

A computed column, although easy to understand, is probably not the first tool people associate with performance tuning. Yet it can be a very powerful weapon in your performance tuning arsenal. I will demonstrate with an example.

Say you are working with a table and a view like the ones below.

use tempdb
go

if object_id('trade_id_lookup') is not null
    drop table trade_id_lookup
go

create table trade_id_lookup
(
    trade_id        bigint not null identity(1, 1) primary key
    , system_id     int not null
    , portfolio     sysname not null
    , product_id    int not null
)
go

if object_id('v_trade_id_lookup') is not null
    drop view v_trade_id_lookup
go

create view v_trade_id_lookup
as
select
    cast(trade_id as nvarchar) as 'trade_id'
    , system_id
    , portfolio
    , product_id
from trade_id_lookup
go

insert trade_id_lookup (system_id, portfolio, product_id)
values (1, 'portfolio', 9)
go 10000

The view is then used throughout the system for looking up trade attributes given a trade_id.

select system_id, portfolio, product_id
from dbo.v_trade_id_lookup
where trade_id = '999'

It all looks quite innocent until you start to interrogate the execution plan.

[Screenshot: execution plan showing a clustered index scan]

A clustered index scan? Why? Because of the CAST expression embedded in the view, the optimizer is unable to perform an index seek; it has to convert every single row before it can evaluate the predicate. If you hover over the index scan icon, you will see something like:

(CONVERT(nvarchar(30),[tempdb].[dbo].[trade_id_lookup].[trade_id],0)=CONVERT_IMPLICIT(nvarchar(4000),[@1],0))

Bear in mind that I intentionally stripped away all the non-essential code for this demo. Imagine looking at a large execution plan filled with joins and index scans; it is not easy to spot something like this without looking at the code inside the view. This is one of the reasons I hate deeply nested views: they are a product of the object-oriented paradigm and do not belong in the database world. Or maybe it’s just my inability to understand INCEPTION.

Now that we know what’s causing the performance issue, what can we do about it? One quick answer would be to remove the CAST expression inside the view. I wish it were that simple; remember, we are working with code whose intent we don’t necessarily understand. Modifying queries, albeit a seemingly simple line, is fraught with danger and should be your last resort.

So we cannot change the queries, nor the view. Then what? Fortunately, I could modify the underlying table, provided that no business logic was affected. This is where the computed column comes in handy.

alter table trade_id_lookup
add trade_id_string as cast(trade_id as nvarchar)
go 
create index idx_louie on trade_id_lookup (trade_id_string)
go

By copying the CAST logic from the view into a computed column, we achieved two things:

  1. An index can now be created on the converted trade_id.
  2. No code has to be modified and it all just magically works.

Executing the SELECT query again yields the execution plan below. See how the index seek is now in use?

[Screenshot: execution plan with an index seek and a key lookup]

The key lookup is a by-product of the demo; you can optimize the query further by adding the referenced columns to the INCLUDE list of the index.
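
For completeness, a covering version of the index might look like the sketch below. It assumes SQL Server 2005 or later (for the INCLUDE and DROP_EXISTING syntax) and simply lists the three columns referenced by the demo query:

-- Rebuild the index on the computed column so it also covers
-- the columns returned by the lookup query, removing the key lookup.
create index idx_louie on trade_id_lookup (trade_id_string)
    include (system_id, portfolio, product_id)
    with (drop_existing = on)
go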

–oo00oo–

To understand a bit more about how the optimizer utilises indexes on computed columns, please refer to Paul White’s answer to the question “Index on Persisted Computed column needs key lookup to get columns in the computed expression”.

Got “Cannot connect to WMI provider” when trying to open SQL Server Configuration Manager

I tried to open SQL Server Configuration Manager one day and got a sad face error dialog:

[Screenshot: SQL Server Configuration Manager error dialog]

A quick search on Google returned a few positive results. One of them looked especially promising: http://methmal132.wordpress.com/2013/10/04/how-to-resolve-the-sql-error-cannot-connect-to-wmi-provider

The blog post replicates the Knowledge Base article http://support.microsoft.com/kb/956013, but with pretty pictures.

If you are here because you have the same issue, you can stop reading if the above workaround resolved it. Unfortunately for me, I had compounded issues: it turned out my WMI repository was corrupt, and I had to fix that first.

[Screenshot: WMI properties dialog showing the error]

Off to Google again, and luckily I found this post: http://katyscode.wordpress.com/2007/02/03/tutorial-how-to-fix-wmi-corruption

Impressed with the level of technical detail in the post, I religiously followed the instructions to rebuild the WMI repository word for word.

[Screenshot: stopping the winmgmt service and renaming the repository folder]

[Screenshot: WMI properties dialog after the repository rebuild]

Once the WMI woes were sorted, I returned to the KB article for the workaround, and that finally fixed my issue!

[Screenshot: the .mof workaround completing successfully]

CHARINDEX actually works with text data type, but…

We had a table that was used to store error logs in a text column. From time to time I needed to extract some “error” messages, and I always resorted to CHARINDEX. Most of the time it worked just fine, but sometimes I would get an empty result set back. As this behaviour wasn’t a show stopper, I never paid too much attention to it, until one day I decided to get to the bottom of this seemingly random behaviour.

As usual, when I am uncertain about a system function, I would go to Books Online and read the definition, repeatedly at times. After scrolling down to the remarks section, a close inspection revealed something startling:

CHARINDEX cannot be used with text, ntext, and image data types.

This is simply not true! I know because I have been using CHARINDEX with text for months, if not years. I am speculating that the author actually meant:

CHARINDEX should not be used with text, ntext, and image data types.

Now, I am going to show you with a simple demo that charindex CAN be used with text. However, there is a catch.

Let’s start with the script below:

-- Create a table variable for demo purposes.
declare @charindex_test table
( 
    text_data       text 
    , varchar_data  varchar(max) 
)

/*
Pretend that we have a long string with the
word error in there. Note the position of the
characters. 
 
Pos: 7995 7996 7997 7998 7999 8000
Char:   p    e    r    r    o    r
*/
insert @charindex_test (varchar_data) 
values (replicate(cast('p' as varchar(max))
        , 8000 - len('error')) + 'error') 

/*
Pretend that we have a long string with the
word error in there. Note the position of the
characters. 
 
Pos: 7996 7997 7998 7999 8000 8001
Char:   p    e    r    r    o    r
*/
insert @charindex_test (varchar_data) 
values (replicate(cast('p' as varchar(max))
        , 8001 - len('error')) + 'error')

-- Copy the content to a text column
update @charindex_test 
set text_data = varchar_data

-- Demonstrate the effect of using charindex with text
select 
    text_data 
    , varchar_data
    , datalength(text_data)            as 'text_data_bytes' 
    , charindex('error', text_data)    as 'text_data_pos' 
    , datalength(varchar_data)         as 'varchar_data_bytes' 
    , charindex('error', varchar_data) as 'varchar_data_pos' 
from @charindex_test

select *
from @charindex_test
where text_data like '%error%'

By running the entire script, we’ll get output like the display below:

[Screenshot: result sets from the two queries above]

From this simple demo, we can make the following observations:

  • CHARINDEX can be used with text. However, there is an 8,000-character limit.
  • LIKE, on the other hand, does not suffer from the 8,000-character limitation.
  • If you have to rely on CHARINDEX, cast the text column to varchar(max) first (see the sketch after this list).
  • You cannot blindly trust the official documentation. Always test and see with your own eyes.
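
As a hedged illustration of the third point, the query below applies the cast inline. It assumes it runs in the same batch as the demo script above, since it reads from the @charindex_test table variable:

-- Workaround: cast the text column to varchar(max) before searching,
-- so CHARINDEX is no longer limited to the first 8000 characters.
select
    charindex('error', cast(text_data as varchar(max))) as 'text_data_pos_after_cast'
from @charindex_test

With the cast in place, both rows should report the correct position, including the match that only completes at character 8001.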

A tale of SQL Server installation

I had to reinstall SQL Server 2000 on a box a few weeks ago. The mission sounded simple when I started, but it quickly turned into an eventful drama. Below is a recounting of the events I experienced.

One afternoon, a user reported that he could no longer run reports against a database. He also claimed to have successfully extracted data earlier in the day. The symptom didn’t sound too complicated.

After having difficulty connecting to the SQL Server from SSMS, I went onto the box and discovered that the SQL Server service was stopped.

That was strange, but still no big deal. I clicked on the green start button in the service manager once… twice… three times, and the server still would not start. I sensed something was not right.

A trip to the event viewer revealed the problem:

17052: SQL Server evaluation period has expired.

Huh? All along I hadn’t realised we were running an evaluation version. Now what? We had proper licences, and a reinstallation would be the logical step, but due to the time constraint I didn’t have time to muck around; it was getting late and the user needed his data urgently.

One of my team mates suggested a quick and dirty trick at this point: move the system clock back one day and pretend we were still within the evaluation period. You know what, it actually worked. I thought I was smart, but this guy was a real “problem solver”.

The system clock trick was nice, but it was just a sticky-tape solution; we really needed to install a proper version.

Come the next day, I naturally selected the Enterprise edition (since it was to replace the evaluation Enterprise edition) and happily clicked on setup.exe.

Microsoft SQL server 2000 enterprise edition server component is not supported on this operating system, Only client component will be available for installation.

Oops. I later found out that neither the Enterprise nor the Standard edition was supported on a Windows XP box, so I was left with no choice but to settle for the Developer edition. Fast forward: I clicked on setup.exe again, but this time I was presented with a different message:

A previous program installation created pending file operations on the installation machine. You must restart the computer before running setup.

Fine, so I restarted the box… three times, and the message was still lingering around. Between the restarts I had uninstalled SQL Server 2000 (and inadvertently lost all the DTS packages stored in msdb; luckily I kept a pretty recent backup of msdb, phew!).

In the end, I had to delete a registry entry following instructions from the KB article: SQL Server 2000 installation fails with “…previous program installation…” error message

Thirty minutes later, SQL Server 2000 was successfully installed. OK, time to restore the msdb database to recover all the DTS packages and jobs, etc. It didn’t take me long to locate the msdb backup and type in the command to restore it…

Msg 3168, Level 16, State 1, Line 1
The backup of the system database on device C:\Backup\msdb.bak cannot be restored because it was created by a different version of the server (134219767) than this server (134217922).

Now what?! After staring at the error message for a while, I remembered that the previous SQL Server was on SP4. That had to be it.

Click… click… click, and SP4 was installed without any hiccups. I attempted the msdb restore again and, thank god, it worked this time. I had finally restored the server to the state it was in two days ago. What a “fun” experience!
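
With hindsight, this kind of version mismatch can be spotted before attempting the restore. A minimal check, using the backup path from the error message above, is to read the backup header first; the SoftwareVersionMajor, SoftwareVersionMinor and SoftwareVersionBuild columns reveal which server version and service pack produced the backup:

-- Inspect the backup file without restoring it.
restore headeronly
from disk = 'C:\Backup\msdb.bak'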

How a wrong join made the query MUCH slower

I was monitoring a bunch of batch jobs one day and noticed that one job was taking much longer than usual. Normally the job completes in around 3 minutes, but this time it had been running for 3 hours and was still going.

After running a series of troubleshooting commands like “sp_who2 active” and “dbcc inputbuffer()”, I relied on the query below to identify the runaway query.

select 
    s.session_id  as 'session_id'
    , s.status                  as 'session_status'
    , r.status                  as 'request_status'
    , s.last_request_start_time as 'last_request_start_time'
    , db_name(r.database_id)    as 'database_name'
    , t.text                    as 'command'
    , r.wait_type               as 'wait_type'
    , r.blocking_session_id     as 'blocking_session_id'
    , s.login_name              as 'login_name'
    , s.host_name               as 'host_name'
    , s.program_name            as 'program_name'
    , p.query_plan              as 'query_plan'
from sys.dm_exec_sessions s
left join sys.dm_exec_requests r     on s.session_id = r.session_id
outer apply sys.dm_exec_sql_text(r.sql_handle) t
outer apply sys.dm_exec_query_plan(r.plan_handle) p
where s.is_user_process = 1
and s.status = 'running'
and s.session_id != @@spid

Both blocking_session_id and wait_type were null; the only useful information that could be extracted from the query was the execution plan:

[Screenshot: execution plan with table scans feeding a nested loops join]

Before I go on and explain the execution plan, let me provide some background information. The table was joined against itself and it had 3.8 million rows. Worst of all, the table didn’t have any indexes on it at all, not even a clustered index. Armed with that insight, it was not difficult to see that the execution plan was sub-optimal. The infamous table scans were due to the lack of indexes; unforgivable, but nonetheless they were not the main offender. The nested loops join was really the culprit killing the query. How so? For every row out of the 3.8 million, it was scanning through 3.8 million rows, which can be illustrated by the formula below:

3,800,000 x 3,800,000 = 14,440,000,000,000

Adding an index to the table (and fixing the statistics at the same time) was the most sensible approach to improving the performance of this query. However, the aim of this post is to demonstrate the effect a different join operator had on the query.

I was able to locate a historical execution plan that the query optimizer had generated for the same query. The hash join in that plan made it obvious why the query had never had any problems in the past.

[Screenshot: historical execution plan using a hash join]

That is the same as saying:

3,800,000 + 3,800,000 = 7,600,000

What an astronomical difference! Once the root cause was identified, it was quite easy to work out what to do about it. Possible solutions included:

  • Updating the statistics on the table;
  • Implementing a join hint;
  • Or applying the most effective performance tuning technique: adding an index.

An index was added in the end and the job was completed in no time.
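
For illustration only (the real table and column names are not shown here, so the ones below are hypothetical), the options boil down to something like this. Adding the index is the durable fix; the join hint is the sort of band-aid you would only reach for if the schema cannot be touched:

-- Hypothetical self-joined table dbo.trade_history keyed on trade_ref.
-- Option 1: give the optimizer something to seek on.
create clustered index cix_trade_history_trade_ref
    on dbo.trade_history (trade_ref)
go

-- Option 2 (last resort): force a hash join with a query hint.
select a.trade_ref, b.trade_ref
from dbo.trade_history a
join dbo.trade_history b
    on a.trade_ref = b.trade_ref
option (hash join)
go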

If you would like to get a deeper understanding of the join operators or query execution in general, I highly recommend Craig Freeman’s blog. Below is what he wrote about hash join:

When it comes to physical join operators, hash join does the heavy lifting. While nested loops join works well with relatively small data sets and merge join helps with moderately sized data sets, hash join excels at performing the largest joins. Hash joins parallelize and scale better than any other join and are great at maximizing throughput in data warehouses.

A solution for searching inside DTS packages – Part 2 / 2

In my previous blog post, I had talked about finding a script to be used in my search solution. Once you have that, the rest of the puzzle just falls into place.

I created an SSIS package to call the VBScript and then load the output files into a SQL database.

[Screenshot: SSIS control flow for scripting DTS packages and loading the output]

As depicted in the screenshot above, the solution basically consisted of two loops. One for iterating through the DTS packages on the file system, another for importing the output files into a SQL database.

Due to the simplicity of the solution, I won’t go into detail showing how the whole thing was built. I do, however, have a few tips if you are considering building one to suit your own purposes.

1. In the Execute Process Task, wrap the VBScript inside a batch file:
cscript //nologo "C:\Script_Single_DTS_Package.vbs" "/i:%1" "/o:%2"

2. Use a Flat File source for the file. The content of the file should all be loaded into a single column.

I used SSIS because that was what was available; the same solution could have been implemented entirely in a scripting language such as PowerShell, or better still, using a good old DTS package to complete the cycle.

A solution for searching inside DTS packages – Part 1 / 2

Being able to search through all your source code for impact analysis is nice when you are undergoing database refactoring; it is especially important if not all of your code is wrapped up in stored procedures.

As I was working extensively with (hundreds of) DTS packages at one stage, it was not always feasible to open up each package and inspect the SQL statements within; a more efficient approach was obviously required.

It didn’t take me long to find a script that would become the core component of my solution. All I needed to do was modify the script so it could work with DTS packages stored on the file system. In addition, I went one step further and loaded all the text output into a database; in the end I had effectively built a sys.sql_modules equivalent for DTS packages.
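
To give a flavour of what that looked like, below is a minimal sketch of the landing table and the kind of search it enables. The table and column names are purely illustrative, not the ones I actually used:

-- Hypothetical landing table: one row per DTS package,
-- with the scripted output stored in a single text column.
create table dbo.dts_package_text
(
    package_name     sysname not null
    , package_text   varchar(max) not null
)
go

-- Impact analysis then becomes a simple LIKE search,
-- much like querying sys.sql_modules for stored procedures.
select package_name
from dbo.dts_package_text
where package_text like '%trade_id_lookup%'
go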

My script can be found here. In my next blog post, I will show you how I put all the pieces together and formed a working solution.

Not dropping temp tables is better for performance

I have wondered about this one for some time now but never got to the bottom of it. Whether to drop temp tables at the end of stored procedures or not, I never thought it really mattered to performance, given the beefy machines we get to command these days.

I have asked a few fellow DBAs, and the response was usually along the lines of “yeah, you should drop temp tables, because it is good practice.” That’s just not good enough for me.

After reading Paul White’s blog post one day, an idea came to mind: I could use sys.fn_dblog to see whether dropping temp tables made any difference. I then tried a simple test and reached a conclusion, for my own peace of mind.

I will show you the test I have done and the findings. The code is pasted below if you would like to try it out yourself.

-- Create two procs, one with DROP TABLE and one without.
use tempdb
go

create proc dbo.DropTempTable
as
    create table #demo (i int)
    drop table #demo
go

create proc dbo.NoDropTempTable
as
    create table #demo (i int)
go

Once the procs are created, you can then run the example.

checkpoint -- in order to reset the fn_dblog output
exec dbo.DropTempTable -- Run the first proc
select
    operation,
    [transaction name],
    [transaction id]
from sys.fn_dblog(null, null)
go

checkpoint
exec dbo.NoDropTempTable -- Run the second proc
select
    operation,
    [transaction name],
    [transaction id]
from sys.fn_dblog(null, null)
go

The example above should be self-explanatory. What I want to show you is the output from fn_dblog:

[Screenshot: fn_dblog output for the two procedures]

As you can see, the transaction log entries were identical, except for the extra transaction required to drop the temp table in the first proc. Looking at the result, it is reasonable to conclude that NOT dropping temp tables is better for performance, because SQL Server has less work to do.

To me that settled it, why wouldn’t I write less code if it is better for performance?

–oo00oo–
For more info, please refer to the following links:
Explicitly dropping Temporary Tables and performance – Myth or Fact?

How to format nested REPLACE statements cleanly and efficiently

REPLACE is one of my favourite T-SQL string functions. One of its main attractions is the ability to remove noise words from a large chunk of text. Most of the time, though, you are likely to have quite a few word patterns that you want to filter out, and this makes the code hard to read. See the example below:

select replace(replace(replace(message, 'Microsoft (R) SQL Server Execute Package Utility', ''), 'Version 10.0.5500.0', ''), 'Code: 0xC0202009', '') from msdb..sysjobhistory where run_status = 0

It only takes three or four nested REPLACE calls to make the whole SELECT statement look like a wiggly snake, not to mention those hard-to-track commas and brackets.

Below is a much better way of writing nested REPLACE statements:

select replace(replace(replace(message
    , 'Microsoft (R) SQL Server Execute Package Utility', '')
    , 'Version 10.0.5500.0', '')
    , 'Code: 0xC0202009', '')
from msdb..sysjobhistory
where run_status = 0

By utilising one line per REPLACE, it instantly becomes clear what you are replacing, and the people maintaining your code will thank you for the readability.

Dropdown box in DTS Transform Data Task is too narrow

It is a bit unfortunate that some of us still need to work with DTS packages today. When confronted with problems caused by the antique toolset, the first thing one naturally does is curse the longevity of old technologies.

I encountered one such problem today while working on a simple DTS package. No matter how much I wished for SSIS as my next Christmas present, it was not December. At the end of the day, the problem still needed to be addressed in the DTS package.

The package had a Transform Data Task which loaded a file into a database. I was happily clicking around until I ended up on the Destination tab and was stopped by the dreadful dropdown box. It was not wide enough to show the full table names; all I could see was the first few characters of each name. How would I know which table I had selected?

[Screenshot: Transform Data Task dropdown truncated on the secondary monitor]

I was pretty sure there must be a way around it; after all, DTS packages have been around for yonks, and someone must have seen this behaviour and reported it. After searching on Google for two minutes, I happened to stumble across this little gem. I couldn’t believe my luck: the solution was dead simple, but who would’ve guessed? All I had to do was move the DTS designer window to the primary monitor!

[Screenshot: Transform Data Task dropdown showing full table names on the primary monitor]