Skip to main content

The Purge SQL Agent Job for MDW takes a long time to complete

I use the dbWarden alerts to inform me if a SQL job is taking longer to complete than normal and I got one this morning:
 
Click to enlarge

I noticed by looking at the history this purge job was gradually taking longer and longer to complete each day since I installed it again (see mylast post on this):

RunDate
RunTime
Duration
ExecutionStatus
JobName
04/10/2013
02:00:01
00:00:45
Succeded
mdw_purge_data_[MDW]
05/10/2013
02:00:00
00:13:27
Succeded
mdw_purge_data_[MDW]
06/10/2013
02:00:00
00:17:03
Succeded
mdw_purge_data_[MDW]
07/10/2013
02:00:01
00:30:58
Succeded
mdw_purge_data_[MDW]
08/10/2013
02:00:01
01:04:25
Succeded
mdw_purge_data_[MDW]
09/10/2013
02:00:01
01:41:39
Succeded
mdw_purge_data_[MDW]
10/10/2013
02:00:01
02:35:17
Succeded
mdw_purge_data_[MDW]
11/10/2013
02:00:00
03:34:21
Succeded
mdw_purge_data_[MDW]
12/10/2013
02:00:01
04:12:51
Succeded
mdw_purge_data_[MDW]
13/10/2013
02:00:01
04:54:04
Succeded
mdw_purge_data_[MDW]
14/10/2013
02:00:00
05:38:58
Succeded
mdw_purge_data_[MDW]
15/10/2013
02:00:01
07:07:13
Succeded
mdw_purge_data_[MDW]
16/10/2013
02:00:00
08:21:11
Cancelled
mdw_purge_data_[MDW]

The last job I cancelled so I could figure out what was going on. A quick Google took me to this page which recommended adding a couple of indexes. Before I did so I took a look at the size of both the tables mentioned in the article:

sp_spaceused  '[snapshots].[query_stats]'

name
rows
reserved
data
index_size
unused
query_stats
5,702,294
599072 KB
587616 KB
11320 KB
136 KB

sp_spaceused '[snapshots].[notable_query_text]'

name
rows
reserved
data
index_size
unused
notable_query_text
26,383
34872 KB
33616 KB
272 KB
984 KB

...And the execution plan of the following query before I made any changes:

SELECT qt.[sql_handle]
INTO #tmp_notable_query_text_test
FROM snapshots.notable_query_text AS qt
WHERE NOT EXISTS (
   SELECT snapshot_id
   FROM snapshots.query_stats AS qs
   WHERE qs.[sql_handle] = qt.[sql_handle])

Execution Plan Before

Click to enlarge

You can clearly see that the scan of CIDX_query_stats costs by far the most of any operation in the query.
So I applied the following indexes as recommended in the article and looked at the execution plan of the query again:

USE [MDW]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [IX_sql_handle]
    ON [snapshots].[query_stats]
          (
          [sql_handle] ASC
          )
       GO

CREATE NONCLUSTERED INDEX [IX_sql_handle]
    ON [snapshots].[notable_query_text]
          (
          [sql_handle] ASC
          )  
GO




Execution Plan After

Click to enlarge

Now various changes have happened to the query play including a Nested Loop operator replacing the Hash Match and an Index Seek replacing the Index Scan.

I ran the job manually and it completed much faster :

RunDate
RunTime
Duration
ExecutionStatus
JobName
16/10/2013
10:41:56
00:05:32
Succeded
mdw_purge_data_[MDW]

But much of the hard work would have already been done from the previous night’s schedule, so I will need to wait until tonight to see if the time had be reduced.

**Update**

It looks like the queries have done their job. The  duration has greatly improved:

RunDate
RunTime
Duration
ExecutionStatus
JobName
17/10/2013
02:00:00
00:21:00
Succeded
mdw_purge_data_[MDW]


A word of warning

Even though the MDW has been active for only two weeks and I am monitoring only one medium-busy server, nearly 14GB of data has been generated:

Click to enlarge




Comments

  1. Very useful. I have many customers that simply LOVE the Data Collector interface. Unfortunately, DC is full of problems (random errors in collection jobs, database growing too much, report errors).

    ReplyDelete

Post a Comment

Popular posts from this blog

How to move the Microsoft Assessment and Planning Toolkit (MAP) database to a different drive

The Microsoft Assessment and Planning Toolkit (MAP) is a very useful tool for scanning your network to find instances of SQL Server plus all manner of detailed information about the installed product, OS and hardware it sits on.


<Click image to enbiggen>
There is an issue with it the database it uses to store the data it collects, however. Assuming you don't have an instance called MAPS on your server, the product will install using LocalDB (a cut down version of SQL Server Express) and puts the databases on your C: drive. If you then scan a large network you could easily expand the database to 10GB which may cause issues on a server when that drive is often one of the smallest. However, there is a simple solution: connect to LocalDB using Management Studio, detach the databases, move to a different drive, set permissions on the new location if required and reattach the database. How do you connect to LocalDB? Here you go:

Connect to (localdb)\MAPTOOLKIT


The databases I move…

SAN performance testing using SQLIO

Introduction
This document describes how to use Microsoft’s SQLIO to test disk/SAN performance. It is biased towards SQL Server – which uses primarily 64KB and 8KB data pages so I am running the tests using those cluster sizes, however, other sizes can be specified.  Download SQLIO from https://www.microsoft.com/en-gb/download/details.aspx?id=20163 SQLIO is a command line tool with no GUI so you need to open a command prompt at C:\Program Files (x86)\SQLIO after you have installed it. Configuration First of all edit param.txt so that you create the test file we will be using. The file needs to be bigger than the combined RAID and on-board disk caches. In this case we are using a 50GB file.
The “2” refers to the number of threads to use when testing, you don’t need to change this now. The “0x0” value indicates that all CPUs should be used, which you probably don’t want to change either, “#” is a comment. The only part you may want to change is 51200 (50GB) and the drive letter. After …

How to configure the SSAS service to use a Domain Account

NB Updating SPNs in AD is not for the faint hearted plus I got inconsistent results from different servers. Do so at your own risk! If you need the SSAS account on a SQL Server to use a domain account rather than the local “virtual” account “NT Service\MSSQLServerOLAPService”. You may think you just give the account login permissions to the server, perhaps give it sysadmin SQL permissions too. However, if you try and connect to SSAS remotely you may get this error:

Authentication failed. (Microsoft.AnalysisService.AdomdClient) The target principal name is incorrect (Microsoft.AnalysisService.AdomdClient)

From Microsoft: “A Service Principle Name (SPN) uniquely identifies a service instance in an Active Directory domain when Kerberos is used to mutually authenticate client and service identities. An SPN is associated with the logon account under which the service instance runs. For client applications connecting to Analysis Services via Kerberos authentication, the Analysis Services clien…