October 2011 SQL Server 2008 Diagnostic Information Queries

Since Microsoft has been so busy releasing Cumulative Updates for both SQL Server 2008 and SQL Server 2008 R2 over the past few days, I thought it would be a good time to release a new version of my SQL Server 2008 Diagnostic Information Queries. Most of these queries will work on either SQL Server 2008 or SQL Server 2008 R2, but a few (which are noted in the comments) only work on SQL Server 2008 R2 SP1 or later.

That is yet another reason to read the instructions, and run each query one at a time, rather than running them all in one batch. Most of the queries also require VIEW SERVER STATE permission.

The top half of the queries are instance level, so it does not matter what database you are pointing to when you run them, but the second half of the queries are database specific, which means you don’t want to be pointing at the master database when you run them.

Please let me know what you think of these queries. Thanks!

This entry was posted in SQL Server 2008, SQL Server 2008 R2 and tagged . Bookmark the permalink.

10 Responses to October 2011 SQL Server 2008 Diagnostic Information Queries

  1. Jason says:

    I love your work on this, and some of the new information available in SP1 of R2. The mount point information from dm_os_volume_stats is exciting for those of us with mutiple instances on a SQL Cluster and using Mount Points. I have a question, in your query of dm_os_volume_stats, you report available_bytes / total_bytes as %used…. don’t you mean %free?

  2. Dev Nambi says:

    Hi Glenn,

    I’ve been using these for a few weeks now, and they are *fantastic*. I’ve been training my DBAs to use these to look deeper for performance problems, bottlenecks, and waits, and it is incredibly useful to have these pre-written. They are also very comprehensive, so I don’t have to modify them very often to get what I need.

    Assuming you have no objections, I’d like to put them in a utility/tools database that I use.

  3. Amanda says:

    I feel foolish, but i don’t quite understand what the “running percent” is supposed to be in the wait stat query. Also, I have found quite a few other wait types that i’m wondering if should be excluded…do you still recommend evaluting everything that isn’t in the list?

  4. Glenn Berry says:

    Amanda,
    No need to feel foolish. The running percent is just the running total percent of all top wait types. If the top wait type was 60%, and the second top wait type was 26%, the the running total of the two would be 86%.

    It depends on what other wait typpes that you are seeing that are not in the exclude list. How is your server running, overall? If it is not under stress, then the top wait types are not quite as important.

  5. Amanda says:

    We have a LOT of problems. I use your script to capture wait stats every night, after which i clear them out, so this is for one of our servers for one day:
    Wait Stats For 12/05/2011
    Wait Type Total Wait Time Percent
    WRITELOG 143038.46 19.19
    OLEDB 135694.05 18.21
    LCK_M_X 119374.26 16.02
    TRACEWRITE 86055.76 11.55
    CXPACKET 56707.93 7.61
    LCK_M_IX 49726.20 6.67
    SOS_SCHEDULER_YIELD 22515.13 3.02
    PAGEIOLATCH_EX 22202.33 2.98
    IO_COMPLETION 17219.48 2.31
    LCK_M_U 14942.88 2.00
    LCK_M_IS 11533.95 1.55
    PAGEIOLATCH_SH 11381.42 1.53
    LCK_M_S 10314.74 1.38
    LATCH_EX 7836.33 1.05

    I’ve done some research, and based on what I found I’m filtering out the following – in alphabetical order 😉 I am not sure if given our performance issues I should be including these or not…

    WHERE wait_type NOT IN (

    ‘BROKER_EVENTHANDLER’,
    ‘BROKER_RECEIVE_WAITFOR’,
    ‘BROKER_TASK_STOP’,
    ‘BROKER_TO_FLUSH’,
    ‘BROKER_TRANSMITTER’,
    ‘CHECKPOINT_QUEUE’,
    ‘CLR_AUTO_EVENT’,
    ‘CLR_MANUAL_EVENT’,
    ‘CLR_SEMAPHORE’,
    ‘DISPATCHER_QUEUE_SEMAPHORE’,
    ‘FT_IFTS_SCHEDULER_IDLE_WAIT’,
    ‘FT_IFTSHC_MUTEX’,
    ‘KSOURCE_WAKEUP’,
    ‘LAZYWRITER_SLEEP’,
    ‘LOGMGR_QUEUE’,
    ‘ONDEMAND_TASK_QUEUE’,
    ‘REQUEST_FOR_DEADLOCK_SEARCH’,
    ‘RESOURCE_QUEUE’,
    ‘SLEEP_BPOOL_FLUSH’,
    ‘SLEEP_SYSTEMTASK’,
    ‘SLEEP_TASK’,
    ‘SQLTRACE_BUFFER_FLUSH’,
    ‘SQLTRACE_INCREMENTAL_FLUSH_SLEEP’,
    ‘WAITFOR’,
    ‘XE_DISPATCHER_JOIN’,
    ‘XE_DISPATCHER_WAIT’,
    ‘XE_TIMER_EVENT’

    )

  6. Dev Nambi says:

    Amanda:

    You may want to check the speed & disk configuration of your transaction log drive(s). Also, OLEDB can happen due to linked server, openquery, openrowset.

    Also, there’s a *fantastic* document at http://technet.microsoft.com/en-us/library/cc966413.aspx that goes over all of the different waits and queue types, how to troubleshoot them, and what to look for.

    Cheers,
    Dev

  7. Amanda says:

    Dev:
    Yeah, we do have an i/o problem, due in no small part to poorly written queries, improperly configured SAN and horrible indexing. It’s actually worse than above, I realized I messed the query up when I pasted in the data, it’s actually:

    WRITELOG 131781.00 27.29
    TRACEWRITE 86045.67 17.82
    CXPACKET 79018.41 16.37
    OLEDB 57796.44 11.97
    SOS_SCHEDULER_YIELD 32907.11 6.82
    LCK_M_IX 16830.55 3.49
    IO_COMPLETION 12155.73 2.52
    PAGEIOLATCH_SH 9532.24 1.97
    LATCH_EX 8669.45 1.80
    LCK_M_IS 8666.69 1.79
    ASYNC_NETWORK_IO 6179.78 1.28
    PAGEIOLATCH_EX 4914.67 1.02
    ASYNC_IO_COMPLETION 4905.90 1.02

    • Dev Nambi says:

      Amanda,

      I feel your pain. I’ve recently inherited some databases that have badly written queries and a craptastic SAN, and even doing some tuning/fixes is proving to be a nightmare.

      It’s interesting that TRACEWRITE is 17.8% of your waits. Do you have a lot of profiler tracers running? Maybe server-side traces?

  8. Amanda says:

    I’ve been trying to figure that out…Joe Sack had a great query that looks at traces running and who is running them but i don’t see anything other than the default trace and the quest deadlock trace (SOSS), and i’m trying to find out how much overhead that really is. I’m trying to figure out if i should even continue to include that wait type.

Leave a reply to Amanda Cancel reply