A Small Collection of I/O Specific DMV Queries

It is pretty common for large, busy SQL Server implementations to run into I/O pressure or bottlenecks, especially since many people seem to buy “big” servers with lots of CPU capacity and lots of RAM, but they treat the I/O subsystem as an afterthought. I have run into this situation enough lately, that I thought I would write a little about it.

The situation is not improved when you have SQL Server DBAs and SAN engineers working for different parts of large organizations, that may have different goals. If you suspect that your SQL Server instance is having I/O issues, these queries can help furnish proof that you are under I/O pressure, and they can help pinpoint what is causing the problem.  As with most DMV queries, you will need VIEW SERVER STATE permission to run most of them.

I always like to run the Top Wait Stats rollup query first. If you see I/O related wait types at the top of the list, then you should start looking at the problem in more detail with the rest of the queries below:

-- Isolate top waits for server instance since last restart or statistics clear
WITH Waits AS
(SELECT wait_type, wait_time_ms / 1000. AS wait_time_s,
100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('CLR_SEMAPHORE','LAZYWRITER_SLEEP','RESOURCE_QUEUE','SLEEP_TASK'
,'SLEEP_SYSTEMTASK','SQLTRACE_BUFFER_FLUSH','WAITFOR', 'LOGMGR_QUEUE','CHECKPOINT_QUEUE'
,'REQUEST_FOR_DEADLOCK_SEARCH','XE_TIMER_EVENT','BROKER_TO_FLUSH','BROKER_TASK_STOP','CLR_MANUAL_EVENT'
,'CLR_AUTO_EVENT','DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT'
,'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN'))
SELECT W1.wait_type, 
CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,
CAST(W1.pct AS DECIMAL(12, 2)) AS pct,
CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct
FROM Waits AS W1
INNER JOIN Waits AS W2
ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
HAVING SUM(W2.pct) - W1.pct < 95; -- percentage threshold

-- Common Significant I/O Wait types with BOL explanations

-- *** I/O Related Waits ***
-- ASYNC_IO_COMPLETION  Occurs when a task is waiting for I/Os to finish
-- IO_COMPLETION        Occurs while waiting for I/O operations to complete. 
--                      This wait type generally represents non-data page I/Os. Data page I/O completion waits appear 
--                      as PAGEIOLATCH_* waits
-- PAGEIOLATCH_SH        Occurs when a task is waiting on a latch for a buffer that is in an I/O request. 
--                      The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem.
-- PAGEIOLATCH_EX        Occurs when a task is waiting on a latch for a buffer that is in an I/O request. 
--                      The latch request is in Exclusive mode. Long waits may indicate problems with the disk subsystem.
-- WRITELOG             Occurs while waiting for a log flush to complete. 
--                      Common operations that cause log flushes are checkpoints and transaction commits.
-- PAGELATCH_EX            Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. 
--                      The latch request is in Exclusive mode.
-- BACKUPIO                Occurs when a backup task is waiting for data, or is waiting for a buffer in which to store data
 
 

-- Some I/O Specific DMV Queries
-- Glenn Berry
-- December 2009
-- http://glennberrysqlperformance.spaces.live.com/
-- Twitter: GlennAlanBerry

-- Always look at Avg Disk Sec/Read and Avg Disk Sec/Write in PerfMon for each Physical Disk 

-- Check for IO Bottlenecks (run multiple times, look for values above zero)
SELECT cpu_id, pending_disk_io_count 
FROM sys.dm_os_schedulers
WHERE [status] = 'VISIBLE ONLINE';

-- Look at average for all schedulers (run multiple times, look for values above zero)
SELECT AVG(pending_disk_io_count) AS [AvgPendingDiskIOCount]
FROM sys.dm_os_schedulers 
WHERE [status] = 'VISIBLE ONLINE';


-- High Latch waits (SH and EX) indicates the I/O subsystem is too busy 
-- the wait time indicates time waiting for disk
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS [io_wait_time_ms]
FROM sys.dm_os_wait_stats
WHERE wait_type IN('PAGEIOLATCH_EX', 'PAGEIOLATCH_SH', 'PAGEIOLATCH_UP')
ORDER BY wait_type;

-- Analyze Database I/O, ranked by IO Stall%
WITH DBIO AS
(SELECT DB_NAME(IVFS.database_id) AS db,
 CASE WHEN MF.type = 1 THEN 'log' ELSE 'data' END AS file_type,
 SUM(IVFS.num_of_bytes_read + IVFS.num_of_bytes_written) AS io,
 SUM(IVFS.io_stall) AS io_stall
 FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS IVFS
 INNER JOIN sys.master_files AS MF
 ON IVFS.database_id = MF.database_id
 AND IVFS.file_id = MF.file_id
 GROUP BY DB_NAME(IVFS.database_id), MF.[type])
SELECT db, file_type, 
  CAST(1. * io / (1024 * 1024) AS DECIMAL(12, 2)) AS io_mb,
  CAST(io_stall / 1000. AS DECIMAL(12, 2)) AS io_stall_s,
  CAST(100. * io_stall / SUM(io_stall) OVER()
       AS DECIMAL(10, 2)) AS io_stall_pct,
  ROW_NUMBER() OVER(ORDER BY io_stall DESC) AS rn
FROM DBIO
ORDER BY io_stall DESC;


-- Average stalls per read, write and total
SELECT DB_NAME(database_id) AS [Database Name], file_id, io_stall_read_ms, num_of_reads,
CAST(io_stall_read_ms/(1.0+num_of_reads) AS numeric(10,1)) AS [avg_read_stall_ms],
io_stall_write_ms, num_of_writes,
CAST(io_stall_write_ms/(1.0+num_of_writes) AS numeric(10,1)) AS [avg_write_stall_ms],
io_stall_read_ms + io_stall_write_ms AS io_stalls,
num_of_reads + num_of_writes AS total_io,
CAST((io_stall_read_ms+io_stall_write_ms)/(1.0+num_of_reads + num_of_writes) 
AS numeric(10,1)) AS [avg_io_stall_ms]
FROM sys.dm_io_virtual_file_stats(null,null)
ORDER BY avg_io_stall_ms DESC;


-- Calculates average stalls per read, per write, and per total input/output for each database file. 
SELECT DB_NAME(database_id) AS [Database Name], file_id ,io_stall_read_ms, num_of_reads,
CAST(io_stall_read_ms/(1.0 + num_of_reads) AS NUMERIC(10,1)) AS [avg_read_stall_ms],io_stall_write_ms, 
num_of_writes,CAST(io_stall_write_ms/(1.0+num_of_writes) AS NUMERIC(10,1)) AS [avg_write_stall_ms],
io_stall_read_ms + io_stall_write_ms AS [io_stalls], num_of_reads + num_of_writes AS [total_io],
CAST((io_stall_read_ms + io_stall_write_ms)/(1.0 + num_of_reads + num_of_writes) AS NUMERIC(10,1)) 
AS [avg_io_stall_ms]
FROM sys.dm_io_virtual_file_stats(null,null)
ORDER BY avg_io_stall_ms DESC;


-- Which queries are causing the most IO operations (can take a few seconds)
SELECT TOP (20) total_logical_reads/execution_count AS [avg_logical_reads],
    total_logical_writes/execution_count AS [avg_logical_writes],
    total_worker_time/execution_count AS [avg_cpu_cost], execution_count,
    total_worker_time, total_logical_reads, total_logical_writes, 
    (SELECT DB_NAME(dbid) + ISNULL('..' + OBJECT_NAME(objectid), '') 
     FROM sys.dm_exec_sql_text([sql_handle])) AS query_database,
    (SELECT SUBSTRING(est.[text], statement_start_offset/2 + 1,
        (CASE WHEN statement_end_offset = -1
            THEN LEN(CONVERT(nvarchar(max), est.[text])) * 2
            ELSE statement_end_offset
            END - statement_start_offset
        ) / 2)
        FROM sys.dm_exec_sql_text(sql_handle) AS est) AS query_text,
    last_logical_reads, min_logical_reads, max_logical_reads,
    last_logical_writes, min_logical_writes, max_logical_writes,
    total_physical_reads, last_physical_reads, min_physical_reads, max_physical_reads,
    (total_logical_reads + (total_logical_writes * 5))/execution_count AS io_weighting,
    plan_generation_num, qp.query_plan
FROM sys.dm_exec_query_stats
OUTER APPLY sys.dm_exec_query_plan([plan_handle]) AS qp
WHERE [dbid] >= 5 AND (total_worker_time/execution_count) > 100
ORDER BY io_weighting DESC;



-- Top Cached SPs By Total Physical Reads (SQL 2008). Physical reads relate to disk I/O pressure
SELECT TOP(25) p.name AS [SP Name],
qs.total_physical_reads AS [TotalPhysicalReads], qs.total_physical_reads/qs.execution_count AS [AvgPhysicalReads],
ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0) AS [Calls/Second],
qs.total_logical_reads AS [TotalLogicalReads], qs.total_logical_reads/qs.execution_count AS [AvgLogicalReads],
qs.total_worker_time AS [TotalWorkerTime], qs.total_worker_time/qs.execution_count AS [AvgWorkerTime], 
qs.execution_count, 
qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count AS [avg_elapsed_time], qs.last_elapsed_time,
qs.cached_time 
FROM sys.procedures AS p
INNER JOIN sys.dm_exec_procedure_stats AS qs
ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
ORDER BY qs.total_physical_reads DESC;
    
    
-- Top Cached SPs By Total Logical Writes (SQL 2008). Logical writes relate to both memory and disk I/O pressure 
SELECT TOP(25) p.name AS [SP Name],
qs.total_logical_writes AS [TotalLogicalWrites], qs.total_logical_writes/qs.execution_count AS [AvgLogicalWrites],
ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0) AS [Calls/Second],
qs.total_logical_reads AS [TotalLogicalReads], qs.total_logical_reads/qs.execution_count AS [AvgLogicalReads],
qs.total_worker_time AS [TotalWorkerTime], qs.total_worker_time/qs.execution_count AS [AvgWorkerTime], 
qs.execution_count, 
qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count AS [avg_elapsed_time], qs.last_elapsed_time,
qs.cached_time
FROM sys.procedures AS p
INNER JOIN sys.dm_exec_procedure_stats AS qs
ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
ORDER BY qs.total_logical_writes DESC;

This entry was posted in SQL Server 2008. Bookmark the permalink.

17 Responses to A Small Collection of I/O Specific DMV Queries

  1. John says:

    Hi Glenn,I’m a huge fan of using DMV’s to troubleshoot SQL Server and you have kindly provided some really great queries here. I will certainly be adding then to my script bank.Cheers,

  2. Glenn says:

    Glad you enjoyed them, John

  3. Paul says:

    Glen, great utils, thanks. I thought tempdb was my bottleneck until I ran the above, and turns out it\’s a userdb with 69% stall pct, I need to dig further.

  4. Ron says:

    Hi I ran your queries above and found one of my databases has an io_stall_pct of 97% , any tips for what next to focus on? I found your page because I have been researching how best to layout my database files (# of files, #of files/filegroup) and also playing with MAXDOP setting on a 24 core server. I just this week reduced MAXDOP to 1 and was planning on incrementing it 1 by 1 to find the best setting for us, previously it was set to 0 so SQL managed it. The db showing the 97% stall pct has 1 file group with much heavier usage then the others, and I currently have 3 files in this fg spread over 3 mount points on our san. I thought we only benefited from having as many files in an FG as there were different disks so that’s how I came to the #3 files. Do you think having MAXDOP un-restricted in combination with only 3 files in this heavily used fg could have contributed to the amazingly high stall %? I have a nearly identical system to this on identical hardware with same file structure and this db has a 46% stall pct there, still high but not shockingly high. Thanks in advance for any inputs you can give.

    • Glenn Berry says:

      Ron,

      Is it the data file(s) or the log file that has a 97% io_stall_pct? What kind of workload do you have? What kind of storage is it? What RAID level, how many spindles behind the logical drive? You could run my complete set of diagnostic queries to get a better overall picture of what is causing the pain.

  5. Ron says:

    Glenn thanks so much for responding so quickly. To answer your questions, 1) its the data files with the 97% io_stall 2) we are still in development, and are running ET&L batches almost exclusively in this env. The batches run overnight between work days, they are updating 10’s of millions of rows per batch and the underlying sprocs can run for 45mins to 1.5 hours, this is what I’m working on tuning. 3) we have a Clariion SAN at raid 5, I have an email out on the #/spindles, I don’t know that off the top of my head. I have run pretty much all of your diag queries (thanks for posting these) and I did find something curious. As I mentioned I use file groups and know which one of my fg’s is having these problems, however when I run this query (– Calculates average stalls per read, per write, and per total input/output for each database file.) I see 2 of the 3 files in this FG listed as #’s 1 and 2 in avg_io_stall_ms at 121 and 113.7, but the 3rd file in this Fg is all the way down in the 33rd place with only 13.1 avg_io_stall_ms. I would think that SQL is distributing the data evenly across these files so this is puzzling to me.

  6. Neil says:

    Hi Glen

    A quick question on the isolate top wait states query. You have two columns with a percentage in it. First one pct and the other as running_pct. What is the difference between the two? Should I be looking at just one value or the two in combination.

    Thanks!
    Neil

    • Neil says:

      Hi Glen

      Sorry my bad I just looked over it much more proper. I get it now. Just looking at percentage is the way. I have the top wait time of CXPACKET with over 66% on that. We are running a major ETL system Bascially extracting from SAP into SQL applying fixes and then exporting again. Think once a week truncate table reimport data do validations redo. Now CXPACKET from my understanding is not bad on this type of Batch processing system(OLAP) but could be bad on a Transactional System(OLTP).

      I have noticed severe CPU usage when the CXPACKET wait type is the most common wait type when reports are running. These reports are on BIG tables with the majority tables over 500 million up to 1,3 Billion rows. I suspect the parrallelism in the queries is causing issues as the same tables are used in the reports and I can see queued tasks on the CPU reaching as high as 10 queued tasks. My feeling is that the queries are issues a lot of lock in parrallel which is slowing the other queries down. Since I dont have a development environment I can predict what dropping the MAXDOP settings would do. Your thoughts would be appreciated.

      Regards
      Neil

      • Glenn Berry says:

        Neil,

        How many processor cores do you have, and what kind of processors do you have? Quite often, CXPACKET waits are caused by missing indexes, but that is more common with OLTP workloads.

        If you are under severe CPU pressure, lowering MAXDOP some might help relieve the CPU pressure at the cost of making some of the parallelized queries take longer to run.

      • Neil says:

        Hi Glen.

        I checked for missing indexes. The server is hosted by a rather large international firm whose main business is business machines. They are not exactly forthcomming with the server specifications. It has 8 cores but not sure if it is 2X Quad Core or 4XDual core. My access to the server is quite restricted. So having a hard time getting all the information.
        I suspect that there might be bad or missing statistics on the tables/views being used. This would lead to SQL not creating a optimal plan. Is this assumption correct? I will investigate a little more.

        Regards
        Neil

      • Neil says:

        Hi Glen

        Found the problem. Enforced JOIN Hints. Seems like some bright spark decided that HASH joins should be enforced in the queries. We have had massive issues with space in tempdb as well. So i started the process of elimination. I knew CXPACKET should be beneficial to a OLAP/DW only database so I was not happy just dropping the MAXDOP settings. I checked for missing indexes, old statistics and then finally went into the queries themselves.

        Alarm bells started ringing when I saw SQL saying Warning: Local Join Types Enforced.

        Here is the results(once again proving that SQL is better off making the choices). I had a query with the hash join enforced running more than 45 minutes. I removed the HASH join ran the query again after freeing buffers making sure there was no cached plans etc. The query now runs in 15 seconds! That is like 148 000% improvement.

        Anyway thanks for the queries above though they did not help me solve the problem it was a interesting learning experience.

        Great Blog!

        Regards
        Neil

  7. saggi says:

    Hi In my case TempDB is geeting full (Data File) Fastely , we are using this server for reporting perposes , I checked these DMVs
    Databse Name I/O Stl_read No.of reads Avg_read_stl i/o stl_write no. of writes
    tempdb 824092 221536 3.7 16531884 257911
    avg.write stl
    64.1 17355976 479447 36.2

    After every 3-4 days its getting full (127 GB)

    Inial Size of TempDB is 88GB and Reportdatabase size is 8GB only

    Regards

  8. Pingback: Was beeinflusst die SharePoint Performance (Teil2) | Tobias Wolter IT- Solutions & Consulting

  9. Pingback: What impact can different cursor options have?

  10. Murad says:

    Thanks for these descriptive scripts – really useful.

    Can I just ask – when I ran — Analyze Database I/O, ranked by IO Stall% and subsequent average IO script – I keep getting one database as the culprit with over 93%.

    However when I run — Which queries are causing the most IO operations (can take a few seconds)
    This database does not show up at all in results.

    Why ?

  11. Pingback: A SQL Server DBA myth a day: (12/30) tempdb should always have one data file per processor core | Paul S. Randal

  12. Pingback: Benchmarking: 1-TB table population (part 2: optimizing log block IO size and how log IO works) | Paul S. Randal

Leave a comment