Estimating Data Compression Savings in SQL Server 2008 and 2008 R2

Since I finished upgrading the last of my Production environment to SQL Server 2008 R2 running on Windows Server 2008 R2 a couple of weeks ago, I have been spending some time re-evaluating which indexes can take advantage of data compression in SQL Server 2008 R2. By design, there is no “Compress Entire Database” command in SQL Server 2008 R2. Instead, you need to evaluate individual indexes, based on their size, estimated savings and volatility. The ideal case is a large table that shows very good compression savings that is read-only. A bad candidate is a small table, that does not show much compression savings, with very volatile data.

I quickly got tired of manually running sp_estimate_data_compression_savings with hard-coded parameters for each index in a database, so I decided to write some T-SQL that would somewhat automate the process.  If you set the schema name, table name, and desired data compression type in the variable declarations at the top of the script, you will get some pretty detailed information about all of the indexes in that table.

You should run the entire query at once, after you have supplied your own values. It may take a few seconds to run, depending on your hardware and on how large your tables are.  You could also wrap part of this in a stored procedure that you could call for each table in a database, and have it write the results out to a table that you could easily query later.

-- SQL Server 2008 and R2 Data Compression Queries
-- Glenn Berry 
-- August 2010
-- Twitter: GlennAlanBerry

-- Get estimated data compression savings and other index info
-- for every index in the specified table
DECLARE @SchemaName sysname = N'dbo';                  -- Specify schema name
DECLARE @TableName sysname = N'WhiteListSearchLog';    -- Specify table name
DECLARE @IndexID int = 1;
DECLARE @CompressionType nvarchar(60) = N'PAGE'        -- Specify data compression type (PAGE, ROW, or NONE)

-- Get Table name, row count, and compression status 
-- for clustered index or heap table
SELECT OBJECT_NAME(object_id) AS [ObjectName], 
SUM(Rows) AS [RowCount], data_compression_desc AS [CompressionType]
FROM sys.partitions 
WHERE index_id < 2 --ignore the partitions from the non-clustered index if any
AND OBJECT_NAME(object_id) = @TableName
GROUP BY object_id, data_compression_desc

-- Breaks down buffers used by current database by object (table, index) in the buffer pool
SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName], 
p.index_id, COUNT(*)/128 AS [Buffer size(MB)],  COUNT(*) AS [BufferCount], 
p.data_compression_desc AS [CompressionType]
FROM sys.allocation_units AS a
INNER JOIN sys.dm_os_buffer_descriptors AS b
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p
ON a.container_id = p.hobt_id
WHERE b.database_id = DB_ID()
AND OBJECT_NAME(p.[object_id]) = @TableName
AND p.[object_id] > 100
GROUP BY p.[object_id], p.index_id, p.data_compression_desc
ORDER BY [BufferCount] DESC;

-- Shows you which indexes are taking the most space in the buffer cache

-- Get current and estimated size for every index in specified table
    -- Get list of index IDs for this table
    SELECT s.index_id
    FROM sys.dm_db_index_usage_stats AS s
    WHERE OBJECT_NAME(s.[object_id]) = @TableName
    AND s.database_id = DB_ID()
    ORDER BY s.index_id;

OPEN curIndexID;

FROM curIndexID
INTO @IndexID;

-- Loop through every index in the table and run sp_estimate_data_compression_savings

        -- Get current and estimated size for specified index with specified compression type
        EXEC sp_estimate_data_compression_savings @SchemaName, @TableName, @IndexID, NULL, @CompressionType;

        FETCH NEXT
        FROM curIndexID
        INTO @IndexID;


CLOSE curIndexID;

-- Index Read/Write stats for a single table
SELECT OBJECT_NAME(s.[object_id]) AS [TableName], AS [IndexName], i.index_id,
SUM(user_seeks) AS [User Seeks], SUM(user_scans) AS [User Scans], 
SUM(user_lookups)AS [User Lookups],
SUM(user_seeks + user_scans + user_lookups)AS [Total Reads], 
SUM(user_updates) AS [Total Writes]     
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.[object_id] = i.[object_id]
AND i.index_id = s.index_id
WHERE OBJECTPROPERTY(s.[object_id],'IsUserTable') = 1
AND s.database_id = DB_ID()
AND OBJECT_NAME(s.[object_id]) = @TableName
GROUP BY OBJECT_NAME(s.[object_id]),, i.index_id
ORDER BY [Total Writes] DESC, [Total Reads] DESC;

-- Get basic index information (does not include filtered indexes or included columns)
EXEC sp_helpindex @TableName;

-- Get size and available space for files in current database
SELECT name AS [File Name] , physical_name AS [Physical Name], size/128.0 AS [Total Size in MB],
size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS [Available Space In MB], [file_id]
FROM sys.database_files;

This entry was posted in SQL Server 2008 R2. Bookmark the permalink.

1 Response to Estimating Data Compression Savings in SQL Server 2008 and 2008 R2

  1. Brian Dishaw says:

    This is awesome, thanks for sharing!

    I ran into a small issue with this query as my table name exists outside of the dbo schema. I tweaked this line in order to fix the issue.

    — Get basic index information (does not include filtered indexes or included columns)
    EXEC sp_helpindex @TableName;


    — Get basic index information (does not include filtered indexes or included columns)
    DECLARE @FullName NVARCHAR(776)
    SET @FullName = ( @SchemaName + ‘.’ + @TableName );

    EXEC sp_helpindex @FullName

    I chose 776 because that’s what the parameter of sp_helpindex takes.

    I hope this helps others.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s