Testing Native Backup Compression in Combination with SQL Server 2008 R2 Data Compression

SQL Server 2008 introduced native backup compression and data compression as features that were only available in the Enterprise Edition SKU. In SQL Server 2008 R2, native backup compression was added to the Standard Edition SKU, while data compression remains an Enterprise Edition feature, although it was improved with the addition of Unicode compression.

I have been using both native backup compression and data compression with SQL Server 2008 and 2008 R2 in production with very good results for quite a while. I thought it would be interesting to run a few somewhat formal tests to investigate how well these two features work when they are used together. I wanted to use some actual production data rather than loading up an empty table with bogus data generated with REPLICATE(). I happened to have a backup of one of my smaller production databases from a few months ago on an external USB drive, which allowed me to do some testing with real data.

To keep things relatively simple, I used SELECT INTO to copy a 151 million row table into a new, empty database on the same instance on my teaching laptop. Then I created an uncompressed clustered index on the heap that the SELECT INTO statement had created. I took a full compressed backup of that new database and restored it twice, creating two copies with different names but the same data. I then rebuilt the clustered index on the single table in each copy, using either ROW or PAGE compression.

After doing this, I had a lot of empty space in the data files of the two databases that used data compression, so I decided to risk the wrath of Paul Randal and shrink the data file on each of them. Shrinking data files is actually a very evil thing to do, since it severely fragments your indexes. Sure enough, after I shrank the data files on the two databases, the single clustered index in each had over 99% fragmentation. Running ALTER INDEX REORGANIZE pretty effectively removed that fragmentation, though. (A sketch of the rebuild, shrink, and reorganize commands follows the size table below.)

After all of these gyrations on my poor Toshiba Portege R705 laptop, I had three databases that were identical except for the type of data compression used on the clustered index of the single table in each database. Here are the database sizes for the three databases:

Database Name              Data File Size     Log File Size
NoCompressionTest          18.6GB             512MB
RowCompressionTest         10.6GB             512MB
PageCompressionTest         4.1GB             512MB
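
For reference, here is roughly what the compression-related steps looked like, shown for the RowCompressionTest copy (a sketch; the index and table names match the schema shown later in this post, while the logical file name and shrink target size are illustrative):

-- Rebuild the clustered index with the desired type of data compression
ALTER INDEX [PK_OnlineSearchHistoryNonCompressed] 
ON [dbo].[OnlineSearchHistoryNonCompressed]
REBUILD WITH (DATA_COMPRESSION = ROW); -- or PAGE, or NONE
GO

-- Reclaim the empty space left behind by the rebuild
-- (evil, because it severely fragments the index)
DBCC SHRINKFILE (N'RowCompressionTest', 10600); -- logical data file name, target size in MB
GO

-- Check how badly the shrink fragmented the clustered index
SELECT avg_fragmentation_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(), 
     OBJECT_ID(N'dbo.OnlineSearchHistoryNonCompressed'), NULL, NULL, 'LIMITED');

-- Remove the fragmentation caused by the shrink
ALTER INDEX [PK_OnlineSearchHistoryNonCompressed] 
ON [dbo].[OnlineSearchHistoryNonCompressed] REORGANIZE;
GO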

My laptop has 8GB of RAM and a 500GB 7200rpm Seagate Momentus XT hybrid hard drive. It also has a 2.27GHz Intel Core i3 350M processor with two cores plus hyper-threading. The Core i3 does NOT have Turbo Boost.

Here is what the table schema looks like:

-- Table and index schema
CREATE TABLE [dbo].[OnlineSearchHistoryNonCompressed](
    [OnlineSearchHistoryID] [int] IDENTITY(1,1) NOT NULL,
    [SearchTerm] [nvarchar](50) NULL,
    [NumItemsRequested] [int] NULL,
    [NumItemsReturned] [int] NULL,
    [SearchElapsedTime] [int] NULL,
    [SearchDateTime] [smalldatetime] NOT NULL,
    [SearchSPName] [nvarchar](50) NULL,
 CONSTRAINT [PK_OnlineSearchHistoryNonCompressed] PRIMARY KEY CLUSTERED 
(
    [OnlineSearchHistoryID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
 ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, DATA_COMPRESSION = NONE) ON [PRIMARY]
) ON [PRIMARY];
GO
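
Incidentally, if you want to know how much space ROW or PAGE compression is likely to save before you commit to a long index rebuild, you can ask sp_estimate_data_compression_savings, which samples the table and extrapolates. A minimal example against this table:

-- Estimate the space savings from PAGE compression for this table
-- (samples the data, so it is relatively quick even on a large table)
EXEC sp_estimate_data_compression_savings 
     @schema_name = N'dbo', 
     @object_name = N'OnlineSearchHistoryNonCompressed', 
     @index_id = NULL,            -- all indexes on the table
     @partition_number = NULL,    -- all partitions
     @data_compression = N'PAGE'; -- or N'ROW', or N'NONE'
GO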

Backing up a “large” database from the C: drive of a laptop to another directory on the same C: drive is not a best-case scenario, but it does highlight a common situation where disk I/O is the big bottleneck during the backup process. Backup compression compresses the data before it is written out to the backup file(s), so it usually reduces both the elapsed time for the database backup and the disk space required for the backup file(s), at the cost of some added CPU utilization, which is typically in the 10-15% range. Here are the summarized results from doing a full backup of each database, without and with backup compression:

Backup Test                                    Backup Size     Elapsed Time
NoCompressionTest w/o Backup Compression       18.7GB          725 seconds
NoCompressionTest w/ Backup Compression         3.9GB          246 seconds
RowCompressionTest w/o Backup Compression      10.6GB          287 seconds
RowCompressionTest w/ Backup Compression        3.2GB          165 seconds
PageCompressionTest w/o Backup Compression      4.1GB           89 seconds
PageCompressionTest w/ Backup Compression       2.5GB           71 seconds

In each case, we see both a disk space and an elapsed time benefit from using data compression and backup compression together, without taking a prohibitive CPU utilization hit. Of course, the structure of this table works in our favor, with two NVARCHAR(50) columns and several INT columns. Here are the detailed results:

-- Testing Backup Compression with Data Compression
-- Glenn Berry 
-- August 2010
-- http://glennberrysqlperformance.spaces.live.com/
-- Twitter: GlennAlanBerry

-- All three databases have a single table with a clustered index
-- The table has 151 million rows of actual production data
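
-- The "Top Waits" figures shown after each test below were gathered with a
-- query along these lines (a sketch of the familiar sys.dm_os_wait_stats
-- rollup; a production version would also filter out benign system waits)

-- Clear the cumulative wait stats before each test
DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR);
GO

-- After the backup completes, summarize the waits with a running percentage
WITH Waits AS
(SELECT wait_type, wait_time_ms / 1000.0 AS wait_time_s,
        100.0 * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
        ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn
 FROM sys.dm_os_wait_stats)
SELECT W1.wait_type, 
       CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s,
       CAST(W1.pct AS DECIMAL(12, 2)) AS pct,
       CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct
FROM Waits AS W1
INNER JOIN Waits AS W2 ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct
HAVING SUM(W2.pct) - W1.pct < 99 -- stop once 99% of total waits are shown
ORDER BY W1.rn;
GO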


-- Test 1
-- Full backup of NoCompressionTest database without Backup compression
-- (725 seconds, 18.7GB backup file)
BACKUP DATABASE [NoCompressionTest] 
TO  DISK = N'C:\SQLBackups\NoCompressionTest_UncompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'NoCompressionTest-Full Uncompressed Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 1
GO

-- Test 1 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            611.18            44.36    44.36
--ASYNC_IO_COMPLETION     459.14            33.33    77.69
--BACKUPIO                302.12            21.93    99.62

-- Test 2
-- Full backup of NoCompressionTest database with Backup compression
-- (246 seconds, 3.9GB backup file, 4.76 compression ratio for backup) 
BACKUP DATABASE [NoCompressionTest] 
TO  DISK = N'C:\SQLBackups\NoCompressionTest_CompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'NoCompressionTest-Full Compressed Database Backup', SKIP, NOREWIND, NOUNLOAD, COMPRESSION,  STATS = 1
GO

-- Test 2 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            255.81            31.09    31.09
--ASYNC_IO_COMPLETION     245.93            29.89    60.97
--BACKUPIO                232.96            28.31    89.28
--IO_COMPLETION           87.53             10.64    99.92

-- ************* End of NoCompression tests

-- Test 3
-- Full backup of RowCompressionTest database without Backup compression
-- (287 seconds, 10.6GB backup file)
BACKUP DATABASE [RowCompressionTest] 
TO  DISK = N'C:\SQLBackups\RowCompressionTest_UncompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'RowCompressionTest-Full Uncompressed Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 1
GO

-- Test 3 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            368.32            42.92    42.92
--ASYNC_IO_COMPLETION     285.90            33.31    76.23
--BACKUPIO                200.66            23.38    99.61

-- Test 4
-- Full backup of RowCompressionTest database with Backup compression
-- (165 seconds, 3.2GB backup file, 3.28 compression ratio for backup)
BACKUP DATABASE [RowCompressionTest] 
TO  DISK = N'C:\SQLBackups\RowCompressionTest_CompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'RowCompressionTest-Full Compressed Database Backup', SKIP, NOREWIND, NOUNLOAD, COMPRESSION,  STATS = 1
GO

-- Test 4 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            506.86            51.12    51.12
--ASYNC_IO_COMPLETION     324.98            32.78    83.90
--BACKUPIO                140.87            14.21    98.10
--IO_COMPLETION           17.87              1.80    99.91

-- ************* End of RowCompression tests

-- Test 5
-- Full backup of PageCompressionTest database without Backup compression
-- (89 seconds, 4.1GB backup file)
BACKUP DATABASE [PageCompressionTest] 
TO  DISK = N'C:\SQLBackups\PageCompressionTest_UncompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'PageCompressionTest-Full Uncompressed Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 1
GO

-- Test 5 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            121.53            45.03    45.03
--ASYNC_IO_COMPLETION      88.46            32.78    77.80
--BACKUPIO                 54.34            20.13    97.94
--IO_COMPLETION             5.24             1.94    99.88


-- Test 6
-- Full backup of PageCompressionTest database with Backup compression
-- (71 seconds, 2.5GB backup file, 1.64 compression ratio for backup)
BACKUP DATABASE [PageCompressionTest] 
TO  DISK = N'C:\SQLBackups\PageCompressionTest_CompressedBackup.bak' WITH NOFORMAT, INIT,  
NAME = N'PageCompressionTest-Full Compressed Database Backup', SKIP, NOREWIND, NOUNLOAD, COMPRESSION,  STATS = 1
GO

-- Test 6 Top Waits
--wait_type                wait_time_s        pct        running_pct
--BACKUPBUFFER            298.00            54.63    54.63
--ASYNC_IO_COMPLETION     179.12            32.84    87.47
--BACKUPIO                 58.79            10.78    98.25
--IO_COMPLETION             8.44             1.55    99.80
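
-- The backup compression ratios quoted in the comments above can be
-- calculated from the backup history that SQL Server keeps in msdb;
-- a query along these lines (a sketch) does the arithmetic:
SELECT database_name, backup_start_date, 
       backup_size / 1048576.0 AS [backup_size_mb],
       compressed_backup_size / 1048576.0 AS [compressed_size_mb],
       backup_size / compressed_backup_size AS [compression_ratio]
FROM msdb.dbo.backupset
WHERE compressed_backup_size > 0
ORDER BY backup_start_date DESC;
GO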

Thanks to Moore’s Law, we increasingly have excess CPU capacity to burn, which can be used to reduce I/O pressure on our systems, whether that means a single-drive laptop or a large production database server. Powerful new CPUs are much less expensive than added I/O capacity in most cases.
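
If you decide you want backup compression everywhere, you do not have to add WITH COMPRESSION to every backup job; SQL Server 2008 and 2008 R2 let you make it the instance-wide default:

-- Make backup compression the default for this instance
-- (an individual BACKUP command can still override this with NO_COMPRESSION)
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;
GO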
