Gradually Deleting Data in SQL Server

If you have a very large table in SQL Server from which you periodically need to delete tens of millions of rows of data, there are several ways to do it.

If you have a maintenance window (or your database is not required to be available 24 x 7 x 365), you can (and probably should) just delete all of the rows in one shot, using a set-based operation. This is the quickest way to delete a large number of rows, but you will probably end up with lock escalation to a table lock, which essentially makes the table unavailable during the delete.
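
For reference, a one-shot, set-based purge of the same hypothetical dbo.BigLoggingTable used in the sample script later in this post would look something like this:

-- One-shot, set-based delete (only practical during a maintenance window)
DELETE
FROM dbo.BigLoggingTable
WHERE TransactionId < 382989078;    -- everything below the high-water mark, in a single transaction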

Another issue to consider is whether you have transactional replication on that table, and/or database mirroring in place on the database. Deleting a large number of rows from a table generates a lot of log activity, which may cause transactional replication or database mirroring to fall behind. This of course depends on your hardware and network infrastructure. You also want to keep an eye on your transaction log, to make sure it is not filling up and triggering auto-grow.
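
For example, two quick ways to keep an eye on the log while a large delete is running (both work on SQL Server 2005 and later):

-- Log size and percentage used for every database on the instance
DBCC SQLPERF(LOGSPACE);

-- What, if anything, is currently preventing log truncation in this database
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();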

A safer, but much more time-consuming, way to delete millions of rows is to use some sort of looping mechanism, where you gradually delete a fairly small number of rows in a loop, slowly nibbling away at the table. This will take much longer than a set-based operation, but, if done properly, will not cause concurrency problems, and will not overwhelm transactional replication or database mirroring.

At any rate, I recently faced a situation like this, so I decided to show one method that deals with it pretty easily. In this case, we want to delete every row that has a TransactionId lower than a certain number. We are going to delete 500 qualifying rows in each delete (TOP without an ORDER BY, so they are arbitrary rows below the high-water mark rather than truly random), and loop 5000 times, with a slight delay between each delete. This will delete 2.5 million rows each time the query is run. You can obviously adjust these numbers and the delay time so that it works best in your environment. You could also wrap this into a stored procedure.

-- Gradual Delete Sample
-- Glenn Berry 
-- August 2011
-- https://sqlserverperformance.wordpress.com/
-- Twitter: GlennAlanBerry

SET NOCOUNT ON;

-- Check space used by table before we begin
EXEC sp_spaceused N'dbo.BigLoggingTable';

-- Declare local variables
DECLARE @NumberOfLoops AS int;
SET @NumberOfLoops = 5000;

DECLARE @CurrentLoop AS int;
SET @CurrentLoop = 0;

DECLARE @DeleteSize bigint;
SET @DeleteSize = 500;

DECLARE @HighWaterMark bigint;
SET @HighWaterMark = 382989078;

WHILE @CurrentLoop < @NumberOfLoops
    BEGIN
        -- Just delete any xxx rows that are below the HighWaterMark
        DELETE 
        FROM dbo.BigLoggingTable
        WHERE TransactionId IN 
            (SELECT TOP(@DeleteSize) TransactionId 
             FROM dbo.BigLoggingTable WITH (NOLOCK)
             WHERE TransactionId < @HighWaterMark);
             
        WAITFOR DELAY '00:00:00:50';    -- brief pause between each delete batch
          
        SET @CurrentLoop = @CurrentLoop + 1;
    END

-- Check space used by table after we are done    
EXEC sp_spaceused N'dbo.BigLoggingTable';

23 Responses to Gradually Deleting Data in SQL Server

  1. Bender says:

    I’ve had to do this on many very large tables before. I’ve used this process before and while it works very well, I’ve found the deeper it goes the slower it gets. I manually updated the statistics on every, say, 200th iteration and that seemed to keep it more consistent. I’m guessing the WAITFOR is probably giving SQL time to do the same. GUESSING.
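
    A minimal sketch of folding that idea into the loop from the post (200 is just the interval mentioned above, and the table name comes from Glenn's example):

    -- Inside the WHILE loop, after the DELETE statement
    IF @CurrentLoop > 0 AND @CurrentLoop % 200 = 0
        BEGIN
            -- Refresh statistics so the delete plan stays reasonable as the table shrinks
            UPDATE STATISTICS dbo.BigLoggingTable;
        END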

    When it comes to very, very large tables, which in my case are logging tables (I don’t work with good citizens who clean up after themselves), I usually just bulk copy the data I want out, truncate the table and BULK INSERT the data back in. Again, probably not the best solution for every situation, but it’s the fastest one I’ve come up with.
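
    A rough T-SQL approximation of that copy-out / truncate / reload approach (staying inside the database rather than using bcp, with a hypothetical dbo.LoggingTable and LogDate retention rule):

    -- Copy the rows you want to keep to a holding table
    SELECT *
    INTO dbo.LoggingTable_Keep
    FROM dbo.LoggingTable
    WHERE LogDate >= DATEADD(dd, -30, GETDATE());    -- hypothetical 30-day retention

    -- Remove everything with minimal logging
    TRUNCATE TABLE dbo.LoggingTable;

    -- Put the kept rows back (an explicit column list and SET IDENTITY_INSERT
    -- would be needed if the table has an identity column)
    INSERT INTO dbo.LoggingTable WITH (TABLOCK)
    SELECT * FROM dbo.LoggingTable_Keep;

    DROP TABLE dbo.LoggingTable_Keep;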

    B

  2. I’ve had to do this on many systems too. Worth noting: table partitioning helps this scenario quite a lot, if it’s an option. EE only, and the underlying tables have to be set up for partitioning, so it doesn’t always work – but when it does, you can just switch and truncate.
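
    A sketch of the switch-and-truncate pattern, assuming a hypothetical dbo.LoggingTable partitioned by date, with an empty staging table that matches its schema and sits on the same filegroup:

    -- Move the oldest partition out of the live table (a metadata-only operation)
    ALTER TABLE dbo.LoggingTable
    SWITCH PARTITION 1 TO dbo.LoggingTable_Staging;

    -- The old rows can then be removed almost instantly
    TRUNCATE TABLE dbo.LoggingTable_Staging;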

    Also, for non-partitioned tables, DELETE TOP (n) … works within the same sort of looping structure. I have some of these that basically do: loop / delete top (n) … where … / while @@rowcount > 0, to keep on deleting until no rows qualify any longer.
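
    In code, that loop might look something like this (dbo.LoggingTable and the WHERE clause are placeholders):

    DECLARE @RowsDeleted int;
    SET @RowsDeleted = 1;

    WHILE @RowsDeleted > 0
        BEGIN
            -- Delete one batch of qualifying rows
            DELETE TOP (500)
            FROM dbo.LoggingTable
            WHERE LogDate < '20110101';    -- hypothetical purge criterion

            SET @RowsDeleted = @@ROWCOUNT;
        END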

  3. Manjot says:

    Hi,
    I always delete huge amounts of data by dividing them into smaller chunks, as you wrote above, but the last time I was running this on a server, it still tended to grow the transaction log file (the database was in the simple recovery model). I checked sys.dm_os_waiting_tasks and saw that the checkpoint process had been waiting for a long time. So I just waited until the checkpoint wait disappeared and then re-ran the delete process, and concluded that the checkpoint needed to catch up. How would you address this issue without waiting?
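
    One thing that is sometimes done in that situation (just a hedged sketch, not something from Glenn's script) is to issue a manual CHECKPOINT every so many batches, since log truncation in the SIMPLE recovery model happens at checkpoint:

    -- Inside the delete loop, every 100th iteration (interval is arbitrary)
    IF @CurrentLoop % 100 = 0
        BEGIN
            CHECKPOINT;    -- lets the inactive portion of the log be reused sooner
        END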

  4. Jason Crider says:

    Thanks for the very informative post, Glenn. I’m really enjoying all the great info you’re putting out there. Any new word on getting your book on the Kindle?

    Your post reminded me about a situation that happened to me and I mentioned you in my blog post @ http://www.jasoncrider.com/blog/archives/2011/08/16/deleting-data-in-small-chunks-on-sql-server/.

    Keep up the more than excellent work.

  5. Pingback: Something for the Weekend – SQL Server Links 19/08/11

  6. Randy says:

    I’ve done something similar using a try / catch to keep the loop going. I also set deadlock priority low and was able to run it even during the business day. Can take a very long time of course, but I always want my process to “die” first. At least it allows me to stop doing stuff like this every single weekend. 🙂
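
    A minimal sketch of that idea layered onto the loop from the post (reusing Glenn's table and variables; the catch block here simply waits and moves on, which is only one possible choice):

    SET DEADLOCK_PRIORITY LOW;    -- if a deadlock occurs, this session is chosen as the victim

    WHILE @CurrentLoop < @NumberOfLoops
        BEGIN
            BEGIN TRY
                DELETE
                FROM dbo.BigLoggingTable
                WHERE TransactionId IN
                    (SELECT TOP(@DeleteSize) TransactionId
                     FROM dbo.BigLoggingTable
                     WHERE TransactionId < @HighWaterMark);
            END TRY
            BEGIN CATCH
                -- Deadlock victim (error 1205) or other failure: pause briefly and keep going
                WAITFOR DELAY '00:00:01';
            END CATCH

            SET @CurrentLoop = @CurrentLoop + 1;
        END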

  7. Charles Kincaid says:

    Then there is the issue of the clustered index. Likely whatever key you choose to purge your table on won’t match the clustered key on that table. If it does, then you might dodge the “unused is taking up my space” syndrome. If any row on the leaf of a clustered index is good, then the whole page stays. If your clustered index is some IDENTITY-based thing, then you wind up with index pages more full of holes than a government promise.

    Eventually you will have to resort to rebuilding the clustered index. This technique just pushes that eventuality farther into the future.
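
    For reference, checking how ragged the clustered index has become, and rebuilding it when needed, can be done along these lines (using the table from Glenn's example; ONLINE = ON requires Enterprise Edition):

    -- Fragmentation and page fullness of the clustered index (index_id = 1)
    SELECT index_id, avg_fragmentation_in_percent, avg_page_space_used_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.BigLoggingTable'), 1, NULL, 'SAMPLED');

    -- Rebuild it when it gets bad enough
    ALTER INDEX ALL ON dbo.BigLoggingTable REBUILD WITH (ONLINE = ON);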

  8. Vanessa says:

    Hey, nice article.
    Can we just take advantage of some SQL management software to help us do the gradual data deletion work? Just saw an article about SQL restore and backup with other features: http://www.todo-backup.com/products/features/sql-backup-and-restore.htm

  9. r_guy says:

    If you are deleting the entire table, truncate. If you are keeping a lot less rows than deleting, copy to temporary table, drop and recreate table if feasible or truncate table and copy rows back. Loops are a last alternative.

    • Glenn Berry says:

      Yes, TRUNCATE is much better if you are deleting the entire table. Copying the table to another table is not such a good idea if you have hundreds of millions of rows and you want to delete, say, 10 million rows. Remember, the whole point here is to delete lots of rows while the table is online and available 24 x 7.

  10. Slicky says:

    Hi Glenn. I’ve heard that sp_spaceused is an unreliable method for calculating a table’s size. Reading about it on a blog from an expert like you, I’m thinking that might not be true. Can you confirm?

  11. shaun5stu says:

    I had to develop something similar to this for use on a group of five financial services tables that had to be up 24/7. Locking was a huge issue for me – locks of even 10 seconds were not tolerated. I found a great explanation of my options at http://www.sqlsoldier.com/wp/sqlserver/sqluvldbweekarchivingandpurgingdata.

    Merrill Aldrich: Note that using TOP will not reduce locks. This is mentioned in the above referenced blog post and also confirmed by my experience. Rather than using DELETE TOP on each of the tables, I did a SELECT TOP to insert the IDs of the records I wanted to delete into a temp table, then deleted based on joining to that table.
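
    A sketch of that pattern, using the table and column names from Glenn's example:

    DECLARE @HighWaterMark bigint;
    SET @HighWaterMark = 382989078;

    -- Capture the keys for one batch in a temp table
    SELECT TOP (500) TransactionId
    INTO #DeleteBatch
    FROM dbo.BigLoggingTable
    WHERE TransactionId < @HighWaterMark;

    -- Delete by joining to the captured keys
    DELETE blt
    FROM dbo.BigLoggingTable AS blt
    INNER JOIN #DeleteBatch AS db
        ON blt.TransactionId = db.TransactionId;

    DROP TABLE #DeleteBatch;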

    • JohnCampbell says:

      That is the way that I have been doing large deletes for quite a few years. I build a temp table of the key values for the records that I am going to delete, select the top xxx number of records from the temp table, delete the working records and then the temp table records, wait x.x seconds, and do it again within a while loop. The number of records to be deleted at one time needs to be adjusted so that the bite is big enough to be meaningful, but does not take an excessive amount of time, since this is locking the table that you are working on. The x.x seconds for the wait is there to allow all of the backed-up records to process; I generally start with 1 second and go up from there, monitoring the locks in the database. I am probably taking smaller bites for each delete than others, but generally the users end up asking me when I am going to start on the delete operation as I am already finishing up. Getting the users to agree on what to delete is the biggest problem in my experience 🙂

  12. Great Post Glenn. I outlined a similar technique, which I call ‘Nibbling Deletes’ in the following video (for anyone who’d like to see a bit more background/info on how this works):
    http://www.sqlservervideos.com/video/nibbling-deletes/

  13. K Lindner says:

    My spin on it doesn’t implement a WAITFOR but I like the idea! I built mine to track execution time so that I could add an abort for modifications taking longer than n seconds.

    If you’re using an update, you’ll have to build exclusionary logic into your where clause to prevent endless looping (e.g., set x = 1 where x <> 1).

    declare @iRowCount bigint,
        @dtThen datetime,
        @iElapsed bigint,
        @iElapsedThreshold bigint,
        @iTotalRows bigint

    select @iRowCount = 10000,
        @iElapsedThreshold = 30, -- max seconds for a single batch to complete
        @iTotalRows = 0

    set rowcount 10000 -- number of rows to modify in each iteration
    set nocount on

    while @iRowCount > 0
    begin
        begin tran batchupdate
        select @dtThen = getdate()

        /*
        ** your DML here
        */

        select @iRowCount = @@ROWCOUNT
        commit tran batchupdate
        select @iTotalRows = @iTotalRows + @iRowCount
        select @iElapsed = datediff(ss, @dtThen, getdate())

        print 'Total Rows Affected: ' + convert(varchar(30), @iTotalRows) + ' Row(s) Affected: ' + convert(varchar(30), @iRowCount) + ' Elapsed: ' + convert(varchar(30), @iElapsed)

        if @iElapsed > @iElapsedThreshold
        begin
            print 'Aborting due to threshold violation.'
            select @iRowCount = 0
            break
        end
    end

    set nocount off
    set rowcount 0
    go
