Notes on SQL

Random articles from a puzzled DBA

No.

Well, that was easy – I should write more articles this succinctly. Although the real answer is, of course – it depends.

Background

I frequently come across the assertion that the Bulk Insert method is always minimally logged. It certainly can be, but only when certain conditions are met.

I’ve decided to demonstrate the various conditions, with the assistance of fn_dblog – something I covered in the first of an upcoming series of articles about that function.

Introduction

Bulk Insert is one of several commands that can be minimally logged. However, to be minimally logged there are certain criteria that must be met.

What is Minimal Logging?

At a basic level, minimal logging is where the log file doesn’t record details of individual changes to data within a table but merely records the ‘extent’ that was impacted by the minimally logged operation. An extent is 8 contiguous pages of data – 64 KB.

Under the Full recovery model, a transaction is recorded within the log file, the change is then made to the data pages in memory, and eventually a checkpoint operation is executed, which ‘hardens’ the information to disk.

If a crash should occur before the information is hardened, it is still within the log file (provided your backup schedule is sufficient, or you can execute a ‘tail of the log’ backup) and therefore a ‘roll forward’ operation can redo the commands when restoring.

Or, you can restore a log file to a specific point in time to remove an update that was actioned but not required.

When a minimally logged operation is completed, it is hardened immediately, via an ‘eager writer’ process – it does not reside in memory first. The log file only contains the details of the extents that were updated. Therefore, less information is written to the log file but, oddly enough, the log backup is larger – because it has to write the entire extents to the backup, not just the parts that were updated.

Because a log backup during minimal logging does not contain the ‘granular’ details (just the extents), it is not possible to restore to a point in time with that backup file. Therefore a database should be set to Bulk Logged for the shortest time possible, if you believe that you will need point-in-time recovery.
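
Whether a particular log backup contains minimally logged changes (and therefore cannot be used for point-in-time recovery) is recorded in the backup history in msdb. The following is just a quick sketch of checking that flag, written against the BulkTest database used later in this article:

SELECT  bs.database_name,
        bs.backup_start_date,
        bs.type,                   -- 'L' = log backup
        bs.has_bulk_logged_data    -- 1 = contains minimally logged changes
FROM    msdb.dbo.backupset AS bs
WHERE   bs.database_name = N'BulkTest'
      AND bs.type = 'L'
ORDER BY bs.backup_start_date DESC;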

Requirement for Minimal Logging

First of all, the database needs to be in the Bulk Logged recovery model. This is not possible if the database is part of an Availability Group (AG), as being part of an AG requires that a database is in Full recovery at all times.

The table itself cannot be part of a replication process, so this excludes transactional replication and suchlike, as well as AG.

The TABLOCK hint should be used, or trace flag 610 should be set (the trace flag has a slightly different set of requirements to TABLOCK).

If the target table already has data and a clustered index, then minimal logging is not possible – it will be fully logged.
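
Before running a large import, it can be worth checking these conditions up front. The queries below are a rough sketch of such a check, written against the BulkTest database and dbo.TestInsert table that are created in the examples which follow:

-- Recovery model, publication status and AG membership of the database
SELECT  d.name,
        d.recovery_model_desc,   -- needs to be BULK_LOGGED (or SIMPLE)
        d.is_published,          -- 1 = published for replication
        d.replica_id             -- non-NULL = part of an Availability Group
FROM    sys.databases AS d
WHERE   d.name = N'BulkTest';

-- Does the target table already have a clustered index, and does it hold rows?
SELECT  i.type_desc,
        SUM(p.rows) AS [Rows]
FROM    sys.indexes AS i
JOIN    sys.partitions AS p
    ON  p.object_id = i.object_id
    AND p.index_id = i.index_id
WHERE   i.object_id = OBJECT_ID(N'dbo.TestInsert')
GROUP BY i.type_desc;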

Examples

Preparation

Firstly, we create the test database and a simple test table.

Listing 1: Create the test database and table

CREATE DATABASE BulkTest;
GO

USE BulkTest;
GO

CREATE TABLE dbo.TestInsert
(
    TestData NVARCHAR(300) NULL
);

GO

Next, I used Redgate’s SQL Data Generator to create a CSV file with 1,001 names, but any CSV data will do, as long as the data doesn’t exceed the size of the column.
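
If you don’t have a data generator to hand, something like the following will produce 1,001 rows of sample data that can be saved as a CSV file (for example via the SSMS ‘Save Results As’ option, or bcp). The generated names are placeholders of my own and are not the data used for the figures in this article:

-- Generate 1,001 sample names; save the output as C:\Temp\NameList.csv
WITH Numbers AS
(
    SELECT TOP (1001)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
SELECT CONCAT(N'TestName_', n) AS TestData
FROM Numbers;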

Next, I ensure that the database is in Full recovery. Whenever you change the recovery model of a database, the change will not fully come into effect until a full backup has been executed – so follow this recovery model change with a full backup.

Listing 2: Set Full Recovery and backup

USE [master]
GO
ALTER DATABASE [BulkTest] SET RECOVERY FULL WITH NO_WAIT
GO

BACKUP DATABASE [BulkTest]
TO  DISK = N'C:\Temp\BulkTest.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Full Database Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

Next, I switch off auto create statistics in an effort to keep the ‘clutter’ within the log file down, run a checkpoint, and execute fn_dblog() to see what we have initially.

Listing 3: First look before we start

USE BulkTest;
GO

ALTER DATABASE BulkTest SET AUTO_CREATE_STATISTICS OFF;
GO

CHECKPOINT;
GO

--Only the checkpoint is in there now
SELECT *
FROM fn_dblog(NULL, NULL);
GO

Not much in the log file at this point. The column we’re interested in is ‘Log Record Length’, which I will be totalling after each execution, to show how much data is within the log file.

Figure 1: Result from Listing 3

Starting Point
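
As an aside, the row count and total size can be captured in a single statement, and grouping by Operation makes it easier to see which types of log record account for the space. This grouping is my own addition and is not used in the listings that follow:

SELECT  Operation,
        COUNT(*) AS [No Of Rows],
        SUM([Log Record Length]) AS [Total Size]
FROM    fn_dblog(NULL, NULL)
GROUP BY Operation
ORDER BY [Total Size] DESC;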

First Example – in Full Recovery Mode

Import the data, then check the number of rows within the log file and their total size.

Listing 4: Import whilst in Full Recovery

BULK INSERT dbo.TestInsert
FROM 'C:\Temp\NameList.csv'
WITH
(
    ROWTERMINATOR = '\n',
    TABLOCK
);
GO

SELECT COUNT(*) AS [No Of Rows]
FROM fn_dblog(NULL, NULL);
GO

SELECT SUM([Log Record Length]) AS [Total Size]
FROM fn_dblog(NULL, NULL);
GO

So, we now have a set of starting figures to compare against.

Figure 2: Results from Listing 4

Import in Full Recovery

Second Example – in Bulk Logged Mode

Firstly, change the recovery model of the database and execute a full backup (to complete the change of recovery model). In addition, we need to execute a log backup so that a checkpoint can clear the log file later.

Listing 5: Change to Bulk Logged recovery

USE [master]
GO
ALTER DATABASE [BulkTest] SET RECOVERY BULK_LOGGED WITH NO_WAIT
GO

BACKUP DATABASE [BulkTest]
TO  DISK = N'C:\Temp\BulkTest.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Full Database Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

BACKUP LOG [BulkTest]
TO  DISK = N'C:\Temp\BulkTest_Log.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Log Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

Now we truncate the target table, execute a checkpoint and run the Bulk Insert again.

Listing 6: Import with Bulk Logged recovery model

USE BulkTest;
GO

TRUNCATE TABLE dbo.TestInsert;
GO

CHECKPOINT;
GO


BULK INSERT dbo.TestInsert
FROM 'C:\Temp\NameList.csv'
WITH
(
    ROWTERMINATOR = '\n',
    TABLOCK
);
GO

SELECT COUNT(*) AS [No Of Rows]
FROM fn_dblog(NULL, NULL);
GO

SELECT SUM([Log Record Length]) AS [Total Size]
FROM fn_dblog(NULL, NULL);
GO

The number of rows is slightly larger but can be ignored for this demonstration – such is the fun to be had when a database is not in Simple recovery. The important part to note is that the volume of data is substantially less.

Figure 3: Result from Listing 6

Reduced size of data
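
As a rough cross-check (not part of the original demonstration), the log records that belong to the test table can be broken down by operation. Under minimal logging these are dominated by page allocation records such as LOP_FORMAT_PAGE, rather than one record per inserted row:

SELECT  Operation,
        COUNT(*) AS [No Of Records]
FROM    fn_dblog(NULL, NULL)
WHERE   AllocUnitName = N'dbo.TestInsert'
GROUP BY Operation
ORDER BY [No Of Records] DESC;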

Third Example – Bulk Logged Without TABLOCK

Now, truncate the table, back up the log file and clear it with a checkpoint, then start again. This time, the TABLOCK hint has been removed. Trace flag 610 is not set.

Listing 7: Remove TABLOCK

USE BulkTest;
GO

TRUNCATE TABLE dbo.TestInsert;
GO


BACKUP LOG [BulkTest]
TO  DISK = N'C:\Temp\BulkTest_Log.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Log Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

CHECKPOINT;
GO


BULK INSERT dbo.TestInsert
FROM 'C:\Temp\NameList.csv'
WITH
(
    ROWTERMINATOR = '\n'
);
GO

SELECT COUNT(*) AS [No Of Rows]
FROM fn_dblog(NULL, NULL);
GO

SELECT SUM([Log Record Length]) AS [Total Size]
FROM fn_dblog(NULL, NULL);
GO

The size of the data in the log file is substantially larger than the data from Listing 6.

Figure 4: Result from Listing 7

Much more data without TABLOCK
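
To confirm that each row was individually logged this time, count the LOP_INSERT_ROWS records against the table’s allocation unit. Again, this is a quick sketch of my own rather than part of the original listings:

SELECT  COUNT(*) AS [Individually Logged Inserts]
FROM    fn_dblog(NULL, NULL)
WHERE   Operation = N'LOP_INSERT_ROWS'
      AND AllocUnitName = N'dbo.TestInsert';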

Fourth Example – Bulk Logged with Existing Data

Effectively, this repeats the execution of Listing 6, but resets the log and target table first.

Listing 8: Clear the table and log and re-run Bulk Insert with TABLOCK

USE BulkTest;
GO

TRUNCATE TABLE dbo.TestInsert;
GO

BACKUP LOG [BulkTest]
TO  DISK = N'C:\Temp\BulkTest_Log.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Log Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

CHECKPOINT;
GO

BULK INSERT dbo.TestInsert
FROM 'C:\Temp\NameList.csv'
WITH
(
    ROWTERMINATOR = '\n',
    TABLOCK
);
GO

SELECT COUNT(*) AS [No Of Rows]
FROM fn_dblog(NULL, NULL);
GO

SELECT SUM([Log Record Length]) AS [Total Size]
FROM fn_dblog(NULL, NULL);
GO

This again returns us to a reduced amount of data within the log file.

Figure 5: Result from Listing 8

Back to the Minimally Logged Insert

Now we’ll back up the log and run a checkpoint, but leave the data in the table.

Listing 9: Append data

USE BulkTest;
GO


BACKUP LOG [BulkTest]
TO  DISK = N'C:\Temp\BulkTest_Log.bak'
WITH NOFORMAT,
     NOINIT,
     NAME = N'BulkTest-Log Backup',
     SKIP,
     NOREWIND,
     NOUNLOAD,
     STATS = 10;
GO

CHECKPOINT;
GO


BULK INSERT dbo.TestInsert
FROM 'C:\Temp\NameList.csv'
WITH
(
    ROWTERMINATOR = '\n',
    TABLOCK
);
GO

SELECT COUNT(*) AS [No Of Rows]
FROM fn_dblog(NULL, NULL);
GO

SELECT SUM([Log Record Length]) AS [Total Size]
FROM fn_dblog(NULL, NULL);
GO

SELECT COUNT(*) AS [No of Rows in TestInsert]
FROM dbo.TestInsert;
GO

The output shows a small amount of data in the log file, and the table has twice as many rows – so the data was appended.

Figure 6: Results from Listing 9

Appended rows, minimally logged

Fifth Example – Existing Data and a Clustered Index

Create a clustered index on the test table.

Listing 10: Create clustered index

CREATE CLUSTERED INDEX idx_TestInsert ON dbo.TestInsert (TestData);
GO

Now, repeat the code in Listing 9.

Figure 7: Results from Listing 9 after adding a clustered index

Substantially more data in the Log file
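
With the clustered index in place the allocation unit name changes (it now includes the index name), so if you want to repeat the per-operation breakdown, match on the table name prefix instead. This is a sketch assuming the index created in Listing 10:

SELECT  AllocUnitName,
        Operation,
        COUNT(*) AS [No Of Records]
FROM    fn_dblog(NULL, NULL)
WHERE   AllocUnitName LIKE N'dbo.TestInsert%'
GROUP BY AllocUnitName, Operation
ORDER BY [No Of Records] DESC;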

Conclusion

Yes, a Bulk Insert command can be minimally logged, as long as the right conditions are met.

References

Gail Shaw’s outstanding article on bulk-logged recovery

Trace Flags

Technet article on minimal logging

SSC article on minimal logging with a reference grid

Prerequisites for Minimal Logging in Bulk Import
