Replication Logging

20 Nov

At SQLPASS 2015 I attended a presentation by Kendal Van Dyke, in which he told us that it was possible to log the actions of Replication to an output file.

This was news to me, so I just had to have a closer look.

The Setup

This will use the same Publication that I created in my earlier articles (Subscription and Distributor), which has been running on my test instances ever since.

The Commands

Replication is controlled via Agent Jobs, and this is where the output parameters are used. Every job involved in Replication (Log Reader, Queue Reader and Distributor) can make use of them.

-Output [Filename] is the name of the text file that the output will be written to. If the file does not exist then it will be created, otherwise it will be appended to.

-OutputVerboseLevel [0|1|2] is the level of output required, with 0 being the lowest and 2 the highest.


Implementing this is quite easy – locate the Agent jobs for the Publication of interest and add the parameters to the end of the job’s command line.
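For example, a Log Reader Agent job step that currently ends with -Continuous might be amended along these lines (the server, database and file names here are placeholders, not taken from my setup):

-Publisher [MyPublisher] -PublisherDB [MyPublishedDB] -Distributor [MyDistributor] -DistributorSecurityMode 1 -Continuous -Output C:\ReplLogs\LogReader.log -OutputVerboseLevel 2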

The appropriate Log Reader Agent and Queue Reader Agent can be found via Replication Monitor. Open Replication Monitor from any of the Publisher, Distributor or Subscriber servers and locate the ‘Agents’ tab for the appropriate Publication. From the ‘Agent types’ drop-down select either ‘Log Reader Agent’ or ‘Queue Reader Agent’, then right-click on the entry and select ‘Properties’.


This shows the same dialog that you would see by selecting the job under SSMS ‘SQL Server Agent/Jobs’ and choosing ‘Properties’, but it removes any doubt that you have located the correct Log Reader Agent.

Of course, if you know the job involved then you can just locate each from the SSMS ‘SQL Server Agent/Jobs’ list on the appropriate servers and update them directly. In my Pull Subscription I have to locate the Distributor Job myself within SSMS – it doesn’t appear as an option via Replication Monitor.
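If you would rather check the existing command lines with T-SQL than click through SSMS, a query along these lines (a sketch, run against msdb on the relevant server) lists the replication agent job steps and shows whether -Output has already been added:

SELECT	j.name		AS Job_Name,
	c.name		AS Category,
	s.step_name,
	s.command
FROM	msdb.dbo.sysjobs AS j
JOIN	msdb.dbo.syscategories AS c ON c.category_id = j.category_id
JOIN	msdb.dbo.sysjobsteps AS s ON s.job_id = j.job_id
WHERE	c.name IN ('REPL-LogReader', 'REPL-QueueReader', 'REPL-Distribution')
ORDER BY j.name, s.step_id;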

Once the parameters have been added the jobs need to be stopped and started again, so that the new command line is read.

The Output

With OutputVerboseLevel set to 0 only basic details are written to the text file – various settings of the publication, server names, job names and suchlike. The Distributor produces these details every time it runs, so even this can be quite a large file given enough time. This level is useful for capturing error messages along with basic history details.

OutputVerboseLevel of 1 also shows statistics, in addition to connection details and other information. This of course increases the output from any job using this setting.

OutputVerboseLevel of 2 is the most detailed level of output, showing the stored procedure calls, batch details, connections, disconnections and so on. Use this when you need a great deal of detail for a short period of time – it quickly fills drive space.

Monitoring Versioning

11 Sep

Optimistic locking requires the use of row versioning, where a copy of the data about to be modified is stored in TempDB.
The advantage of this approach is that the number of locks is reduced, as are the opportunities for blocking.

The downside is that the resources required for data modifications are increased, because a version of the data must be created before modification and pointers need to be generated to ensure the correct versions are in use. Retrieving the correct version also incurs an additional overhead, because the version store is a linked list and a number of rows may need to be traversed before the required data is found.

Because TempDB contains the row versions for all databases on an instance it needs to have enough space to accommodate these. Should TempDB run out of space then an attempt will be made to shrink the version store. This can result in error 3967, where a transaction that has yet to generate a version has been rolled back, or error 3966 where a transaction cannot read the row version. Write operations will continue, but error 3959 will be written to the error log because the write could not generate a row version.

So it really pays to ensure there is enough space for TempDB.

There are two ways to see the impact of the version store – Performance Monitor counters and DMVs.

Basic Example
In this example I’m using three perfmon counters, all located under the instance of the appropriate SQL server –
Transactions\Version Generation rate (KB/s),
Transactions\Version cleanup rate (KB/s) and
Transactions\Version store size (KB).

On the instance used in this example the MSDB database has row-versioning enabled because it has snapshot isolation set. Therefore there is always some background activity in the version store.


Approximately every minute an automatic cleanup process removes entries from the version store that are no longer required; this activity is shown by ‘Version cleanup rate (KB/s)’.
Similarly, the two DMVs ‘sys.dm_tran_top_version_generators’ and ‘sys.dm_tran_version_store’ show a small amount of activity within the version store.
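The queries behind those DMV checks are along these lines (a sketch – swap in whichever database you are interested in; the later examples use IsolationLevels):

-- Row versions currently held for one database
SELECT	COUNT(*) AS Version_Rows,
	SUM(record_length_first_part_in_bytes + record_length_second_part_in_bytes) AS Version_Bytes
FROM	sys.dm_tran_version_store
WHERE	database_id = DB_ID('IsolationLevels');

-- The rowsets generating the most version data
SELECT	database_id,
	rowset_id,
	aggregated_record_length_in_bytes
FROM	sys.dm_tran_top_version_generators
WHERE	database_id = DB_ID('IsolationLevels');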


Because SELECT queries don’t generate row versions, executing a simple update against a test table is all that is required to demonstrate the potential impact upon TempDB.

USE IsolationLevels;

UPDATE dbo.Test4
SET Test_Value = 1;

A spike can be seen in the version generation and an increase in the size of the version store.


In addition, there are a number of entries reported by the DMVs, filtered on the database in question.


The cleanup process (the green line in the following screenshot) then removes any entries from the version store that are too old to be of use.


And in this example, the DMVs return no results after the cleanup.


Now I’m going to loop the update and run a select query, which also loops. This will increase the number of versions created and acquired, showing the impact via Perfmon and the DMVs.

The update

WHILE 1 = 1
BEGIN
	UPDATE dbo.Test4
	SET Test_Value = Test_Value + 1
	WHERE Index_Column IN ('A', 'B')
	  AND ID < 100000;

	WAITFOR DELAY '00:00:01';
END

This will run until terminated, generating a large number of versions.

The select will run constantly too, showing the data it is reading.

WHILE 1 = 1
BEGIN
	SELECT *
	FROM dbo.Test4
	WHERE Index_Column IN ('A', 'B')
	  AND ID < 100000;
END


Perfmon shows a gradual increase in the version store size, although the rate of creation is low (due to the WAITFOR DELAY).
When the cleanup executes it does not remove all of the versions, because the SELECT statement is using some of those versions.


In addition, the DMVs return a larger number of rows and take longer to do so. This extra time can be excessive where the version store has become large.



Optimistic locking can reduce the overhead of locking and blocking, but it incurs its own overheads which need to be considered.

Further Reading

Understanding Row Versioning-Based Isolation Levels

Row Versioning Resource Usage

Managing TempDB in SQL Server: TempDB Basics (Version Store: Growth and removing stale row versions)

Transaction Related Dynamic Management Views and Functions

Read Committed Snapshot – Another Optimistic Flavour

26 Aug

In a previous article (here) I described how the isolation level read committed works, with some examples. There is also an optimistic version of this – read committed snapshot, which uses row versions.
The behaviour of read committed and read committed snapshot is similar and repeating the examples used previously will show the similarities and the differences.

First of all, the database needs to be configured to use read committed snapshot.

ALTER DATABASE IsolationLevels
SET READ_COMMITTED_SNAPSHOT ON;

As with the article on read committed, there are 5,000,000 rows with the ‘ID’ column starting from 10 and a value of 1 in ‘Test_Value’ for each row.


This code is executed again and, as soon as it starts, the row with ‘ID’ of 10 is moved to the end of the index from another tab.

UPDATE dbo.Test2
SET id = id + 6000000
WHERE id = 10;

This time the result is slightly different. It shows that the move of the row with ‘ID’ 10 to 6000010 has not affected the first SELECT. This is because it used a snapshot of the data, taken when the statement started (not when the transaction started), so the move had no impact upon this version of the data.


Similarly, after resetting the data, moving the last row in the index to the start of the index does not have the effect it did with read committed isolation, for the same reason – the SELECT processed a snapshot of the data.

UPDATE dbo.Test2
SET id = 9
WHERE id = 5000009;


For the same reason, dirty reads cannot happen with read committed snapshot, because it is using a snapshot of the data taken when the statement started.

A smaller demonstration
In the read committed isolation level article, an UPDATE was executed to cause a lock, thereby blocking a subsequent SELECT.
Trying the same demonstration here results in different behaviour.

This time the table contains 1,000 rows, with the ‘ID’ and ‘Test_Value’ columns matching in value.


Executing an UPDATE in a separate window without a COMMIT/ROLLBACK will lock the selected row.


BEGIN TRANSACTION;

UPDATE dbo.Test2
SET Test_Value = 1000
WHERE id = 50;

And rerunning the query with the two SELECT statements would wait for the release of the lock from the UPDATE, if this were read committed isolation level.


However, with read committed snapshot the row with the ‘ID’ of 50 is unchanged and the SELECT was not blocked, because each statement used a snapshot of the data.

A different demonstration, to show that each SELECT within a transaction uses a different snapshot.
Two identical SELECT queries, separated by a 10 second delay.
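A sketch of that transaction, reconstructed from the description (the column list is assumed):

BEGIN TRANSACTION;

	SELECT ID, Test_Value
	FROM dbo.Test2;

	WAITFOR DELAY '00:00:10';

	SELECT ID, Test_Value
	FROM dbo.Test2;

COMMIT TRANSACTION;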


Before the WAITFOR command completes an UPDATE is executed that will move a row from the start of the index to the end.


UPDATE dbo.Test2
SET ID = 1001
WHERE ID = 10;


Looking at the completed initial transaction, the moved row is shown in the second SELECT but not in the first, so the second SELECT read a new snapshot, showing the moved data.


So, as with read committed isolation, multiple queries within the same transaction may return different results, even when the criteria are identical.

Snapshot Isolation – Let’s Get Optimistic

18 Aug

The previous articles on isolation levels were using pessimistic locking. Two further isolation levels exist that use optimistic locking and are row-versioning based.

Snapshot Isolation
With snapshot isolation level no locks are placed upon the data when it is read and transactions that write data don’t block snapshot isolation transactions from reading data. Instead, when the first statement within a transaction is started, a version of the data it requires is stored in tempdb along with a sequence number. It uses this sequence number to control access to the consistent version of the data for that transaction.
Snapshot isolation, like the serializable isolation level, does not allow dirty reads, lost updates, phantom rows or nonrepeatable reads. However, snapshot isolation does not use locks to achieve this. Instead it has a consistent version of the data, taken at the start of the transaction and if an update to this data clashes with an update performed elsewhere then the transaction is rolled back.
To set snapshot isolation requires setting a database level property.
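For the IsolationLevels database used in these examples, that is an ALTER DATABASE statement along these lines:

ALTER DATABASE IsolationLevels
SET ALLOW_SNAPSHOT_ISOLATION ON;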


This enables the row versioning mechanism within the database. Versioned rows will be stored even if the isolation level specified for the query is pessimistic. However, you still need to specify the snapshot isolation level for the query in order to use versioned data.

Example 1 – updating a row
Using the same table used in the serializable example, transaction 1 will use read committed isolation to read and update the value with one row.
At the same time a snapshot isolation level transaction will select the same row and attempt an update.
In transaction 1
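The statements are along these lines (a sketch based on the description – read committed isolation, updating and then reading the row with ‘ID’ of 9):

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN TRANSACTION;

	UPDATE dbo.Test4
	SET Test_Value = 2
	WHERE ID = 9;

	SELECT ID, Test_Value
	FROM dbo.Test4
	WHERE ID = 9;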


Note the absence of a COMMIT or ROLLBACK at this point, locking this row. It will show the updated value of 2, for the column ‘Test_Value’.
In transaction 2
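Again as a sketch – the same row is selected under snapshot isolation, and this transaction is also left open:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

	SELECT ID, Test_Value
	FROM dbo.Test4
	WHERE ID = 9;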


Using snapshot isolation, this shows the value of 1 for ‘Test_Value’, because at the time this transaction started the update in transaction 1 had not been committed.
If we now COMMIT transaction one and then re-run the SELECT query within transaction 2 the value is unchanged.


This transaction is still reading the data from the version store, to keep the view of this transaction consistent.
Because the row with ‘ID’ of 9 has now been updated elsewhere, an attempt by transaction 2 to update this row will result in an error message and a rollback of the transaction.
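The attempted update from transaction 2 is something like this (the new value is illustrative):

UPDATE dbo.Test4
SET Test_Value = Test_Value + 1
WHERE ID = 9;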


The entire error message is
Msg 3960, Level 16, State 2, Line 1
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table ‘dbo.Test4’ directly or indirectly in database ‘IsolationLevels’ to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

Of course, re-running transaction 2 now will show the updated value and an update will be allowed because nothing else will be in the process of updating it.


Example 2 – inserting a row
Transaction 1 selects the first few rows, transaction 2 inserts a row within the same range that transaction 1 is using and transaction 3 then selects the first few rows.
Transaction one selects all rows where the ‘ID’ column has a value less than 10, and retrieves 6 rows.
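As a sketch, assuming the same dbo.Test4 table as in the first example:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

	SELECT ID, Test_Value
	FROM dbo.Test4
	WHERE ID < 10;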


Transaction 2 inserts a new row, with an ‘ID’ value of 1, which should make it within the range of the transaction 1 query.
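Something along these lines (the full column list for dbo.Test4 is not shown in this article, so the values here are illustrative):

INSERT INTO dbo.Test4 (ID, Index_Column, Test_Value)
VALUES (1, 'A', 1);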


Re-run the SELECT query within transaction 1 and the results are unchanged – it is using the data from the version store.


Repeat this query in transaction 3 and the extra row is also selected, as it is using the latest version of the data.


Unlike the serializable isolation level it has been possible to insert a row within the range that transaction 1 had selected, but transaction 1 will never show this new row. It will keep a consistent version of the data until the transaction has completed or rolled back.

Example 3 – deleting a row
As expected, the behaviour is the same as with the insertion – transaction 1 will read a range of data.


Transaction 2 will delete one of the rows that transaction 1 had selected.
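For example (which row is removed is illustrative – any row in the range transaction 1 read would do):

DELETE FROM dbo.Test4
WHERE ID = 9;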


Repeating the select within transaction 1 will give the same results, even though transaction 2 has since removed a row.


Executing transaction 3 after the deletion will show the latest data.


The Overhead
Setting the database to ALLOW_SNAPSHOT_ISOLATION starts the generation of row versions, even for transactions that do not use snapshot isolation. This command will not complete until all currently executing transactions have completed, so it may take a while to return.
Versioning adds 14 bytes to every row of the database as it is modified, so it is essential to take this additional storage into account.
The versioned data is stored in tempdb, so this can grow by a large amount for databases that are heavily used. In addition, this store is a linked list of related versions, so transactions may have to traverse a long linked list if there are transactions that have been open for a long time. This of course adds to the processing time of a transaction. There is a clean-up process that runs frequently, to clear this list but it may not be able to remove all entries, depending upon how heavily the database is being used.
A range of DMVs and functions are available to monitor the behaviour of versioning within a database – but if the version store is large these can also take a while to process.

Snapshot isolation is not suitable for databases where data updates are frequent, as it has the potential to raise a large number of update conflicts. It is more suited to databases that are mainly read, with data inserted or deleted but few updates.

Serializable – Not One For Sharing

12 Aug

In the previous article it was shown that phantom reads and skipped rows were possible with the repeatable read isolation level.
Serializable is the highest isolation level and will not allow phantom reads or skipped rows.


Repeatable Read – still no guarantees

7 Aug

In my previous article I showed how read committed would allow rows to be updated that it had already processed in a query, because it releases locks as soon as it has processed that row or page (depending upon the type of lock taken). A second query using the same criteria would then show different results from the first.
Repeatable read isolation level doesn’t allow this type of action to occur, because it retains the locks until the transaction has completed.

Read Committed – not getting what you expect

23 Jul

In a previous article I demonstrated how use of READUNCOMMITTED (or the NOLOCK hint) can result in erroneous data, due to page splits. Examples abound of the more common issue with this isolation level, which occur when reading a row that is being updated in another transaction and then rolled back. These so-called ‘dirty reads’ cannot happen with the default isolation level of READ COMMITTED.

However, it is still possible to alter data within a table that is being read with READ COMMITTED, so that the results of that transaction do not reflect the actual state of the data when the transaction has completed.
For this demonstration I’ll create a table that contains an ID column and an integer column, with a clustered index on the ‘ID’ column.

CREATE TABLE [dbo].[Test2](
	[ID]			INT NOT NULL,
	[Test_Value]	INT NOT NULL
);

CREATE CLUSTERED INDEX IX_Test2_ID ON [dbo].[Test2] ([ID]);


For reasons that will become apparent I’ll start the ID from 10, incremented by 1 for each row and store 1 in ‘Test_Value’ against each row.
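One way of loading test data matching that description (a sketch; the original script may well differ) is:

INSERT INTO dbo.Test2 (ID, Test_Value)
SELECT TOP (5000000)
	9 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
	1
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b
CROSS JOIN sys.all_columns AS c;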
With nothing else executing against that table there are 5,000,000 rows and the sum of ‘Test_Value’ is 5,000,000:


The two queries within this transaction are looking at the same data.
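As a sketch, assuming the second query is the row count and SUM described above:

BEGIN TRANSACTION;

	SELECT ID, Test_Value
	FROM dbo.Test2;

	SELECT COUNT(*) AS Row_Count,
	       SUM(Test_Value) AS Total_Value
	FROM dbo.Test2;

COMMIT TRANSACTION;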

Now I will repeat this query, and in a separate tab I will update the row with ID of 10 to move it to the end of the index, as soon as the first query has started:

UPDATE dbo.Test2
SET id = id + 6000000
WHERE id = 10;

The result of this is an extra row in the first SELECT:


And if I scroll to the end of that result, the row that was shown with an ID of 10 is also shown with an ID of 6000010.


So, the first SELECT query has read the same row twice – before and after it was moved. However, the second SELECT query, because it read the data after all of the updating had completed, has returned the correct information.

Resetting the data, it is also possible to ‘lose’ a row, when a row at the end of the index is moved to the start of the index, during the first SELECT.

This time, the second query is:

UPDATE dbo.Test2
SET id = 9
WHERE id = 5000009;


This time, the row with an ID of 9 is not shown in the results, because the update moved it from the end of the index to the beginning of the index after the SELECT had read the beginning of the index but before it had reached the end of the index.


The sum shows the correct results because the update had completed before the second select started.

A Smaller Demonstration

In the examples shown there were five million rows, but what about smaller tables?

Consider a far smaller table, which is accessed by three processes. One is simply reading the whole table (with READ COMMITTED Isolation Level). At the same time a second process has locked a row, because it is updating it. In addition, a third process is updating data and in the process is moving the row to another position within the index.

Firstly – the data before it is interfered with shows one thousand rows that SUM to 500500:


In a separate tab run the following command:


BEGIN TRANSACTION;

UPDATE dbo.Test2
SET Test_Value = 1000
WHERE id = 50;

Notice that there is no ‘COMMIT’ or ‘ROLLBACK’. This will lock row 50.

Now run the initial command, with the two SELECT statements. This will be blocked by the update, because Row 50 is locked. READ UNCOMMITTED would use this (as yet uncommitted) Test_Value and continue. READ COMMITTED will not, and is therefore blocked.

In a third window:


UPDATE dbo.Test2
SET ID = 1001
WHERE ID = 10;


This moved Row 10, which had already been read, and placed it at the end of the index.

Now ROLLBACK the update to row 50; this will allow the blocked process to continue.
The SELECT command has now completed, but the data for row 10 now also appears as row 1001 – the same data shown twice:



Of course, if the SELECT was also running a SUM or suchlike, the results would be incorrect in each of these examples.

So what is happening?

READ COMMITTED obtains shared locks on the rows or pages as it requires them, but releases these locks as soon as it has read the locked data.

Therefore, it is possible to update, delete or insert rows where the query has already processed that area of the table, and to do the same in the part of the table that the query has yet to process. What is guaranteed with READ COMMITTED is that you can’t have dirty reads, but that’s pretty much it.

The fact that I placed the two queries within a Transaction also made no difference. The second query always saw the ‘true’ picture of the data because the manipulations had completed by that point.

I’ve frequently seen statements where people have claimed that the default READ COMMITTED isolation level should be used to get a ‘true’ picture of the data. This is clearly a misunderstanding of how the process works, and one that I had to do a little work to comprehend too.

