Notes on SQL

Random articles from a puzzled DBA

A heap is a table that has no clustered index – it may have non-clustered indexes but without a clustered index it is a heap.
When a row is updated within the heap, if the new data size means a row cannot be stored in the original location, SQL Server moves the row to another location and leaves a pointer (called a forwarding record) to show where it has been moved to.


Test Data Construction

Listing 1: Basic table and test data

CREATE TABLE HeapTest(
	HeapTestID		INT IDENTITY(1,1),
	Col2			CHAR(2000),
	Col3			VARCHAR(1000)
);
GO


INSERT INTO HeapTest(Col2, Col3)
VALUES (REPLICATE('1',2000),''),
       (REPLICATE('2',2000),''),
	   (REPLICATE('3',2000),''),
	   (REPLICATE('4',2000),'');
GO

SET STATISTICS IO ON;

SELECT *
FROM HeapTest;

The listing above will create 4 rows of data that fill a page.
Heap_01

Examining the initial data

In the above example I have set STATISTICS IO ON, which shows that only one logical read was required to retrieve the result for the query:
Table ‘HeapTest’. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Looking at the table details via the Dynamic Management View (DMV) dm_db_index_physical_stats, shows that there are currently no forwarded records.

Listing 2: Show Forwarding records count

---Show the number of forwarding records
SELECT index_type_desc,
	   alloc_unit_type_desc,
	   record_count,
	   forwarded_record_count
FROM sys.dm_db_index_physical_stats
(DB_ID('TestDB'),OBJECT_ID('HeapTest'),NULL,NULL,'DETAILED');

Heap_02

To find the page number that this table’s data has been written, use DBCC IND and refer to the PageFID and PagePID columns where PageType is 1.

Listing 3: DBCC IND

DBCC IND (TestDB, HeapTest, -1); ---Use the FileID and PageID where PageType = 1

Heap_03

With the file and page number retrieved from the DBCC IND query it is then possible to look at the data as it is stored on the page. This requires setting trace flag 3604 and using DBCC PAGE, with the relevant file and page number.
The screen-shot below shows a section of the output from DBCC IND, with the row with all ‘2’ following the row with all ‘1’.

Listing 4: DBCC PAGE

DBCC TRACEON(3604);
GO
DBCC PAGE(TestDB,1,413,3);
GO

Heap_04

Updating the data

Adding data to the VARCHAR column of the row with HelpTestID of 2 increases the size of that row to the point where it cannot be written back to the same location.

Listing 5: Update one row

UPDATE dbo.HeapTest
SET Col3 = REPLICATE('A',1000)
WHERE HeapTestID = 2;
GO

SET STATISTICS IO ON;

SELECT *
FROM HeapTest;

Heap_05

Checking the updated table

Now that the data has had to be moved elsewhere, this has also had an impact on the number of logical reads:
Table ‘HeapTest’. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Checking the results of the DMV again, this shows an increased record count and a forwarded record.

Listing 6: DMV after update

---Show the number of forwarding records
SELECT index_type_desc,
	   alloc_unit_type_desc,
	   record_count,
	   forwarded_record_count
FROM sys.dm_db_index_physical_stats
(DB_ID('TestDB'),OBJECT_ID('HeapTest'),NULL,NULL,'DETAILED');

Heap_06

Executing the DBCC command again, shows a FORWARDING_STUB entry, as well as the location that the data has been ‘forwarded’ to. The forwarding entry explains the addition to the ‘record_count’ column of the DMV.

Listing 7: DBCC PAGE after update

DBCC TRACEON(3604);
GO
DBCC PAGE(TestDB,1,413,3);
GO

Heap_07

And using this data to look at File 1, Page 415, shows the forwarded data, with a record type of FORWARDED_RECORD.

Heap_08

The impact of forwarded data

This forwarded record means that SQL Server now has to perform additional reads to retrieve the rows from the table.
The row that has been forwarded will actually be read when the FORWARDING_STUB is encountered and ignored when the scan encounters the later FORWARDED_RECORD entry. This is to prevent the same data being read twice. Of course, it then needs to return to where it was in the scan when it encountered this forwarding record, so it can continue to read the rest of the rows. It is this ‘hopping’ that causes the additional reads.

De-fragmenting a heap

Interestingly, MSDN states that to remove the fragmentation “To rebuild a heap to reclaim wasted space, create a clustered index on the heap, and then drop that clustered index.”. However, Paul Randal has his thoughts on such a thing Fixing Heap Fragmentation

Conclusion

In this article I used a very small heap table to show the inner working of forwarded records. With such a small table fragmentation isn’t really an issue. Larger heap tables would normally have non-clustered indexes and there is usually a case for a clustered index anyway. If you have a large heap table where fragmentation is causing you a problem, you might not have the right table structure.

Further reading

Forwarding and forwarded record, the back-pointer size
MSDN – Heaps

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: