Archive | December, 2013

XML and STUFF for a simple CSV column

27 Dec

I’m not a big fan of XML via T-SQL – far too long-winded a monster for my liking. However, there is no denying the usefulness of it and the fact that on occasions it simply doesn’t matter if you like an aspect of SQL Server, you may simply have to learn to use it. If I could pick and choose which bits I get to play with I suspect I wouldn’t be a DBA for long.

One nice little ‘trick’ with the XML commands that I’m grateful for is used to provide a comma-separated list of data. It’s an interesting solution that is often shown to people who need such a thing and is far faster than other solutions I’ve seen.
As an example, I would like to produce a csv list of the surnames from the following data:

Data01

To obtain the following results:

Data02

First of all – the test data:

CREATE TABLE #PersonList( 
		PersonID	INT IDENTITY, 
		Forename	VARCHAR (50), 
		Surname		VARCHAR(50) 
		); 
GO 
 
INSERT INTO #PersonList ( Forename, Surname ) 
VALUES  ( 'John', 'Smith'), 
		( 'Tom', 'Brown'), 
		( 'Tom', 'Sawyer'), 
		( 'Harry', 'Flashman'), 
		( 'Anna', 'Howe'), 
		( 'Julian', 'Sorel'), 
		('Sherlock', 'Holmes'); 
GO 
 
SELECT * 
FROM #PersonList; 

The command to produce the csv data is:

SELECT STUFF((SELECT ', ' + Surname  
                            FROM #PersonList 
		          ORDER BY Surname 
                            FOR XML PATH('')), 1, 2, ''); 

It comprises of two commands. The inner command is FOR XML in it’s simplest form, with a comma added to the selected data:

SELECT ', ' +  Surname 
           FROM  #PersonList 
           ORDER BY Surname 
           OR XML PATH(''); 

This is a very simple example of the XML command. Normally the PATH would have something within it and would therefore produce a very basic XML.
For example:

SELECT Surname  
         FROM  #PersonList 
         FOR XML PATH('test'); 

will produce XML output thus:

XML01

Not valid XML in this form (hence the red underline) but you get the idea.

Change the command to add a comma between each surname and remove the element name (the ‘test’) and you get a strange XML value like this:

XML02

The next requirement is to remove the leading comma and convert this back to a character field. This is where STUFF comes in. In this case the first parameter for STUFF is the XML query.
Then it looks at the first character position of the query result and replaces the two characters from that point with a zero length field (”).
As STUFF returns character data or binary data depending upon the data type it is processing, in this case it returns the result as a character field, with the leading comma removed.
Two problems solved with one command.

I’ve added the ORDER BY just to make the data look more organised and of course the ‘inner’ XML query can be more involved than this one but it should give an idea of how to use this for this particular purpose:

SELECT STUFF((SELECT ', ' + Surname 
                            FROM #PersonList 
		          ORDER BY Surname 
                            FOR XML PATH('')), 1, 2, ''); 

UnPivot – Basic Example

23 Dec

In a previous posting I described a simple example of Pivot – a much used feature which I personally have problems recalling the syntax of (hence the article in the first place). It’s lesser known relative in UnPivot, which is designed to do the opposite – it transforms columns of data into rows.

In the following example I have a few details about properties – the number of Bedrooms, Garages, Gardens and Attic Spaces. I’ve used this as an example because the column names will make it more apparent how the query and the results are connected.

I want to transform the data from showing the property features as column names to actually have those column names listed as data, along with the values that these columns contain:

Unpivot01

First of all – the test data:

CREATE TABLE #Properties 
    ( 
      PropertyID INT , 
      Attic_Spaces INT , 
      Bedrooms INT , 
      Garages INT , 
      Gardens INT 
    ) ; 
GO 

INSERT  INTO #Properties 
        ( PropertyID, Attic_Spaces, Bedrooms, Garages, Gardens ) 
VALUES  ( 101, 1, 2, 0, 1 ), 
        ( 102, 2, 1, 1, 2 ), 
        ( 103, 1, 2, 1, 2 ), 
        ( 104, 1, 3, 2, 2 ) ; 
GO 

SELECT  * 
FROM    #Properties ;   

The basic construct of an UnPivot is quite simple:

SELECT PropertyID, 
	   PropertyFeature, 
	   NumberOf 
	   FROM #Properties p 
	   UNPIVOT ( NumberOf FOR PropertyFeature IN ( Attic_Spaces, Bedrooms, Garages, Gardens ) ) AS unpvt ; 

This will select PropertyID from the #Properties table. Then it will select the columns ‘PropertyFeature’ and ‘NumberOf’ from the UNPIVOT part of the query. In this section I have asked it to create a column called ‘PropertyFeature’, which is taken from the columns called ‘Attic_Spaces’, ‘Bedrooms’, ‘Garages’ and ‘Gardens’. The value that is held within those ‘original’ columns (within the #Properties table) will be stored in the column called ‘NumberOf’ within the UNPIVOT section.

The alias for the #Properties table isn’t required, it’s just a habit of mine. However, the alias for the UNPIVOT (the ‘AS unpvt’) is mandatory even though you see no reference to it.

This may not be the most accurate description of what it is really doing, but it is how my brain gets around the syntax and that’s good enough for me, for now.

Snapshot Backup – Basic Example

12 Dec

A couple of questions recently in SQL Server Central have related to Snapshot Backups and reminded me of an article I wrote a few years ago about such things for the Developers in my office.
So, using that as a basis for this article I’m going to explain the basics of Snapshot Backups.

In the past I’ve had to run tests against some databases that would take upwards of 30 minutes to restore whenever I wanted to repeat those tests. This is a waste of my time and resources as it also brought my PC to a halt whilst running. Of course, you can take copies of individual tables if you’re only interfering with a small number and move the data back to repeat a test. However, there are occasions where there are numerous tables impacted and it just isn’t a realistic option.
This is where you can take a Snapshot. A Snapshot is effectively a read-only copy of the database, generated by SQL Server as the data is changed. You can run SELECT statements against a Snapshot Database as if it were the original – any data altered within the originating database is within the Snapshot in its original state and if the data you want has not been altered then SQL Server will look in the originating database; your query just needs to look at the Snapshot Database without caring if the data will actually be held in there or not.
To create a Database Snapshot, you need to know the actual filenames that the database is made from. The easiest way is to use SQL Server Management Studio, right-click on the database and look at the properties. We’re after the ‘file’ information:

Snapshot06

This particular database has two files for storing the data – ‘SnapshotDemo’ and ‘SecondaryData’.
To create a snapshot, you need to have a folder where the Snapshot data will be stored and then tell SQL Server to store the data from the two files detailed above into files within this folder (it will create the files but the folder needs to exist beforehand):

CREATE DATABASE SnapshotDemo_Snapshot01 ON
(NAME = SnapshotDemo, FILENAME = 'C:\Demo Snapshots\SnapshotDemo01.SS'),
(NAME = SecondaryData, FILENAME = 'C:\Demo Snapshots\SecondaryData01.SS')
AS SNAPSHOT OF SnapshotDemo; 

I’ve created the Snapshot files based on the names of the original files. The ‘SS’ extension is the usual extension used for snapshot data but neither of these matter – you can call the files whatever you like.
Once the command runs I now have a Snapshot Database:

Snapshot02

Before I change any data, both are essentially identical databases. The difference is that the Snapshot is read-only. Running a SELECT against both databases gives the same results:

Snapshot03

As an aside, the data was generated by Redgate’s SQL Data Generator – a wonderful tool if you ever get the opportunity to grab it.

Now I run a DELETE against the originating database – removing the rows that have ‘ACTIVE’ set to 0 and repeat the previous query:

Snapshot04

Now the data in ‘SnapshotDemo’ shows the data as it is after the DELETE but ‘SnapshotDemo_Snapshot01’ shows the data as it was before the change.

I can now make whatever changes I want to the data within the originating database and SQL Server will track the alterations, keeping the original data within the Snapshot.
When I need to restore the data I don’t use the ‘usual’ restore. Instead I restore the Snapshot:

RESTORE DATABASE SnapshotDemo
FROM DATABASE_SNAPSHOT = 'SnapshotDemo_Snapshot01';

For a large database this can take a fraction of the time a ‘normal’ restore requires. The restore time varies depending upon how much data you have been playing with but the important thing is that it is just the altered data – it doesn’t restore the entire database (unless you’ve changed all data within the entire database of course).
Now repeat the SELECT statement from earlier:

Snapshot05

So the database has been restored back to the state it was in when the snapshot was taken and I can play all over again.

In a future article I’ll look more closely at what a Snapshot actually does behind the scenes but this basic example and explanation should give an idea of the usefulness of this somewhat underused facility.