Archive | SQL RSS feed for this section

Dynamic Data Masking – Basic Examples

26 Nov

Introduction

As part of my research into the various methods of protecting sensitive data, I’ve taken a look at dynamic data masking. A short demonstration and some observations are in the following article.

Background

One aspect of protecting data is ensuring that it is only seen by those that should be able to see it – people only have access to the information that they need, according to pre-defined security levels. This has become a far more prominent issue with the advent of GDPR, although it should have always been a major focus of any organisation that stores or handles personally identifiable information (PII).
Microsoft introduced dynamic data masking with SQL Server 2016, which introduces a method of masking data for those accounts that do not have the required permissions.

Test Environment

Create a simple test database, and a database user that is a member of db_datareader.

Listing 1: Create Initial Test Database and User

CREATE DATABASE DDM_Test;
GO

USE DDM_Test;
GO

CREATE USER TestUser WITHOUT LOGIN;

ALTER ROLE [db_datareader] ADD MEMBER [TestUser]
GO

Next, two tables are to be created. These tables are deliberately rather basic, without indexes or foreign key constraints – just concentrating on the required pieces for this article.

Listing 2: Create Test Tables

CREATE TABLE Characters(
ID              INT IDENTITY(1,1) NOT NULL,
Forename        VARCHAR(50) MASKED WITH (FUNCTION = 'partial(1,"!!",0)') NOT NULL,
MiddleName      VARCHAR(50) MASKED WITH (FUNCTION = 'default()') NULL,
Surname         VARCHAR(50) MASKED WITH (FUNCTION = 'partial(2,"XYZ",1)') NULL,
Interests       INT  MASKED WITH (FUNCTION = 'random(20,30)') NULL,
Email           VARCHAR(50) MASKED WITH (FUNCTION = 'email()') NULL,
Phone           VARCHAR(30) MASKED WITH (FUNCTION = 'default()') NULL,
DOB             Date	MASKED WITH (FUNCTION = 'default()') NULL,
Salary          INT     MASKED WITH (FUNCTION = 'default()') NULL,
AdditionalNotes	VARCHAR(200) NULL,
Age AS (DATEDIFF(year, [DOB], Getdate()))
);
GO

CREATE TABLE Interests(
ID      INT MASKED WITH (FUNCTION = 'default()') IDENTITY(1,1) NOT NULL,
LIKES   VARCHAR(200) NOT NULL
);
GO

The two tables have a MASKED WITH option, against those columns that I want to ‘disguise’ from those user accounts that do have the required permission. At this moment, there are four functions that can be used to mask data:

Partial – replaces the column data with the characters supplied in the second parameter (“!!” for Forename and “XYZ” for Surname), displaying the actual data for the specified number of characters from the start of the column data and the end of the column data, as specified in the first and third parameters. So, the Forename column will display the first character of the true data, then the characters “!!” for the rest of the data – because the third parameter is zero.
Default – replaces the column data dependent upon the type of data column – “XXXX” for character type columns, the number zero for numeric columns, ASCII zero for binary columns and 01-10-1900 00:00:0000000 for date and time columns.
Email – Shows the first character of the email address and uses ‘.com’ as the suffix, with “XXX@XXXX” for the remainder of the address.
Random – Displays a random number from the supplied from/to range. So, the ‘Interests’ column will display a random number from 10 to 30.

Now we add the data to these tables:

Listing 3: Insert the test data

INSERT INTO [dbo].[Interests]([LIKES])
VALUES ('Carrots'),
       ('Money'),
       ('Hunting');
GO

INSERT INTO [dbo].[Characters]([Forename], [MiddleName], [Surname], [Interests], [Email], [Phone], [DOB], [Salary], [AdditionalNotes])
VALUES ('Bugs', NULL, 'Bunny', 1, 'Bugs@someemail.com', '555-001-002-003', '20 Apr 1938', 240000,NULL), 
       ('Daffy',NULL,'Duck',2,'Daffy@someotheremail.com','555-123-456 Ext 27', '17 Apr 1937', 200000, NULL), 
       ('Elmer','J','Fudd',3,'EJF@HumtingForever.com','555-1212-3434','17 Jul 1937', 160000, 'Likes everybody to be very quiet'),
       ('Wile', 'E', 'Coyote', 3, 'SuperGenius@acme.com','555-565656-345','17 Sep 1949', 185000, 'Astronomical health insurance');

GO

Querying the Data

At this point in time, because I am a member of the sysadmin role, the masking is not evident. If I run a query against this data I can see all of the data in all of the columns:

Listing 4: Query the test data as sysadmin

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

Figure 1: Result from Listing 4

Query result as sysadmin

If I now switch to the TestUser user, which does not the required permission to view masked data, the impact of dynamic data masking becomes apparent.

Listing 5: Query the test data as TestUser

EXECUTE AS USER = 'TestUser';

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

REVERT

Figure 2: Result from Listing 5

Query result as TestUser

Now we can see the impact of the mask functions.

It is important to note that the masking only shows itself at the presentation of the data. The join still works, even though it looks like the column ‘Id’ of the table ‘Interests’ is always zero. It isn’t – it is only shown as that to those that are not permitted to see the original data.

Note that the ‘Age’ column is masked, even though a mask was not specified. Because one of the elements of the computed column is using a mask (‘DOB’), the value displayed by the computed column will be masked too.

Removing a Mask from a Column
To remove a mask completely from a column, there is a DROP MASKED option with ALTER COLUMN.

Listing 6: Removing the mask from a column

EXECUTE AS USER = 'TestUser';

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

REVERT;

ALTER TABLE Characters   
ALTER COLUMN [Forename] DROP MASKED;
GO

EXECUTE AS USER = 'TestUser';

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

REVERT;

Figure 3: Result from Listing 6 – showing un-masked forename column

Before and After un-masking column

Granting Permission to View Masked Data

To give permission for a user to view masked data the UNMASK permission is required.

Listing 7: Assigning Permission to View Masked Data

EXECUTE AS USER = 'TestUser';

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

REVERT;

GRANT UNMASK TO TestUser;
GO

EXECUTE AS USER = 'TestUser';

SELECT *
FROM Characters c
LEFT OUTER JOIN Interests i on i.ID = c.Interests;

REVERT;

Figure 4: Result from Listing 7

Before/After UNMASK Permission

Using SELECT/INTO

Even if a user does not have UNMASK permission, if they are able to execute SELECT/INTO and create a new table, they are able to extract the data and view it but it will be stored in that table as the masked values – the original data will not be written to that table unless the user has UNMASK permission. When a table is created via SELECT/INTO, the MASK properties for the columns are not copied across to the new table.

Listing 8: Restricted User Copying Data to New Table

REVOKE UNMASK TO TestUser;
GO
 
GRANT CREATE TABLE TO TestUser;
GO
 
GRANT ALTER ON SCHEMA::dbo TO TestUser;
 
EXECUTE AS USER = 'TestUser';
 
SELECT *
INTO dbo.CharactersNoMaskCopy FROM
[dbo].[Characters];
 
SELECT *
FROM [dbo].[CharactersNoMaskCopy];
 
REVERT;

Figure 5: TestUser View of Copied Data

Values written to new SELECT/INTO target

Figure 6: New table without mask properties

SELECT/INTO target without mask

Execution Plan

There does appear to be an overhead, detailed within the execution plan. Although this small demonstration will not show a significant difference, it may be worth further investigation, if you are concerned about any performance impact from applying masked data columns.
In this project,several

Figure 6: Query Plan Expressions – no Data Masking

Query Plan Expressions Without Masking

Figure 7: Query Plan Expressions – Data Masking

Query Plan Expressions With Data Masking

Conclusion

Dynamic data masking creates a simple and easy method of protecting data from unauthorised viewing but it is rather basic. It does not mask the data within the database itself, only when the data is presented (i.e. from the result of a SELECT statement). Therefore, the data is still vulnerable – from copying the backups or suchlike.

References
Dynamic Data Masking
GDPR

Column Set – Basic Examples

19 Nov

A column set is an untyped XML column that can be used to update and select all sparse columns defined in the associated table. This XML is not physically stored in the table – it is in effect a calculated column that can be used to update.
Microsoft recommend its use for tables that contain a large number of sparse columns, although it is not without its overheads.
It has some interesting effects upon updates and selects, for the table that uses it.

Test Environment

A default database is created, with a basic test table within:

Listing 1: Create Initial Test Environment

CREATE Database Demo1;
GO

USE Demo1;
GO


CREATE TABLE Demo1(
	ID               INT IDENTITY(1,1)  NOT NULL,
	ProductName      VARCHAR(50)        SPARSE,
	[Description]    VARCHAR(100)       SPARSE,
	InStock          BIT                SPARSE,
	DateCreated      Date               SPARSE,
	CSet             XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
	);
GO

Apart from the ‘ID’ column, all other columns have the SPARSE property.
The column ‘CSet’ is the column set – an XML column that has ‘FOR ALL_SPARSE_COLUMNS’ specified. At the time of writing this article ‘ALL_SPARSE_COLUMNS’ is the only option available for a column set definition.

Inserting data

Data can be inserted into this table by inserting the individual column data, or by inserting an XML value via the ‘CSet’ column set:

Listing 2: Inserting Rows

INSERT INTO Demo1([ProductName], [Description], [InStock], [DateCreated])
VALUES('Test Product1','Test Product for no particular reason',0,'01 Feb 2018');
GO

INSERT INTO Demo1(Cset)
VALUES('<ProductName>Test Product2</ProductName><Description>Another Test Product</Description><InStock>0</InStock><DateCreated>2018-02-25</DateCreated>');
GO

Figure 1: Result from Listing 2

Column set data inserted

Notice in the example above, the column ‘CSet’ is always populated – even if the data was not inserted via that column. This XML value will contain the values of all non-null sparse columns.

What you cannot do, is a combination of both – you can’t insert directly into one sparse column and into the other sparse columns via the column set:

Listing 3: Incorrect Insert

INSERT INTO Demo1(ProductName, CSet)
VALUES('Test Product 3','<InStock>0</InStock><DateCreated>2018-06-10</DateCreated>');
GO

This results in the error message “The target column list of an INSERT, UPDATE, or MERGE statement cannot contain both a sparse column and the column set that contains the sparse column. Rewrite the statement to include either the sparse column or the column set, but not both.”

And if you don’t include a column, it will of course default to NULL:
Listing 4: Omitting a column (Description)

INSERT INTO Demo1(Cset)
VALUES('<ProductName>Test Product3</ProductName><InStock>0</InStock><DateCreated>2018-06-10</DateCreated>');
GO

Figure 2: Result from Listing 4

‘Description’ is NULL

SELECT *

The behavior of ‘SELECT *’ changes, where a table has a column set. It will not return all columns but just the non-sparse columns and the column set value. This may be important if you expect all columns to be returned.
This can also been seen when using SQL Server Management Studio (SSMS) to return the top rows, from the right-click option against the table.

Figure 3: SELECT *

SELECT * result

UPDATE via column set

A column set can be used to update a row but every sparse column that is required to have a non-null value must be specified. If a column has an existing value and it needs to keep that after the update, it must be included in the column set.

Listing 5: Updating via column set:

SELECT [ID], [ProductName], [Description], [InStock], [DateCreated], [CSet]
FROM [dbo].[Demo1];

  UPDATE dbo.Demo1
  SET CSet = '<Description>Updated Description for Test Product3</Description>'
  WHERE ProductName = 'Test Product3';

SELECT [ID], [ProductName], [Description], [InStock], [DateCreated], [CSet]
FROM [dbo].[Demo1];

Figure 4: Result from Listing 5

Column data removed that was not in column set

Adding a column set to an existing table

If you need to add a column set to an existing table, the table must be dropped and re-created with the column set. It is not possible to add a column set to an existing table.

References

Use Column Sets
Use Sparse Columns

How to Script a Certificate

29 May

In a previous article I described how to use a certificate to sign a stored procedure. Part of that article showed how to copy the certificate between databases by using a backup/restore method. This article shows another method – scripting. Continue reading

Using a Certificate to Sign a Stored Procedure

24 Apr

And how to assign Server-Level permissions to a database user.

Background

In a previous article I gave an example of assigning permissions to a user and making use of the EXECUTE AS instruction. This is a very useful way of assigning database permissions in a very granular way, where a user should not have the associated role for those permissions.
However, if the permission required relates to a server-level role then another method is required. Continue reading

Using EXECUTE AS to control data access

17 Apr

Introduction

Ownership chaining is one process that SQL Server uses to allow stored procedures access to tables where the user might not have permission. It has issues with stored procedures that have dynamic sql – EXECUTE AS is one solution. Continue reading

Automatic Data Purging With Partition Truncation

21 Feb

Some years ago a wrote an article about automatic data purging with partition switching. Having moved to SQL Server 2016 recently I have encountered the new ability to truncate individual partitions, rather than switching data out.
Continue reading

Shredding JSON Nested Arrays

5 Feb

Whilst reading up on SQL Server 2016 JSON functionality I have seen many examples of extracting data from a JSON array. However, I wanted to work out how to extract data from an array within an array – for example, an array of customer data where each customer has an array of order details.
Continue reading