SQL Server Triggers and Multiple Rows

March 4, 2011 09:51 by wjchristenson2

A common mistake that I’ve seen is the assumption that a DML trigger is fired once for each record modified.  This is not true.  A good SQL developer will use set-based DML statements to alter multiple records at a time.  Regardless of how many records may have been changed, the table’s DML trigger will fire only once.  The trigger’s INSERTED & DELETED tables will contain all the records affected.  I don’t like triggers, but if you find yourself in a pickle and need one… make sure you code for the possibility that more than one record was changed.

 

BAD trigger:
CREATE TRIGGER trgTableA ON TableA FOR UPDATE
AS
	DECLARE @ID int, @name varchar(50);

	SELECT @ID = i.ID, @name = i.Name FROM inserted i;

	UPDATE TableB SET Name = @name WHERE ID = @ID;
GO

 

GOOD trigger:
CREATE TRIGGER trgTableA ON TableA FOR UPDATE
AS
	UPDATE b SET 
		b.Name = i.Name
	FROM TableB b
		INNER JOIN inserted i ON i.ID = b.ID;
GO
Bookmark and Share

ADO.NET and SQL Server 2008 User-Defined Table Types

February 12, 2011 04:22 by wjchristenson2

I recently came across a situation where I needed to batch insert, update, and delete 1 to 100,000+ records to SQL Server at any given time.  Traditionally I would generate an xml string and parse it to a SQL Server table variable and do batch set operations from there.  However I heard about SQL Server 2008’s new UDTTs (user-defined table types) which I could pass a DataTable to a stored procedure as TVPs (table-valued parameters).  It sounded very attractive as I wouldn’t have to generate & parse xml strings to table variables in SQL Server.  I also thought it would give me huge performance gains.  I quickly found that UDTTs were not my saving grace after all and were less than impressive in my opinion.

Before I show examples on how to use UDTTs/TVPs from ADO.NET, I’ll give you my takeaways as to why I chose not to use them:

1) They do not scale.  If you profile SQL Server you’ll notice that for each record in your DataTable passed via ADO.NET, SQL Server will essentially create a table variable perform an insert for each record.  Imagine 10,000 inserts into a table variable before you can begin batch set operations… not a good thing.

2) “sp_executesql” does not play well with TVPs from ADO.NET.  I was in a situation where I had to dynamically generate SQL based on the schema of the table.  Therefore I wanted to cache the execution plan with “sp_executesql” as the schema would not change much.  However if you try and use “sp_executesql” with a SqlCommand with the CommandType as a stored procedure and are passing in a UDTT as a SqlParameter, it will fail.  Error:  Could not find stored procedure 'sp_executesql'.  This error only occurs if you try and pass a TVP from ADO.NET.  If you want to pass a TVP to “sp_executesql” with ADO.NET, you have to get creative with having the SqlCommand command’s type be text and use “sp_executesql” that way.

3) SQL Server compiles the execution plan every time.  You can see more info here and here.  If you use SQL Server Management Studio and pass TVPs to a stored procedure, no SQL compilations are performed.  However if you pass a TVP from ADO.NET, compilations will occur every time.  I won’t go into great detail over it as you can read the 2 links above.  However if you have a stored procedure which will be called a lot, you may want to consider this fact.

4) When passing a DataTable as a TVP from ADO.NET, the DataTable must have the exact same number of columns in the exact same order and the same compatible data types.  I can understand the data types, however there are times where I have a DataTable that may be used for other programmatic purposes outside of TVP/SQL operations. If I want to pass it as a TVP to SQL Server, I essentially have to make a new DataTable and refill it perfectly so the column count is the same and in the same order.  Very finicky and requires coordination and communication if anyone changes the UDTT as it will break code easily.

There’s my 4 reasons why I was not impressed with UDTTs/TVPs.  One nice thing is they are very simple to pass via ADO.NET and easy to use within a stored procedure once you set things up correctly.  Below is a quick example how to wire things up if you wish to use TVPs with ADO.NET.

 

Create the UDTT

IF EXISTS (SELECT * FROM sys.types st JOIN sys.schemas ss ON st.schema_id = ss.schema_id WHERE st.name = N'udtt_PersonNames' AND ss.name = N'dbo')
DROP TYPE [dbo].[udtt_PersonNames];
GO

CREATE TYPE [dbo].[udtt_PersonNames] AS TABLE(
	ID [int] NOT NULL,
	[Name] [varchar](255) NOT NULL,
	UNIQUE CLUSTERED ([ID] ASC) WITH (IGNORE_DUP_KEY = OFF)
);
GO

 

Create the Stored Procedure accepting the TVP as a UDTT

IF EXISTS (SELECT * FROM sysobjects WHERE type = 'P' AND name = 'usp_InsertPersonNames')
	BEGIN
		DROP PROCEDURE dbo.usp_InsertPersonNames
	END
GO

CREATE PROCEDURE dbo.usp_InsertPersonNames (
	@PersonNamesDT AS dbo.udtt_PersonNames READONLY
)
AS
	SET NOCOUNT ON;
	
	INSERT INTO dbo.PersonNames([ID],[Name])
	SELECT [ID], [Name] FROM @PersonNamesDT;

	SET NOCOUNT OFF;
GO

 

Create an ADO.NET DataTable (same # of columns and order as the UDTT)

        Dim dt As DataTable = New DataTable()
        dt.Columns.Add(New DataColumn("ID", GetType(Integer)))
        dt.Columns.Add(New DataColumn("Name", GetType(String)))

        dt.Rows.Add(New Object() {1, "Joe"})
        dt.Rows.Add(New Object() {2, "Mark"})
        dt.Rows.Add(New Object() {3, "Ted"})

 

Pass the ADO.NET DataTable as a TVP to the Stored Procedure

        Using conn As SqlConnection = New SqlConnection("Data Source=localhost\sqlexpress;Initial Catalog=UDTT_test;Integrated Security=True;")
            conn.Open()

            Dim cmd As SqlCommand = New SqlCommand()
            With cmd
                .Connection = conn
                .CommandType = CommandType.StoredProcedure
                .CommandText = "dbo.usp_InsertPersonNames"

                Dim tvParam As SqlParameter = New SqlParameter("@PersonNamesDT", dt)
                tvParam.SqlDbType = SqlDbType.Structured
                tvParam.TypeName = "udtt_PersonNames"
                .Parameters.Add(tvParam)

                .ExecuteNonQuery()
            End With
        End Using

UserDefinedTableTypes_Soln.zip (82.82 kb)
 

Bookmark and Share

Primary Keys and Distributed Environments

September 22, 2010 14:24 by wjchristenson2

In architecting a new application or enhancing an existing one, important considerations need to be made when choosing how to implement primary keys/IDs.  From my own personal research and experience; unique, clustered, auto-incrementing, integer identity columns as the PK perform the best when it comes to DB performance.  They result in more compact clustered indexes, less IO when the table is queried, and they yield more efficient joins.  However, they have inherent problems when used in a distributed environment.

 

Auto-Increment (Identity Columns)
Auto-Increment columns on a SQL Server have seed and and increment properties.  Because these properties are fixed and can be selected from a finite number of possible values, the probability that you’ll get a PK collision is very high.  Therefore an auto-incrementing integer identity column should only be used in non-distributed/download-only situations.  No other server would be creating PK values and thus, you would not have PK conflicts.

There is one way where you could use this type of PK.  Each node which generates PKs could be assigned an identity range to avoid conflicts.  I can see some negatives to this solution: 
1) Have to manage nodes/ranges
2) Page splitting
3) Increased IO on new record inserts as the next identity value isn’t necessarily the the max
4) Loss of PK value possibilities as ranges may not be fully utilized

 

GUIDs - NEWID()
Using a GUID or UNIQUEIDENTIFIER in SQL Server is attractive because the generation algorithm provides a sufficiently high degree of probability that the same key would never be generated twice on any 2 nodes at any given time.

Although GUIDs meet the uniqueness required of a distributed environment, they have some glaring drawbacks that should be considered:
1) Huge in size (4x larger than an integer / 16 bytes). 
2) Large size negatively affects indexes.
3) Because they are random, they would be inserted into random locations within the clustered index resulting in fragmented clustered indexes which yields poor IO performance when the table is queried.
4) Generation of a new GUID in SQL Server does cost more resources than auto-incrementing an integer.  When bulk inserting, this has some glaring performance problems.
5) NEWID() uses the the server’s network card’s ID number plus a unique number from the CPU clock to generate the GUID.  The server would need to have a NIC to ensure GUID uniqueness.

 

GUIDs – NEWSEQUENTIALID()
The answer to the randomness problems with generating a new GUID is to generate sequential GUIDs.  SQL Server has the NEWSEQUENTIALID() function which will generate a GUID that will be greater in value than the previously generated one.  This solves the problems of fragmented clustered indexes.  It also solves the performance problems with generating a new GUID.  Because the new “sequential” GUID takes the previous GUID and increments it per se, the overhead is minimal and performs almost as well as an auto-incremented integer.

So what are the drawbacks to a sequential GUID?
1) Still Huge in size (4x larger than an integer / 16 bytes).
2) Large size negatively affects indexes.
3) NEWSEQUENTIALID() uses the the server’s network card’s MAC address to assist in the GUID generation process.  If privacy is a concern, then this probably isn’t a viable option for you.  Again, the server would need to have a NIC to ensure GUID uniqueness.

 

Natural Keys
In this strategy, you use business keys to uniquely identify records (e.g. social security number for a person).  The main drawbacks with this approach is:
1) Many columns may be needed to create uniqueness of the record.
2) Inconsistent PK columns/data types.
3) Not all records may have a glaring natural key to setup a PK for.

 

Node Identifiers
For this approach, the PK will combine a value that is unique to the node and a value that is unique to the topology.  For instance, you could have a PK that combines an auto-increment identity column and a unique id for the node/server column.  You could also create a custom system to generate values for each inserted row based on the rowID/nodeID.

 

Summary
In summary, there are many ways to architect a distributed database environment in regard to primary keys.  Personally sequential GUIDs and node identifiers are my top 2 picks.  If I had to choose between the two, I think I’d opt for node identifiers.  GUID size, privacy concerns, and ease of use appear to be a bit too much in my opinion.

Bookmark and Share

SQL Server 2008 FILESTREAM and ADO.NET

August 6, 2010 06:45 by wjchristenson2

In my last post, I gave a tutorial of SQL Server 2008’s new FILESTREAM feature.  I described what it is, why it should be used for BLOB data storage, and released some TSQL scripts to enable FILESTREAM and create a table to hold some media.  In this post, I will show how to insert and read FILESTREAM data from a SQL 2008 Server using ADO.NET.

ADO.NET comes with the SqlClient data provider (System.Data.SqlClient).  It supports FILESTREAM operations by using the SqlFileStream class which resides in the System.Data.SqlTypes namespace.  As its name suggests, the SqlFileStream is a stream and thus inherits from System.IO.Stream.  The SqlFileStream will provide methods to allow us to read and write streams of data to the SQL Server.  I will show 2 examples: inserting and reading FILESTREAM data with ADO.NET (the SqlFileStream object).  The program I created allows the user to save and view .jpg images from the SQL Server using FILESTREAM.  Here is a picture of the final example program.


Overview:
The SqlFileStream requires 2 arguments which are supplied from the SQL Server: the read/write path to the server’s file and the transactional context/token.  These 2 items are acquired via a preliminary SQL query before the the SqlFileStream is employed to read or write to the NTFS file system.

Inserting FILESTREAM Data
I first INSERT a record with a NULL value for the varbinary(max) field.  I use this blank record to acquire the server file path and also return the current FILESTREAM transactional context/token.  After this is complete, I can employ the SqlFileStream to write to the server file from my local file.  Below is my stored proc I used along with the .NET code.

 

IF EXISTS (SELECT * FROM sysobjects WHERE type = 'P' AND name = 'usp_InsertDocument')
	BEGIN
		DROP PROCEDURE dbo.usp_InsertDocument
	END
GO

CREATE PROCEDURE dbo.usp_InsertDocument 
	WITH ENCRYPTION
AS
	SET NOCOUNT ON;

	DECLARE @DocumentID INT;

	--insert NULL first to acquire ID & the transaction context values
	INSERT INTO dbo.Document (Document) VALUES (0x00);

	SELECT @DocumentID = SCOPE_IDENTITY();

	SELECT 
		doc.DocumentID,
		doc.DocumentGUID,
		doc.Document.PathName(0) AS [PathName], 
		GET_FILESTREAM_TRANSACTION_CONTEXT() AS [TransactionContext] 
	FROM dbo.Document AS doc WITH (NOLOCK) 
	WHERE doc.DocumentID = @DocumentID;

	SET NOCOUNT OFF;
GO

 

    Private Sub InsertDocument(ByVal clientFilePath As String)
        Dim dt As DataTable = New DataTable()

        Using conn As SqlConnection = New SqlConnection(_connectionString)
            Dim reader As SqlDataReader = Nothing
            Dim context() As Byte = Nothing
            Dim serverFilePath As String = ""

            conn.Open()

            Dim transaction As SqlTransaction = conn.BeginTransaction()

            Try
                Dim cmd As SqlCommand = conn.CreateCommand()
                With cmd
                    .Transaction = transaction
                    .CommandType = CommandType.StoredProcedure
                    .CommandText = "dbo.usp_InsertDocument"
                End With

                reader = cmd.ExecuteReader()
                dt.Load(reader)

                If dt.Rows.Count > 0 Then
                    serverFilePath = dt.Rows(0)("PathName").ToString()
                    context = DirectCast(dt.Rows(0)("TransactionContext"), Byte())

                    Using serverStream As SqlFileStream = New SqlFileStream(serverFilePath, context, FileAccess.Write)
                        Using clientStream As FileStream = New FileStream(clientFilePath, FileMode.Open, FileAccess.Read)
                            Dim bytesRead As Integer = 0, bufferSize As Integer = 4096
                            Dim buffer As Byte() = New Byte(bufferSize - 1) {}

                            Do
                                bytesRead = clientStream.Read(buffer, 0, bufferSize)
                                If bytesRead = 0 Then
                                    Exit Do
                                End If

                                serverStream.Write(buffer, 0, bytesRead)
                                serverStream.Flush()
                            Loop While True

                            clientStream.Close()
                        End Using

                        serverStream.Close()
                    End Using
                End If

                transaction.Commit()
            Catch ex As Exception
                Try
                    'attempt to rollback the transaction
                    transaction.Rollback()
                Catch ex2 As Exception
                    'rollback was unsuccessful (connection may be closed)
                End Try

                MessageBox.Show(ex.Message)
            Finally
                If reader IsNot Nothing AndAlso Not reader.IsClosed Then reader.Close()
            End Try
        End Using

        LoadDocuments()
    End Sub

 

Reading FILESTREAM Data
Similar to the insertion of a file, reading a file requires getting the server file path and then current FILESTREAM transactional context/token.  After those items are gathered from the SQL Server, we can connect to and stream the data anyway we like.  Below I streamed .jpg data from the SQL Server and loaded them into a WPF Image control for viewing purposes.

 

IF EXISTS (SELECT * FROM sysobjects WHERE type = 'P' AND name = 'usp_SelectDocument')
	BEGIN
		DROP PROCEDURE dbo.usp_SelectDocument
	END
GO

CREATE PROCEDURE dbo.usp_SelectDocument (
	@DocumentID INT
)
	WITH ENCRYPTION
AS
	SET NOCOUNT ON;

	SELECT 
		doc.DocumentID,
		doc.DocumentGUID,
		doc.Document.PathName(0) AS [PathName], 
		GET_FILESTREAM_TRANSACTION_CONTEXT() AS [TransactionContext] 
	FROM dbo.Document AS doc WITH (NOLOCK)
	WHERE doc.DocumentID = @DocumentID;

	SET NOCOUNT OFF;
GO

 

    Private Sub LoadDocument(ByVal documentID As Integer)
        Dim dt As DataTable = New DataTable()

        Using conn As SqlConnection = New SqlConnection(_connectionString)
            Dim reader As SqlDataReader = Nothing
            Dim context() As Byte = Nothing
            Dim serverFilePath As String = ""

            conn.Open()

            Dim transaction As SqlTransaction = conn.BeginTransaction()

            Try
                Dim cmd As SqlCommand = conn.CreateCommand()
                With cmd
                    .Transaction = transaction
                    .CommandType = CommandType.StoredProcedure
                    .CommandText = "dbo.usp_SelectDocument"
                    .Parameters.Add(New SqlParameter("@DocumentID", documentID))
                End With

                reader = cmd.ExecuteReader()
                dt.Load(reader)

                If dt.Rows.Count > 0 Then
                    serverFilePath = dt.Rows(0)("PathName").ToString()
                    context = DirectCast(dt.Rows(0)("TransactionContext"), Byte())

                    Using memStream As MemoryStream = New MemoryStream()
                        Using serverStream As SqlFileStream = New SqlFileStream(serverFilePath, context, FileAccess.Read)

                            Dim bytesRead As Integer = 0, bufferSize As Integer = 4096
                            Dim buffer As Byte() = New Byte(bufferSize - 1) {}

                            Do
                                bytesRead = serverStream.Read(buffer, 0, bufferSize)
                                If bytesRead = 0 Then
                                    Exit Do
                                End If

                                memStream.Write(buffer, 0, bytesRead)
                                memStream.Flush()
                            Loop While True

                            serverStream.Close()
                        End Using

                        memStream.Seek(0, SeekOrigin.Begin)

                        Dim bmpImage As BitmapImage = New BitmapImage()
                        With bmpImage
                            .BeginInit()
                            .StreamSource = memStream
                            .CreateOptions = BitmapCreateOptions.None
                            .CacheOption = BitmapCacheOption.OnLoad
                            .EndInit()
                            .Freeze()
                        End With

                        Me.imgView.Source = bmpImage

                        memStream.Close()
                    End Using
                End If

                transaction.Commit()
            Catch ex As Exception
            Try
                'attempt to rollback the transaction
                transaction.Rollback()
            Catch ex2 As Exception
                'rollback was unsuccessful (connection may be closed)
            End Try

            MessageBox.Show(ex.Message)
        Finally
            If reader IsNot Nothing AndAlso Not reader.IsClosed Then reader.Close()
        End Try
        End Using
    End Sub

 

So there you have it.  The basics of writing and reading to a FILESTREAM on a SQL 2008 Server using ADO.NET.  Below you’ll find my scripts and .NET source code.

SqlFileStreamExample_Soln.zip (559.27 kb)
 

Bookmark and Share

SQL 2008 FILESTREAM

July 24, 2010 05:25 by wjchristenson2

The FILESTREAM feature of SQL 2008 allows more effective storage and retrieval to BLOB data.  It accomplishes this by utilizing both SQL 2008 and the NTFS file system.  In the past, most developers stored BLOB data in the SQL Server using varbinary(max).  This has some drawbacks including:  you had a 2GB max, streaming performance was awful, and recovery from fragmentation was poor.  The FILESTREAM can hold up to the size of the NTFS volume size, streaming performance is excellent, and it can recover from fragmentation on disk nicely.

The one drawback that I can see is when the client needs to perform small & frequent BLOB updates.  Rather than writing to SQL Server data files, FILESTREAM writes to the file system.  Locating, creating, updating, deleting files outside of the SQL Server data files do come at a cost.  So some consideration as to the size & frequency of BLOB data would need to be made.

There are a few ways to enable FILESTREAM (SQL Server Configuration Manager, SQL Server Management Studio properties of the connected instance, and via TSQL scripting).  Personally I was only able to get it to work by using the SQL Server Configuration Manager.  Apparently installing Visual Studio/SQL 2005 can screw up FILESTREAM features of SQL 2008…  Anyway, here’s the basic FILESTREAM access levels which you’ll need to understand when enabling it.

FILESTREAM access levels:
a) 0 = Disables FILESTREAM support for this instance.
b) 1 = Enables FILESTREAM for Transact-SQL access.
c) 2 = Enables FILESTREAM for Transact-SQL and Win32 streaming access.

I created a script which enables FILESTREAM for the SQL Instance, then it enables it for my “testdb” database.  It auto-detects where your database’s data/log files reside and then creates a directory in there for FILESTREAM objects.

 

USE [master];
GO
DECLARE @filestream_access_level SMALLINT;
SELECT @filestream_access_level = CONVERT(SMALLINT, SERVERPROPERTY('FilestreamEffectiveLevel'));

IF @filestream_access_level <> 2
BEGIN
	EXEC sp_configure filestream_access_level, 2;
	RECONFIGURE WITH OVERRIDE;
END;
GO

USE [testdb];
GO
IF NOT EXISTS (SELECT * FROM sys.filegroups AS fg WITH (NOLOCK) WHERE name = 'testdb_FILESTREAM')
BEGIN
	--add a file group and indicate that it will contain FILESTREAM objects
	ALTER DATABASE testdb ADD FILEGROUP testdb_FILESTREAM CONTAINS FILESTREAM;
END;
GO
IF NOT EXISTS (SELECT *
	FROM sys.filegroups AS fg WITH (NOLOCK)
		INNER JOIN sys.database_files AS df WITH (NOLOCK) ON fg.data_space_id = df.data_space_id
	WHERE fg.name = 'testdb_FILESTREAM'
		AND df.name = 'testdb_FILESTREAM')
BEGIN
	DECLARE @path NVARCHAR(MAX), @sql_statement NVARCHAR(MAX);
	SELECT @path = mf.physical_name FROM sys.master_files AS mf WITH (NOLOCK) WHERE mf.name = 'testdb' AND mf.type_desc = 'ROWS';
	SELECT @path = SUBSTRING(@path, 1, LEN(@path) - CHARINDEX('\', REVERSE(@path))) + '\testdb_FILESTREAM';

	--tell the file group where the FILESTREAM objects should be stored  
	SELECT @sql_statement = N'ALTER DATABASE testdb ADD FILE (NAME = ''testdb_FILESTREAM'', FILENAME = ''' + @path + ''') TO FILEGROUP testdb_FILESTREAM;';

	EXEC sp_executesql @statement = @sql_statement;
END;
GO

 

At this point I was ready to design a table in SQL Management Studio with a FILESTREAM column.  A FILESTREAM column is actually a varbinary(max) column with the FILESTREAM attribute.  Well, the table designer in SQL Management Studio does not support specifying the FILESTREAM attribute.  You have to script it.

Here’s some rules which you’ll need to adhere to when creating FILESTREAM columns.
1) The table can have multiple FILESTREAM columns, but they must all be in the same FILEGROUP.
2) If you don’t specify the FILESTREAM_ON clause, the default FILEGROUP will be used.
3) The table must have a uniqueidentifier column with the ROWGUIDCOL attribute.  It cannot contain NULL values and must have either a UNIQUE or PRIMARY KEY single column constraint.

 

USE [testdb];
GO 
IF NOT EXISTS (SELECT * FROM sys.tables AS t WITH (NOLOCK) WHERE t.name = 'Document' AND t.type = 'U')
BEGIN
	CREATE TABLE dbo.Document (
		DocumentID INT IDENTITY PRIMARY KEY,
		DocumentGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL UNIQUE DEFAULT NEWID(),
		Document VARBINARY(MAX) FILESTREAM NULL
	) ON [PRIMARY];
END;
GO

 

In my next post, I will show .NET streaming capabilities with the new FILESTREAM features in SQL 2008.
 

Bookmark and Share