Entity Framework with Identity Insert

Entity Framework is great. I've used it since its' first public beta. And with each release, its gets better and better.

Code First is awesome. Being able to generate a database from pure code was great. Then they added a hybird version of this - you could generate Code First from an Existing Database. I remember screaming because I could have used that very feature the week before it got released - would have saved me a weeks work. :)

I recently wrote an program to bulk insert test data into a blank version of a production database.

The program used EF and a whole bunch of logic to check and verify data integrity before inserting data into the database.

To start with I used AddRange(). The AddRange method does exactly what it says on the tin - it'll insert all the data passed to it into the database.

However - before inserting anything, EF has an eleborate set of checks to go through to verify that the inserted data against the model it has of the database. Chief among these checks is that of the primary keys. If a primary key is an integer and a Computed column, EF will not insert it, rather leaving it up to the database to assign a primary key.

But what if you want to force EF to insert your Primary Key? Generally speaking, we want this if we are importing multiple tables and wish to preserve the Foreign Key relationships between them. Which was exactly what I wanted.

So, how do we do this?

Step 1

In your Code First model (or your .edmx file for Database First) the primary keys should not have any attibutes, or they should be set to [DatabaseGenerated(DatabaseGeneratedOption.Identity)].

All the primary keys that you want to Force to be inserted should be changed to [DatabaseGenerated(DatabaseGeneratedOption.None)] This ensures that EF will not try and change the values and will simply insert them with the rest of the data.

If you go ahead and try and insert the data now you'll get an error says that Identity Insert needs to be enabled.

Step 2 A

So. We need to execute a SQL statement against the database. Basically the SQL loos like this: SET IDENTITY_INSERT [dbo].[TableName] ON

So we'll try this code:

context.Database.ExecuteSqlCommand(@"SET IDENTITY_INSERT [dbo].[TableName] ON");
context.TableName.AddRange(items);

context.SaveChanges();

context.Database.ExecuteSqlCommand(@"SET IDENTITY_INSERT [dbo].[TableName] OFF");

Now, if you try and run that, you'll get the exact same Identity Insert error.

The code looks legit! Why is this happening?

Basically - Identity Insert is scope limited. It is only enabled for the scope of a single transaction. Mainly to prevent you forgetting to turn it back off after you're finished and causing choas.

So the second call to Identity insert is superfluous. But I include it for readabilities sake.

Step2 B

So we need a way of executing the EF AddRange() and SaveChanges() methods in the same transaction scope as our Identity Insert ON statement.

Fortunately there is a way. Enter the TransactionScope class.

using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
      {
      using (var conn = new System.Data.SqlClient.SqlConnection(_connectionstring))
            {
                 conn.Open();
                 using (Context context = new Context(conn, false))
                        {
                            context.Database.ExecuteSqlCommand(@"SET IDENTITY_INSERT [dbo].[TableName] ON");
                            context.TableName.AddRange(items);

                            context.SaveChanges();

                            context.Database.ExecuteSqlCommand(@"SET IDENTITY_INSERT [dbo].[TableName] OFF");
                            scope.Complete();
                          }
                 }
       }

So what are we doing here?

We first create a TransactionScope object in a using block. The using block ensures that the TransactionScope Object will be destroyed once the flow of control leaves the using block.

In the body of the using block we create another using block and create a new SqlConnection Object and pass it our connection string. The SQLConnection will automatically be enrolled in the Transaction.

We create our third using block and we create a new version of our EF Database Context and pass it our open SqlConnection object for it to use internally. Note that we also pass false as a second parameter to the context constructor. This tells EF that it does not own the SqlConnection object and ensures that it plays nicely with the TransactionScope.

Except for the last line of the using block, the rest of the code is the same as our earlier attempt. The last line calls scope.Complete().

This closes the scope and commits the transaction. We then exit all three using blocks.

At this point all the objects we've created have been destroyed. We do this because we want to clean up objects we won't be using and free up resources. And also because using two EF Contexts can be dangerous. Although its not in the code snippit above, its likely that you will have one EF Context to do all the operations on the database that don't require this special treatment along with the second Context that we create to force the Primary keys to be inserted.

So by destoying the second one, we avoid any confusion between them. A quick look at Stackoverflow shows that this is more of an issue than you might want to believe.

The result of this code is that the data would have been sucessfully added to the table with its primary keys intact.

Success!!

(written mainly for my future self to rediscover this at an acute moment of crisis)