Yet another SQL Server enthusiast

2014-03-16

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

Filed under: concurrency,EXISTS,safe,UPSERT — sa @ 1:25 PM

I presented here and here the reasons why one of the standard approach used for UPSERT

Code Snippet
IF EXISTS(SELECT * FROM dbo.MyTable WHERE ID = @param1)
UPDATE    dbo.MyTable
SET        Col1 = @param2
WHERE    ID = @param1
ELSE
INSERT    dbo.MyTable(ID, Col1)
VALUES    (@param1, @param2)

isn’t safe from concurrency point of view if we use default setting for transactions isolation level: read committed.

This means that

  1. We might get duplicate values (see #2) if we don’t use an unique index or
  2. We might get “Cannot insert duplicate key row…” errors if we use an unique index to prevent duplicate values. In this last blog on this subject and I want to show you how simple is to get above error.

The only difference between this test and previous test is the next index:

Code Snippet
  1. CREATE UNIQUE NONCLUSTERED INDEX [IUN_Customer_Email]
  2. ON [dbo].[Customer] ([Email] ASC)

The ostress.exe tool is executed with the same parameters:

sqlserver.ostress.upsert.test

The parameter -o”d:\SqlOstressOutput” represents the output folder. This folder is used to write the results of every T-SQL script and connection plus ostress.log file which collects overall results.

This time, this unique index prevents insertion of duplicate emails:

Code Snippet
  1. CustomerID  Email Note OrdersCount
  2. ———– —– —- ———–
  3. 1           a@a.a NULL 1
  4. 2           b@b.b NULL 1001
  5. 4           c@c.c NULL 995
  6. 9           d@d.d NULL 994

and, also, D:\SqlOstressOutput\ostress.log contains some errors generated by T-SQL scripts because at some moments they try to insert duplicate emails (see lines 34 – 46) but IUN_Customer_Email index prevents duplicate emails :

Code Snippet
  1. 03/13/14 15:10:37.875 [0x00002878] OSTRESS, A Generic ODBCBased Stress/Replay Utility.
  2. Version 9.04.0004 built for x64.
  3.   Copyright19972013 Microsoft. All Rights Reserved
  4. 03/13/14 15:10:37.876 [0x00002878]             Computer: SGBD
  5. 03/13/14 15:10:37.877 [0x00002878]          Base Module: C:\\cf2 Program Files\\cf2 Microsoft Corporation\\cf2 RMLUtils\\cf2 ostress.exe
  6. 03/13/14 15:10:37.877 [0x00002878]           Process Id: 6848
  7. 03/13/14 15:10:37.877 [0x00002878]  Active proc mask(0): 0x0000000F
  8. 03/13/14 15:10:37.877 [0x00002878]         Architecture: 9
  9. 03/13/14 15:10:37.878 [0x00002878]            Page size: 4096
  10. 03/13/14 15:10:37.878 [0x00002878]                 CPUs: 4
  11. 03/13/14 15:10:37.878 [0x00002878]     Processor groups: 1
  12. 03/13/14 15:10:37.879 [0x00002878]         Highest node: 0
  13. 03/13/14 15:10:37.879 [0x00002878] —————————————
  14. 03/13/14 15:10:37.879 [0x00002878]                Group: 0
  15. 03/13/14 15:10:37.880 [0x00002878] —————————————
  16. 03/13/14 15:10:37.880 [0x00002878]         Processor(s): 0x00000003 Function units: Shared
  17. 03/13/14 15:10:37.881 [0x00002878]         Package mask: 0x0000000F
  18. 03/13/14 15:10:37.882 [0x00002878]         Processor(s): 0x0000000C Function units: Shared
  19. 03/13/14 15:10:37.884 [0x00002878]         Processor(s): 0x0000000F assigned to Numa node: 0
  20. 03/13/14 15:10:37.887 [0x00002878] Current time bias: 120 minutes 2.00 hours DST Standard
  21. 03/13/14 15:10:37.888 [0x00002878] Max threads setting: 10000
  22. 03/13/14 15:10:37.889 [0x00002878] Arguments:
  23. 03/13/14 15:10:37.891 [0x00002878] S(local)\\cf2 SQL2012
  24. 03/13/14 15:10:37.892 [0x00002878] E
  25. 03/13/14 15:10:37.893 [0x00002878] dTestUPSERT
  26. 03/13/14 15:10:37.894 [0x00002878] od:\\cf2 SqlOstressOutput
  27. 03/13/14 15:10:37.896 [0x00002878] id:\\cf2 SqlOstressInput\\cf2 SQLQuery*.sql
  28. 03/13/14 15:10:37.897 [0x00002878] n10
  29. 03/13/14 15:10:37.898 [0x00002878] r100
  30. 03/13/14 15:10:37.900 [0x00002878] q
  31. 03/13/14 15:10:37.901 [0x00002878] Using language id (LCID): 1024 [English_United States.1252] for character formatting with NLS: 0x00060101 and Defined: 0x00060101
  32. 03/13/14 15:10:37.903 [0x00002878] Default driver: SQL Server Native Client 11.0
  33. 03/13/14 15:10:39.857 [0x0000327C] [spid 69] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  34. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  35.  
  36. 03/13/14 15:10:39.857 [0x00001988] [spid 71] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  37. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (c@c.c).
  38.  
  39. 03/13/14 15:10:39.858 [0x00000A14] [spid 54] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  40. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  41.  
  42. 03/13/14 15:10:39.858 [0x00003888] [spid 79] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  43. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  44.  
  45. 03/13/14 15:10:39.874 [0x0000038C] [spid 62] SQLState: 23000, Native Error: 2601, Severity: 14, State: 1, Line: 14
  46. [SQL Server]Cannot insert duplicate key row in object ‘dbo.Customer’ with unique index ‘IUN_Customer_Email’. The duplicate key value is (d@d.d).
  47.  
  48. more errors

Solutions: to prevent these concurrency problems here I’ve presented few solutions. Please bear in mind that you should rigorous test selected solution. Next blog will present pros and cons for every solution.

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Advertisements

2014-03-04

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Filed under: concurrency,EXISTS,safe,UPSERT — sa @ 8:45 AM

One of previous articles showed how easily we might get duplicate rows if we use following approach for UPSERT:

Code Snippet
IF EXISTS(SELECT  FROM MyTable WHERE ID = @param1)
    UPDATE    MyTable
    SET     Col1 = NewValue
    WHERE   ID = @param1
ELSE
    INSERT  MyTable(ID, Col1)
    VALUES  (@param1, @param2)    

This time I decided to do other tests but without WAITFOR. I’ve used following table (the same table) without any non-clustered indexes (like IUN_Customer_Email) and with just two rows:

Code Snippet
CREATE DATABASE TestUPSERT
GO

DBCC FREEPROCCACHE;
GO

USE TestUPSERT
GO

IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

INSERT dbo.Customer (Email) VALUES (N'a@a.a');
INSERT dbo.Customer (Email) VALUES (N'b@b.b');
GO

For my test I’ve used three scripts which try to insert or updated multiple times three emails: b@b.b (SQLQuery1.sql), c@c.c (SQLQuery2.sql) and d@d.d (SQLQuery3.sql). These files are available in D:\SqlOstressInput folder.

Content of SQLQuery{1|2|3}.sql script:

Code Snippet
— Parameters
DECLARE @Email NVARCHAR(100);
SET @Email = 'b@b.b'; — or 'c@c.c' or 'd@d.d'
— End of Parameters
BEGIN TRANSACTION;
    IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
    BEGIN
        UPDATE  dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE   Email = @Email;
    END
    ELSE
    BEGIN
        INSERT    dbo.Customer(Email)
        VALUES    (@Email) — OrdersCount = Default 1
    END    
COMMIT;

To run these scripts multiple times (100 times) using 10 concurrent connections I’ve used ostress.exe which is included into RML utilities for SQL Server. Before test I’ve used DBCC DROPCLEANBUFFERS and for ostress.exe I’ve used following arguments:

Code Snippet
[RML] C:\Program Files\Microsoft Corporation\RMLUtils>
ostress -S(local)\SQL2012 -E -dTestUPSERT -o"d:\SqlOstressOutput" -i"d:\SqlOstressInput\SQLQuery*.sql" -n10 -r100 -q

sqlserver.ostress.upsert.test

After few tries I succeeded to get duplicate rows (sample output, you might get different results) and for some emails (d@d.d and c@c.c) the OrderCount is wrong:

Code Snippet
CustomerID  Email Note OrdersCount
———– —– —- ———–
1           a@a.a NULL 1
2           b@b.b NULL 1001
3           d@d.d NULL 999
4           d@d.d NULL 999
5           c@c.c NULL 996
6           c@c.c NULL 996
7           c@c.c NULL 996
8           c@c.c NULL 996
9           c@c.c NULL 996

Conclusion: This approach for UPSERT (without any unique indexes and table hints or without serializable transaction isolation level) isn’t safe.

Note: for this test I used

Code Snippet
Microsoft SQL Server 2012 (SP1) – 11.0.3000.0 (X64)
    Oct 19 2012 13:38:57
    Copyright (c) Microsoft Corporation
    Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Transaction isolation level was READ COMMITTED (default).

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

2014-01-22

UPSERT – safe solutions for SQL Server [DRAFT]

Filed under: concurrency,EXISTS,MERGE,sql server,UPSERT — sa @ 9:39 PM

This article tackles how to implement safe UPSERTs from concurrency point of view. A previous blog has presented a common solution for UPSERT and it has demonstrated that this solution isn’t safe. Current blog specifies common solutions for UPSERT operations with a single row.

To discuss these solutions I will use the same table dbo.Customer:

Code Snippet
IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

If we want to insert a new customer then the identifier used for customers is the email address. If there is no email address stored within dbo.Customer then we need to insert a new customer with 1 order. Otherwise (the customer’s email already exists) we need to increment by 1 the OrdersCount column.

From this point of view we need also an unique index on Email column:

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
ON dbo.Customer(Email);

This index has two important roles:

  1. It prevents duplicated emails and
  2. It helps those queries which tries to find customers using the email address as identifier (WHERE Email = constant / @variable).

Solutions for a single row

Code Snippet
— Solution #1: IF EXISTS
— Parameters
DECLARE @Email NVARCHAR(100)
SET @Email = 'b@b.c';
— End of Parameters
BEGIN TRANSACTION;
    IF EXISTS(SELECT * FROM dbo.Customer WITH(XLOCK, HOLDLOCK) WHERE Email = @Email)
    BEGIN
        UPDATE  dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE   Email = @Email;
    END
    ELSE
    BEGIN
        INSERT    dbo.Customer(Email)
        VALUES    (@Email) — OrdersCount = Default 1
    END    
COMMIT; — Use ROLLBACK for testing
GO

— Solution #2: MERGE
— Parameters
DECLARE @Email NVARCHAR(100)
SET @Email = 'b@b.b';
— End of Parameters
BEGIN TRANSACTION;
    MERGE    dbo.Customer WITH(HOLDLOCK) trg
    USING    (VALUES(@Email)) src(Email) ON trg.Email = src.Email
    WHEN MATCHED
        THEN    
        UPDATE
        SET        OrdersCount = trg.OrdersCount + 1
    WHEN NOT MATCHED BY TARGET
        THEN    
        INSERT (Email)
        VALUES    (@Email);
COMMIT;
GO

— Solution #3: UPDATE + INSERT
— Parameters
DECLARE @Email NVARCHAR(100)
— End of Parameters
SET @Email = 'b@b.b';
BEGIN TRANSACTION;
    UPDATE    trg
    SET        OrdersCount = OrdersCount + 1
    FROM    dbo.Customer trg WITH(HOLDLOCK)
    WHERE    trg.Email = @Email;

    IF @@ROWCOUNT = 0
        INSERT    dbo.Customer(Email)
        VALUES    (@Email);        
COMMIT;
GO

Solutions for multiple rows

Code Snippet
— Solution #1: EXISTS
— Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'a@a.a');
INSERT    @CustomerEmail (Email) VALUES (N'c@c.c');
INSERT    @CustomerEmail (Email) VALUES (N'd@d.d');
— End of Parameters
DECLARE @CustomerEmailForUPDATE TABLE(
    Email NVARCHAR(100) NOT NULL PRIMARY KEY
);

BEGIN TRANSACTION;
    INSERT    @CustomerEmailForUPDATE (Email)
    SELECT    ce.Email
    FROM    @CustomerEmail ce
    WHERE    EXISTS(SELECT * FROM dbo.Customer c WITH(XLOCK, HOLDLOCK) WHERE c.Email = ce.Email);

    UPDATE  dbo.Customer
    SET        OrdersCount = OrdersCount + 1
    FROM    dbo.Customer c
    INNER JOIN @CustomerEmailForUPDATE cefu ON c.Email = cefu.Email
    
    INSERT  dbo.Customer(Email)
    SELECT    ce.Email — OrdersCount = Default 1
    FROM    @CustomerEmail ce
    WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailForUPDATE cefu WHERE cefu.Email = ce.Email)
COMMIT; — Use ROLLBACK for testing
GO

— Solution #2: MERGE
— Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'a@a.a');
INSERT    @CustomerEmail (Email) VALUES (N'c@c.c');
INSERT    @CustomerEmail (Email) VALUES (N'd@d.d');
— End of Parameters
BEGIN TRANSACTION;
    MERGE    dbo.Customer WITH(HOLDLOCK) trg
    USING    @CustomerEmail src ON trg.Email = src.Email
    WHEN MATCHED
        THEN    
        UPDATE
        SET        OrdersCount = trg.OrdersCount + 1
    WHEN NOT MATCHED BY TARGET
        THEN    
        INSERT (Email)
        VALUES    (src.Email);
COMMIT; — Use ROLLBACK for testing
GO

— Solution #3: UPDATE + INSERT
— Parameters
DECLARE @CustomerEmail TABLE( Email NVARCHAR(100) NOT NULL PRIMARY KEY );
INSERT    @CustomerEmail (Email) VALUES (N'a@a.a');
INSERT    @CustomerEmail (Email) VALUES (N'c@c.c');
INSERT    @CustomerEmail (Email) VALUES (N'd@d.d');
— End of Parameters

DECLARE @CustomerEmailUPDATED TABLE(
    Email NVARCHAR(100) NOT NULL PRIMARY KEY
);
BEGIN TRANSACTION;
    UPDATE    trg
    SET        OrdersCount = OrdersCount + 1
    OUTPUT    inserted.Email INTO @CustomerEmailUPDATED
    FROM    dbo.Customer trg WITH(HOLDLOCK)
    INNER JOIN @CustomerEmail src ON trg.Email = src.Email;

    INSERT    dbo.Customer(Email)
    SELECT    ce.Email
    FROM    @CustomerEmail ce
    WHERE    NOT EXISTS(SELECT * FROM @CustomerEmailUPDATED upd WHERE upd.Email = ce.Email);
COMMIT;
GO

The next blog

  • will present solutions for scenarios where we need to insert or update multiple rows (2014-02-18) and
  • will discuss pros and cons of these solutions.

Revision History:

2014-02-08: First update. I have added a new section for multiple rows. Solutions for a single row: some minor changes.

2014-01-05

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ?

Filed under: concurrency,EXISTS,index,MERGE,safe,sql server,UPSERT — sa @ 10:47 PM

Is it safe the following approach for UPSERT ?

Code Snippet
IF EXISTS (SELECT FROM MyTable WHERE ID = @param1)
    UPDATE    MyTable
    SET        Col1 = NewValue
    WHERE    ID = @param1
ELSE
    INSERT    MyTable(ID, Col1)
    VALUES    (@param1, @param2)

The short answer is no!

Example:

  1. The transaction isolation level is READ COMMITTED which is the default isolation level.
  2. For this demo I’ve used a customer table to store the customer’s email and the number of sales orders (for every customer).
  3. I’ve inserted two customers (1 – a@a.a; 2 – b@b.b).
  4. I’ve created a stored procedure for UPSERT (dbo.InsertOrder). This stored procedure receive the customer’s email, it checks if exists this email and if not then it inserts the new customer (sales orders 1). Otherwise it increments the number of sales orders by 1.

T-SQL script:

Code Snippet
IF OBJECT_ID(N'dbo.Customer') IS NOT NULL
    DROP TABLE dbo.Customer;
IF OBJECT_ID(N'dbo.InsertOrder') IS NOT NULL
    DROP PROCEDURE dbo.InsertOrder;
GO

CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    Email NVARCHAR(100) NOT NULL,
    Note NVARCHAR(50) NULL,    
    OrdersCount INT NOT NULL DEFAULT (1)
);
GO

INSERT    dbo.Customer (Email)
VALUES    (N'a@a.a'), — CustomerID 1
        (N'b@b.b'); — CustomerID 2
GO

CREATE PROCEDURE dbo.InsertOrder (
    @Email NVARCHAR(100)
)
AS
BEGIN
    IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email)
    BEGIN
        UPDATE    dbo.Customer
        SET        OrdersCount = OrdersCount + 1
        WHERE    Email = @Email;
    END
    ELSE
    BEGIN
        WAITFOR DELAY '00:00:03';
        INSERT    dbo.Customer (Email)
        VALUES    (@Email) — OrdersCount = Default 1
    END
END;
GO

Note: WAITFOR DELAY is used to delay the INSERTion of the new customers with just 3 seconds.

Now we can start this test executing following statements in separate windows:

Is.NOT.Safe.If.Exists.Update.Insert.UPSERT

Query window #1 (Step 1):

Code Snippet
BEGIN TRANSACTION
    UPDATE    dbo.Customer
    SET        Note = N'Note#2'
    WHERE    Email = N'b@b.b';
— ROLLBACK — or COMMIT

At this moment the CustomerID = 2 is locked X (exclusive) and no other transaction can read (excepting NOLOCK and READ UNCOMMITTED), update or delete this record. Because this transaction remains open (no COMMIT or ROLLBACK) this record will be locked till the end of transaction.

Query window #2 & #3 (Step 2 & Step 3):

Code Snippet
EXEC dbo.InsertOrder N'e@e.e';
SELECT * FROM dbo.Customer;

At this moment the IF EXISTS(SELECT * FROM dbo.Customer WHERE Email = @Email) statement will try to read the record with CustomerID = 1 and then the record with CustomerID = 2 requesting a S lock (shared). Because X and S locks aren’t compatible (see Lock Compatibility) both (Step 2 & 3) SELECT * FROM dbo.Customer WHERE Email = @Email statements will be blocked. Note: you can see the “Executing ….” message within tab title.

Query window #1 (Step 4):

I execute the ROLLBACK or COMMIT statement which ends the initial transaction. The X lock for CustomerID = 2 record is released and the SELECT statements (SELECT * FROM dbo.Customer WHERE Email = @Email; Step 2 & 3)  continue to be executed. Every EXISTS(SELECT …. WHERE Email = ‘e@e.e’) statement will return FALSE and both stored procedures will INSERT the customer `e@e.e`.

This is the reason why I get duplicated emails at the end (Step 5) of my example:

Results.Step5Solutions ?

Simplest solution is to create an unique index/constraint on Email column.

Code Snippet
CREATE UNIQUE INDEX IUN_Customer_Email
ON dbo.Customer(Email);

This UNIQUE index/constraint guarantees that we can’t have duplicated emails. But this doesn’t means this UPSERT approach is 100% safe because dbo.InsertOrder stored procedure still tries to INSERT duplicated emails (at Step 5):

Results.Step5.With.UNIQUE.index

Next blog On Jan 27 I will discuss what solutions do we have for safe UPSERTs.

See also:

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #2

Is it safe IF EXISTS – UPDATE ELSE INSERT [UPSERT] ? #3

Revision History:

2014-01-12: I changed the last paragraph.

2014-01-27: I changed the last paragraph.

2014-04-06: I added the [See also] section

Create a free website or blog at WordPress.com.