Yet another SQL Server enthusiast

2014-01-13

Conditional joins

Let’s start with following example (based on Supertypes and Subtypes modeling approach) which creates three tables:

  1. dbo.BankAccountOwner (BankAccountOwnerID-PK, OwnerType, CreateDate)
  2. dbo.Person (BankAccountOwnerID – PK, FK, FirstName, LastName)
  3. dbo.Company (BankAccountOwnerID – PK, FK, CompanyName)

T-SQL Script:

Code Snippet
CREATE TABLE dbo.BankAccountOwner (
    BankAccountOwnerID INT PRIMARY KEY,
    OwnerType CHAR(1) NOT NULL CHECK( OwnerType IN ('P', 'C') ), — P=Person, C=Company
    CreateDate DATETIME NOT NULL DEFAULT (GETDATE())
);
CREATE TABLE dbo.Person (
    BankAccountOwnerID INT PRIMARY KEY
        REFERENCES dbo.BankAccountOwner(BankAccountOwnerID),
    FirstName NVARCHAR(50) NOT NULL,
    LastName NVARCHAR(50) NOT NULL
);
CREATE TABLE dbo.Company (
    BankAccountOwnerID INT PRIMARY KEY
        REFERENCES dbo.BankAccountOwner(BankAccountOwnerID),
    CompanyName NVARCHAR(100) NOT NULL
);

INSERT    dbo.BankAccountOwner (BankAccountOwnerID, OwnerType)
VALUES    (1, 'P'), (2, 'P'), (3, 'C'), (4, 'C'), (5, 'C');
INSERT    dbo.Person (BankAccountOwnerID, FirstName, LastName)
VALUES    (1, N'John', N'Doe'), (2, N'Mary', N'Doe');
INSERT    dbo.Company (BankAccountOwnerID, CompanyName)
VALUES (3, N'MyComputer'), (4, N'Control Panel'), (5, N'Device Manager');
GO

Problem: how can we get FirstName, LastName and CompanyName values for the following owners: 1, 2, 3, 4, 5 ?

First solution:

Code Snippet
SET NOCOUNT ON;
SET STATISTICS IO ON;
PRINT 'Test #1';
SELECT    bao.*, p.FirstName, p.LastName, c.CompanyName
FROM    dbo.BankAccountOwner bao
LEFT JOIN dbo.Person p ON bao.BankAccountOwnerID = p.BankAccountOwnerID
LEFT JOIN dbo.Company c ON bao.BankAccountOwnerID = c.BankAccountOwnerID
WHERE    bao.BankAccountOwnerID IN (1,2,3,4,5);

My solution:

Because

  1. dbo.Person table contains only rows with OwnerType = ‘P’ and dbo.Company table contains only rows with OwnerType = ‘C’ and
  2. SQL Server doesn’t knows this

I added these predicates to every left join thus:

Code Snippet
PRINT 'Test #2';
SELECT    bao.*, p.FirstName, p.LastName, c.CompanyName
FROM    dbo.BankAccountOwner bao
LEFT JOIN dbo.Person p ON bao.OwnerType = 'P' AND bao.BankAccountOwnerID = p.BankAccountOwnerID
LEFT JOIN dbo.Company c ON bao.OwnerType = 'C' AND bao.BankAccountOwnerID = c.BankAccountOwnerID
WHERE    bao.BankAccountOwnerID IN (1,2,3,4,5);

These are the execution plans:

Conditional.Joins.Execution.Plans

First solution: as you can see from the properties of Index Seek operators SQL Server will try to find every bank account owner (1, 2, …, 5) within dbo.Person table and within dbo.Company table (Number of Execution  = 5).

Second solution: this time because SQL Server knows that dbo.Person table contains only OwnerType = ‘P’ rows it will read from dbo.Person table using an Index Seek (on Person.PK_Person_…) just two times (Number of Executions = 2) because there are only two persons among those five owners. This was possible because of Filter operator which will verify before executing Index Seek if the current owner is ‘P’:

Filter.Operator

Also, SQL Server will read from dbo.Company table using an Index Seek (on Company.PK_Company_…) three times (Number of Executions = 3) because there are three companies.

Because we have reduced the number of execution for Index Seek operators, this is means that we have reduced also the number of logical reads as we can see from the output of STATISTICS IO ON:

Code Snippet
Test #1
Table 'Company'. Scan count 0, logical reads 10
Table 'Person'. Scan count 0, logical reads 10
Table 'BankAccountOwner'. Scan count 5, logical reads 10
Test #2
Table 'Company'. Scan count 0, logical reads 6
Table 'Person'. Scan count 0, logical reads 4
Table 'BankAccountOwner'. Scan count 5, logical reads 10

Blog at WordPress.com.