Multiple Inline Constraints

SQL New Blogger:

Time to investigate: 10 mins 
Time to test: 10 mins
Time to write: 10 mins

While creating a script for some new tables I came across a few columns that were designated to have both CHECK constraints and DEFAULT constraints.

Now this isn’t a problem of itself, it can be easily achieved by using a CREATE TABLE statement and then using 2 ALTER TABLE statements to create the constraints.

Old Style:

The problem that I had with this was that, so far, I was going along and creating these tables & columns with the constraints created in-line and it just galled me to have to break this flow and create these constraints as ALTER statements.

Checking the examples in the new Microsoft Docs didn’t show any examples that I could find of creating both constraints together on the same column so I experimented and found out that you can!

Here’s how…

New Style:

Notice 2 things here:

  1. There is no need to specify a FOR <column name> on the default constraint because SQL Server can tell the constraint is to work on the column it is currently defining.
  2. There is no comma separating the two constraints. This would break the inline property of these statements and SQL Server would think you’ve messed up syntax on a constraint (this got me for a sec).

Great, I can keep my constraints inline!

That’s a wrap

Documentation is useful but they do not cover every situation. Have a test environment; Hypothesize, test, and verify. You never know what you’d find.

Semi-Unique Constraints.

When full uniqueness is a bit too much.

What are Unique Constraints? How can we enforce them and finally, is there anyway that we can have modified uniqueness…can we have Semi Unique Constraints?

Unique constraints.

There are many use cases that require that a column in a table only have unique values. The main example of this is a table’s PRIMARY KEY which, needing to uniquely identify a record, needs to be unique by default.

In fact, SQL Server enforces this uniqueness by creating a unique index on whatever column(s) the Primary key is defined on.


-- Drop Table if Exists.
IF OBJECT_ID(N'dbo.SemiUniqueColumn', N'U') IS NOT NULL
DROP TABLE dbo.SemiUniqueColumn;

-- Create Table.
CREATE TABLE dbo.SemiUniqueColumn
(
ID int IDENTITY(1, 1),
UniqueColumn int,
SemiUniqueColumn int,
CONSTRAINT [PK_SemiUniqueColumn] PRIMARY KEY CLUSTERED (ID)
);

-- Check Primary Key Exists.
EXEC sp_helpindex N'dbo.SemiUniqueColumn';

SemiUniqueConstraints_PK
Index Description : unique.

Unique But Not Primary

Primary keys are not the only options that can be unique, SQL Server recognizes this, and so there is the option of marking other columns as unique as well. Whether this be actioned by a UNIQUE CONSTRAINT or a UNIQUE INDEX is user’s perogative.

I’ll be creating this using a UNIQUE INDEX but for the purpose of completeness, the syntax for UNIQUE CONSTRAINTs is


ALTER TABLE table_name
ADD CONSTRAINT constraint_name UNIQUE (column(s))

Now say we are forced to ensure that the column UniqueColumn is unique, so we create a UNIQUE INDEX on this.

-- Ensure UniqueColumn is Unique by Creating Unique Index on it.
CREATE UNIQUE NONCLUSTERED INDEX [UIX_SemiUniqueColumn_UniqueColumn] ON dbo.SemiUniqueColumn (UniqueColumn);
SemiUniqueConstraints_UIX
Index Description : unique.

We insert values into this table and, as long as the uniqueness of these rows are satisfied, we’re going to have a good time.


-- Insert Data.
WITH RecordSet (SeqNum) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM sys.all_columns)
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn) SELECT SeqNum FROM RecordSet ORDER BY SeqNum DESC;

-- Select Data.
SELECT ID, UniqueColumn, SemiUniqueColumn FROM dbo.SemiUniqueColumn;

SemiUniqueConstraints_TblSelect.png
All Unique = All Good.

When Is A NULL Not a NULL?

Short answer, when you try and compare it to another one.

Now you may think that I’ve lost the plot and gone off on a tangent but bear with me here a second and I’ll explain.

A NULL is UNKNOWN and an unknown value should not equal another unknown value. What happens if your first unknown turns out to be the number 1 and the second unknown is the number 2? 1 <> 2 so comparing them then would be pretty foolish.

If you ever look at some generated script and see at the very top of the script, hiding amongst the XACT_ABORTs and the NOCOUNTs is another option called ANSI_NULLs (although not anymore as it’s on by default and should be left that way IMHO).
This ensures that NULLs are treated as unknown and cannot be compared to another unknown except for some specific cases (GROUP BY, UNION ALL, INTERSECT, EXCEPT, etc)

What Does This Mean For Us?

Good question! Remember our Unique Index on UniqueColumn? What happens if we run the following?


-- Insert NULL into UniqueColumn.
INSERT INTO dbo.SemiUniqueColumn
( UniqueColumn )
VALUES
( NULL );

It inserts no problem, going all the way from 7442, 7441, … 2, 1, NULL. What about if we run it again?

SemiUniqueConstraints_UniqueColumn2ndNull
An Index is apparently exempt from NULL <> NULL.

Semi Unique Constraints.

Now ask yourself the question, what happens if we are told that our other column SemiUniqueColumn can have as many NULLs as it wants but if it gets a value, that value must be unique?


-- Generate Semi Unique Values.
-- Every Sixteenth Value is NULL.
UPDATE
original
SET
SemiUniqueColumn = IIF((original.UniqueColumn % 16) = 0, NULL, original.UniqueColumn)
FROM dbo.SemiUniqueColumn AS [created]
JOIN dbo.SemiUniqueColumn AS [original] ON created.ID = original.ID;

-- Select Data.
SELECT ID, UniqueColumn, SemiUniqueColumn FROM dbo.SemiUniqueColumn;

SemiUniqueConstraints_SemiUniqueColumn.png
Multiple NULLs But How to Force Uniqueness?

I used to think that this would be a complex requirement, possibly requiring a TRIGGER or two to check the inserted value against whatever is already there; but there is a way to have this functionality and have it the way that SQL Server normally would enforce a uniqueness on a column; by using a UNIQUE INDEX.

In case you’re thinking…

“Oh, a unique index doesn’t check what’s already there, is that it?”

I’m afraid that’s not the case.


-- Standard Unique Index Treats NULLs = NULLs.
CREATE UNIQUE NONCLUSTERED INDEX [UIX_SemiUniqueColumn_SemiUniqueColumn] ON dbo.SemiUniqueColumn (SemiUniqueColumn);

SemiUniqueConstraints_FailUniqueIndex.png
Won’t Create As Values Aren’t Unique

Yes, I know that we just showed that SQL Server treats NULLs as equal on Indexes but there is a way around this,  and that’s to not use a simple unique index.

We simply ignore the NULLs altogether by using a UNIQUE FILTERED INDEX.


-- Filtered Index.
CREATE UNIQUE NONCLUSTERED INDEX [UFIX_SemiUniqueColumn_SemiUniqueColumn] ON dbo.SemiUniqueColumn (SemiUniqueColumn)
WHERE SemiUniqueColumn IS NOT NULL;

Don’t trust me? Trust but verify!

First attempt to insert a duplicate value in SemiUniqueColumn:


-- Test it (Duplicate).
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn, SemiUniqueColumn) VALUES (0, 7439);

SemiUniqueConstraints_FUI_Dup
7439 Is Already There So It’s Rejected.

Now we attempt to insert a duplicate value in SemiUniqueColumn but attempt to insert a duplicate NULL.


-- Test it (Not Duplicate).
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn, SemiUniqueColumn) VALUES (0, NULL);

SemiUniqueConstraints_FUI_NotDup
NULL Is Already There But We Don’t Care.

Finally, NULL was chosen just for example purposes, filtering can be done on any value but, at the moment, there are a couple of got’cha’s with them…

Conclusion

I think you’ll agree that this would be so much easier to manage and troubleshoot than multiple triggers.

In fact, indexes are extremely useful aspects of databases and, once you start to realise how they work, you can start to use them for various different things than simple seeks and scans.

But the main aspect here is that complexity is not king. There are a multitude of ways to achieve something, so take a few moments to think about the problem you are trying to face and you’ll be thankful in the long run.

Primary Foreign Key

Correcting an incorrect assumption helped me learn about Query Optimizer shortcuts.

A couple of weeks ago I talked about Best Practices and how it was important to understand why that something was best practice.

Well another aspect to take from that post was the importance of knowing; if you do not know something, then it is important for you to learn it.

That being said something that I did not know, but recently learned, was that there is nothing stopping a Primary Key from also being a Foreign Key.

there is nothing stopping a Primary Key from also being a Foreign Key

Assumptions

When you think about it, this lack of knowledge came from incorrect assumptions. You read Primary KEY and Foreign KEY and you think to yourself, well they are both keys aren’t they? Same thing.

That is the trap that I fell down and the trap is not knowing and making invalid assumptions. So let’s hopefully help you with knowing what the differences between them are.

First let’s create our tables:


-- Create our Foo and Bar table.
IF OBJECT_ID(N'dbo.Bar', N'U') IS NOT NULL
DROP TABLE dbo.Bar;
GO
IF OBJECT_ID(N'dbo.Foo', N'U') IS NOT NULL
DROP TABLE dbo.Foo;
GO

CREATE TABLE dbo.Foo
(
 FooID int IDENTITY(1, 1) NOT NULL
 CONSTRAINT [PK_dbo_Foo] PRIMARY KEY CLUSTERED,
 FooValue char(8)
);
GO
CREATE TABLE dbo.Bar
(
 BarID int
 CONSTRAINT [PK_dbo_Bar] PRIMARY KEY CLUSTERED,
 BarValue char(8),
 CONSTRAINT [FK_dbo_Bar_dbo_Foo] FOREIGN KEY (BarID)
 REFERENCES dbo.Foo (FooID)
);
GO

-- Declare our holding table.
DECLARE @FooIDs TABLE (FooID int);
 -- Insert into Foo.
INSERT INTO dbo.Foo (FooValue)
 -- Gather the new ID's from foo.
OUTPUT inserted.FooID INTO @FooIDs (FooID)
SELECT LEFT(NEWID(), 8) FROM sys.all_objects;

-- Insert Foo's ID into the Bar table.
INSERT INTO dbo.Bar (BarID, BarValue)
SELECT FooID, RIGHT(NEWID(), 8) FROM @FooIDs;

-- Select our tables.
SELECT * FROM dbo.Foo;
SELECT * FROM dbo.Bar;
GO
FooAndBarFKPKTables
Foo And Bar selects

Primary Keys (PK)

a column or combination of columns that contain values that uniquely identify each row in the table

Primary key is a column or combination of columns that contain values that uniquely identify each row in the table.

That’s it; it just has to uniquely identify the row.

btw you are going to hear the word “unique” a lot with regard to Primary keys…

Now there are other types of keys that can do the same (Surrogate Keys, Composite Keys, Unique Keys, Alternate Keys, etc) but these are outside the scope of this post.

So if we attempt to insert another record into our Primary Key column/column combo that violates this unique, identifying property, we’re going to have a bad time.

PKViolation

We have to use IDENTITY_INSERT syntax because I’ve created the tables using IDENTITY and, if we were to insert a record into the identity column without turning IDENITY_INSERT on first, then another error pops up before the PK violation error that we want.

However, if we were to create our table without specifying the Primary Key constraint then the above insert would work and you would have duplicate entries populating your table, silently and deadly.

Foreign Keys (FK)

a column or combination of columns that is used to establish and enforce a link between the data in two tables to control the data that can be stored in the foreign key table

A Foreign key is a column or combination of columns that is used to establish and enforce a link between the data in two tables to control the data that can be stored in the foreign key table.

That’s it; it just has to establish and enforce a link between data.

If we try to violate this link, SQL Server will throw a different error and not let us.

FKViolation
If it is not in Foo, then it’s not getting in Bar

Yet if we were to create our table without specifying our Foreign key, then there would be no real link between our tables. So if our business depends on a record not being in Bar without being in Foo and we don’t have a constraint specified to that extent…

Unfortunately, I’m hard pressed to think of a way you can ensure this.

In fact, I don’t even like the above definition for Foreign keys as it states that two tables are necessary for a Foreign key constraint when only one is needed:

Example:


IF OBJECT_ID(N'dbo.HR', N'U') IS NOT NULL
DROP TABLE dbo.HR;
GO

CREATE TABLE dbo.HR
(
EmployeeID int
CONSTRAINT [PK_dbo_HR] PRIMARY KEY CLUSTERED,
FirstName varchar(20) NOT NULL,
SurName varchar(20) NOT NULL,
ManagerID int NULL
CONSTRAINT [FK_EmployeeID_Manager_ID] FOREIGN KEY
REFERENCES dbo.HR (EmployeeID)
);
GO

-- Check for foreign key
SELECT * FROM sys.foreign_keys WHERE [parent_object_id] = OBJECT_ID('dbo.HR');
GO

-- Check for primary key
SELECT * FROM sys.key_constraints WHERE [parent_object_id] = OBJECT_ID('dbo.HR');
GO

-- Check for everything.
EXEC sp_helpconstraint'dbo.HR';
GO

SameTableForeignKey
Foreign Key only involving 1 table.

 

Differences

If you check the two definitions for Primary key and Foreign key you’ll see that, even though they are both called keys, they serve two different purposes; namely identifying rows and enforcing links.

And those two purposes are not mutually exclusive!

A column/column combo that identifies a row can also be used to enforce a link back to another table (or itself, as shown above with Foreign keys).

The assumption, that if you were one then you couldn’t be the other, was incorrect. If your business rules call for it, don’t let a column being one type of key stop it from being the other.

Let’s not go to C to get to B

You may be thinking that this is a lot of hassle and that’s not an unfair thought.

Why not just not declare any key constraints and let the data fall as they may?

I will admit that is a fair bit of effort to constantly define and declare the different key constraints when creating tables, especially as Developers are focused on efficiency, but it is worth it!

Now, while the following appears to hold true for any foreign key constraint (I haven’t finished testing yet), I found these while testing the above so I’m going to include them here.

SQL Server loves primary key and foreign key constraints.

A primary key gets a unique index created on it to enforce that it is unique and, since it has an index placed upon it, it can be used to speed up query selection.

A foreign key is special though as it forces a constraint and the query optimiser can use these constraints to take certain shortcuts 🙂

Query Optimizer (QO) Examples

-- Join our tables
SELECT F.* FROM dbo.Foo AS [F] JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
SELECT B.* FROM dbo.Foo AS [F] JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;

PrimaryForeignKey_InnerJoin

Thanks to our constraint the QO knows that if something were to exist in Bar, it must be in Foo and, since we are not selecting or ordering anything from Foo, it straight up ignores it.

Less reads, less IO; in general all around better performance.

Does this work with other joins though?
Like above, with for something to exist in Bar it must exist in Foo, see if you can figure out why the QO figures it is safe to ignore some joins.

-- L.Join
SELECT F.* FROM dbo.Foo AS [F] LEFT JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
SELECT B.* FROM dbo.Foo AS [F] LEFT JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
PrimaryForeignKey_LeftJoin
Is the left join to Bar needed here?
-- R.Join.
SELECT F.* FROM dbo.Foo AS [F] RIGHT JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
SELECT B.* FROM dbo.Foo AS [F] RIGHT JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;

 

PrimaryForeignKey_RightJoin.PNG
This is basically the reverse of Left Join
-- F.Join
SELECT F.* FROM dbo.Foo AS [F] FULL OUTER JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
SELECT B.* FROM dbo.Foo AS [F] FULL OUTER JOIN dbo.Bar AS [B] ON F.FooID = B.BarID;
PrimaryForeignKey_FullJoin
Is this surprising?

The “SET operators” (UNION, UNION ALL, INTERSECT, EXCEPT) act a bit differently.

I’ll let you take a look at them yourself though.

Final Note

There is a lot that I have yet to learn about SQL Server, in fact that is the main reason that I created this blog; so I could read back on these posts sometime in the future and smile at my ignorance.

Hopefully the main aspect that I take from this post though is that it is okay not to know something as long as you have the desire and the initiative to learn.

Oh, and a Primary Key can be a Foreigh Key too. 🙂

Temporary Tables Naming Constraints

Introduction:

Kenneth Fisher (b | t)  recently wrote about Re-Evaluating Best Practices and, reading his post,  I couldn’t help but agree with him. Especially with regard to:

Times change, knowledge changes so best practices have to change. Don’t rest on your knowledge or the knowledge of others. Make sure you understand not only what the best practice is but why it’s the best practice. And then question it.

Now I’m not going to bring up the Microsoft PLE of 300 advice as that example has been taken out and waved in front of people’s faces so many times that I feel it’s lost it’s impact and, as far as I am aware, it’s the only case where the best practice is so widely ineffectual.

However, the statement…

Make sure you understand not only what the best practice is but why it’s the best practice.

… is, for me, the crucial statement in his post and the catalyst for the following post as I’ve fallen for a case where the best practices are not applicable; Naming Constraints.


Naming Constraints:

In this post, we are going to be looking at the best practice of giving logical, descriptive names to constraints in tables.

The following code is going to create a table called dbo.NamingConstraints with an Primary key column, a named constraint column and an unnamed constraint column.

Create dbo.NamingConstraints:

CREATE TABLE dbo.NamingConstraints
(
ID int IDENTITY(1, 1) CONSTRAINT [PK_NamingConstraint_ID] PRIMARY KEY,
NamedConstraint int CONSTRAINT [NamedConstraint_gt_0] CHECK (NamedConstraint > 0),
UnNamedConstraint varchar(50) CHECK (UnNamedConstraint <> 'Forbidden')
);
GO

We can check these constraints with the following two queries, the first for the  Primary key, and the second for the CHECK constraints, with the results in Figure 1.1.

Constraint Check:

-- Primary Key:
SELECT name, is_system_named, type_desc, unique_index_id
FROM sys.key_constraints
WHERE [parent_object_id] = OBJECT_ID('dbo.NamingConstraints');
-- Check Constraints:
SELECT name, is_system_named, type_desc, is_disabled, [definition]
FROM sys.check_constraints
WHERE [parent_object_id] = OBJECT_ID('dbo.NamingConstraints');
GO
NamingConstraintsTable_Constraints
Figure 1.1

As Figure 1.1 shows us when we don’t specify a name for a constraint, SQL Server will assign a name to that constraint for us.


Why Naming Constraints is Best Practice.

Constraints are best used to ensure referential and data integrity. Therefore they are commonly seen when data considered against business logic is attempted to be inserted into the database, and error messages are thrown.

When these error messages occur, they normally are passed into error logs from whatever application is integreated into our database. In these cases it is easier to deal with descriptive names than non descriptive ones.

Taking our two CHECK constraints as examples, if we were to introduce error messages…

Create constraint errors:

-- UnNamed Constraint Violated;
INSERT INTO dbo.NamingConstraints
(NamedConstraint, UnNamedConstraint)
VALUES (1, 'Forbidden');

-- Named Constraint Violated;
INSERT INTO dbo.NamingConstraints
(NamedConstraint, UnNamedConstraint)
VALUES (-1, 'Allowed');

Looking at the first error message reported (Figure 2.1), we know from the error message that something is wrong in the Table dbo.NamingConstraints and the column is UnNamedConstraint but that is it. If this table had multiple constraints, and we weren’t the one to create this table and the constraints, we would have to do some (potentially lengthy) investigation to figure out what is causing the error and then correct it.

NamingConstraintsTable_UnNamedViolation
Figure 2.1

Compare that with the error message for our named constraint (Figure 2.2).

NamingConstraintsTable_NamedViolation
Figure 2.2

As we have a proper, descriptive constraint name here, straight away we can say that the error occurred as we tried to insert a value that was not greater than 0.


When Naming Constraints is not applicable.

TL;DR

Do not name constraints on temporary tables.

Why?

Why as in what would a use case for this be? I use this a lot to step through code with different variables, especially with stored procedures.

Two windows, side by side, running them step by step and comparing the results in each.

I know, fun right?…

Or why as in why should you not name constraints on temporary tables?
Well that’s going to require a bit more detail.

SQL Server requires a unique name on it’s objects as they must comply with the rules of identifiers.

So if we were troubleshooting a procedure and attempted to pass results into a temporary table…

CREATE TABLE #Temp02
( 
Col1 int IDENTITY(1,1) NOT NULL,
Col2 varchar(256) CONSTRAINT [Col2_neq_Forbidden] CHECK (Col2 <> 'Fobidden')
);

INSERT INTO #Temp02 (Col2)
SELECT name FROM sys.all_objects;

SELECT * FROM #Temp02;
GO

 

… we should have no problem.

TempTableCreationNoErrors
Figure 3.1

But say we try to do that in two seperate windows…

SideBySideTempTableCreationErrors
Figure 3.2

… Big, angry error message telling us that it could not create the constraint and that an object alreadt exists in the database.

Now say that we were to not explicitly name the constraints on these tables?

CREATE TABLE #Temp02
( 
Col1 int IDENTITY(1,1) NOT NULL,
Col2 varchar(256)
);

INSERT INTO #Temp02 (Col2)
SELECT name FROM sys.all_objects;

SELECT * FROM #Temp02;
GO
NamingConstraints_NoConstraints
Figure 3.3

No problem! Since we have not explicitly named the constraint, SQL Server does not violate it’s rule for identifiers and so does not throw an error message!


Caveats

Yes, I know that this could be classed as an extreme edge case but that is not the caveat that I’m talking about here.

If you are in the habit of not skipping over reading the actual SQL code, you may be wondering to yourself.

Well, the temp tables are called the same name and they should follow the rules for identifiers as well, no? Why aren’t they throwing an error?

Well that’s because temp tables are treated a bit differently than other objects.

Consider the following example where we try to find our temp table in TempDB:

SELECT * FROM tempdb.sys.tables WHERE name = '#Temp02';
NamingConstraints_NoResults_EqTempTable
Figure 4.1

Nothing. It doesn’t exist. But we didn’t drop it and we haven’t closed the scope so it can’t have just disappeared!

If we change our select statement to the LIKE operator with an ending %…

 

SELECT * FROM tempdb.sys.tables WHERE name LIKE '#Temp02%';
NamingConstraints_Results_LikeTempTable
Figure 4.2

SQL Server, knowing that temp tables could get created multiple times concurrently (especially if created in Stored Procedures),  gets around the rule for identifiers with temp tables by adding a unique suffix onto each temp table that is created.

Therefore, it doesn’t violate the rule, it doesn’t error out and multiple concurrent sme-named temp tables can be created.

Why doesn’t this unique suffix happen with constraints aswell? Is this on purpose? By  Design?
Well the only answer I can give is, I don’t know.

But what I do know is that, in these cases with temp, don’t name your constraints.