Check constraint with a scalar UDF not working [Programming SQL]

Prev: T-SQL cursor - Variables in an email body
Next: sysobject

From: --CELKO-- on 14 Apr 2010 16:36

So order status is NOT computed!

>> The primary business rule I am hoping to enforce is: inserts and updates to the Order_Details table should not occur if the Order has already shipped. <<

That one is easy; use an updatable VIEW and restrict access to the
base table with DCL. All updates and inserts will be forced thru this
view.

CREATE VIEW Working_Orders (order_nbr, something_title, ..)
AS
SELECT order_nbr, something_title, ..
FROM Orders
WHERE order_status <> 'shipped';

I am not sure how you will want to use the WITH CHECK OPTION on this
view. It would prevent an order status change via the Working_Orders
view and force you to go to the base Orders table.

>> The secondary business rule I am hoping to enforce is: items in the Order_Details table associated with an Order that has already shipped can be re-associated with an Order that is in progress. <<
Isn't that a contradiction? I have shipped order #1, so all of its
items left the warehouse. The customer has them, not me. How can I
put them on order #2? I can see if I have both orders #1 and #2 'in
progress' and I want to re-group the items, so I can complete one of
the orders.

From: 20060524 on 14 Apr 2010 17:26

I appreciate your response, Gert-Jan!

The UDF implements the two business rules, so the following DML sequence is
possible:

declare @identity as int

insert Orders
select 'Title1'
, 'InProgress'

set @identity = @@identity

insert Order_Details
select 'Title'
, @identity

update Orders
set Order_Status = 'Shipped'

insert Orders
select 'Title2'
, 'InProgress'

set @identity = @@identity

update Order_Details
set Orders_ID = @identity

Crazy? Maybe, but it does work in the absence of assertions :)

Humorously enough, I'm trying to solve this issue for a system which poorly
enforced these business rules using stored procedures, which is why I'm going
back to basic concepts such as constraints.

Thanks for your input!

From: 20060524 on 14 Apr 2010 18:31

I appreciate your response, Joe!

I may not have explained it very well, but I'm not concerned about inserts
and updates to the Orders table. Rather, I'm concerned about controlling
inserts and updates to the Order_Details table depending on the status of the
order it is associated with in the Orders table.

I would like to:

1. Allow inserts and updates to the Order_Details table as long as order has
not shipped.
2. Reject inserts and updates to the Order_Details table if the order has
shipped.
3. Allow a row in the Order_Details table to be associated with a different
order that has not shipped (then we're back to step 1).
4. Repeat ad nauseam.

As I wrote in response to Gert-Jan, the UDF allows for the above usage in
the absence of assertions.

Maybe I should not have used the Orders and Order_Details entities for my
analogy, because this would indeed be a contradiction if Order_Details were
physical items being shipped between a warehouse and customer.

I should probably start a new thread, but let me forego the analogy and
describe the business problem I am trying to solve.

I have a temporal database implementation consisting of:

1. A normalized set of working tables which enforce data integrity for DML
operations.
2. A log table for each working table that is populated by a trigger on the
working table for inserts, updates, and deletes.
3. A Data_Dates table which allows a user to specify a point in time.
4. A view for each log table that returns the data based on the point in
time specified by the user.

The process for creating the log tables, triggers, and views are automated,
so this implementation can be applied to any database I wish to add a
temporal dimension to.

What I'm trying to do is add a centralized approval mechanism, by creating a
Changes table to store proposed changes.

All the working tables would then be foreign-keyed to the Changes table so
that DML operations on the working tables always require the entry of an
associated row in the Changes table.

I would like to:

1. Allow inserts and updates to the working tables as long as the associated
row in the Changes table has a Status = 'Draft'.
2. Reject inserts and updates to the working tables if the associated row in
the Changes table has a Status = 'Submitted for Approval'.
3. Allow rows in the working tables to be associated with a different row in
the Changes table with a Status ='Draft if the row it is currently associated
with has a Status = 'Approved' (then we're back to step 1).
4. Repeat ad nauseam.

As of this moment, the UDF approach satisfies all of the above and is
something that can be automated along with the creation of the log tables,
triggers, and views, all the while being generally agnostic of the working
tables.

I would be more than happy to revisit this when SQL Server supports
assertions, or if anyone can provide a better solution given my specific
needs.

Thanks for your input!

From: --CELKO-- on 15 Apr 2010 00:48

This is getting elaborate!

A) We can still do this with updatable VIEWs on the order details

CREATE VIEW Working_Order_Details (order_nbr, item_nbr, ..)
AS
SELECT D.order_nbr, D.item_nbr, ..
FROM Order_Details AS D
WHERE NOT EXISTS
(SELECT *
FROM Orders AS O
WHERE D.order_nbr = O.order_nbr
AND O.order_status = 'shipped')
WITH CHECK OPTION;

1. Allow inserts and updates to the Order_Details table as long as
order has not shipped.

You never see shipped orders in this view, so this is done.

2. Reject inserts and updates to the Order_Details table if the order
has shipped.

You never see shipped orders in this view, so this is done.

3. Allow a row in the Order_Details table to be associated with a
different order that has not shipped.

CREATE PROCEDURE MoveItem
(@source_order_nbr INTEGER, @source_order_nbr INTEGER,
@dest_order_nbr INTEGER)
AS
UPDATE Working_Order_Details
SET order_nbr = @dest_order_nbr
WHERE order_nbr = @source_order_nbr
AND item_number = @source_order_nbr;

2) Revised version -- probably a new thread is a good idea!

Let me use a history table for price changes. The fact is that a
price had duration. This is the nature of time and other continuum.
So a basic history table looks like this in SQL/PSM

CREATE TABLE PriceHistory
(upc CHAR(13) NOT NULL -- industry standard
REFERENCES Inventory(upc),
price_prev_date DATE NOT NULL,
price_start_date DATE DEFAULT CURRENT_DATE NOT NULL,
price_end_date DATE, -- null means current price
CHECK(price_start_date < price_end_date),
CHECK (price_start_date = price_prev_date + INTERVAL 1 DAY), --
prevents gaps
PRIMARY KEY (upc, price_start_date),
item_price DECIMAL (12,4) NOT NULL
CHECK (item_price > 0.0000),
etc.);

You use a BETWEEN predicate to get the appropriate price. You can
enforce the "one null per item" with a trigger but techically this
should work:

CHECK (COUNT(*) OVER (PARTITION BY upc)
= COUNT(price_end_date) OVER (PARTITION BY upc) +1)

SELECT ..
FROM PriceHistory AS H, Orders AS O
WHERE O.sales_date BETWEEN H.price_start_date
AND COALESCE (price_end_date, CURRENT_DATE);

It is also a good idea to have a VIEW with the current data:

CREATE VIEW CurrentPrices (..)
AS
SELECT ..
FROM PriceHistory
WHERE price_end_date IS NULL;

Now your only problem is to write a stored procedure that will update
the table and insert a new row. You can do this with a single MERGE
statement, or with a short block of SQL/PSM code:

CREATE PROCEDURE UpdateItemPrice
(IN in_upc CHAR(13), IN new_item_price DECIMAL (12,4))
LANGUAGE SQL
BEGIN ATOMIC
UPDATE PriceHistory
SET price_end_date = CURRENT_DATE
WHERE upc = in_upc;
INSERT INTO PriceHistory (upc, price_prev_date, price_start_date,
price_end_date, item_price)
VALUES (in_upc, CURRENT_DATE, CURRENT_DATE + INTERVAL '1' DAY, NULL,
new_item_price);
END;

This will make the price change go into effect tomorrow.

There is a common kludge to repair a failure to design a history table
properly that you can put in a VIEW if you are not able to set things
right. Assume that every day we take a short inventory and put it in
a journal. The journal is a clip board paper form that has one line
per item per day, perhaps with gaps in the data. We want to get this
into the proper format, namely periods shown with a (start_date,
end_date) pair for durations where each item had the same quantity on
hand. This is due to Alejandro Mesa

CREATE TABLE InventoryJournal
(journal_date DATETIME NOT NULL,
item_id CHAR(2) NOT NULL,
PRIMARY KEY (journal_date, item_id),
onhand_qty INTEGER NOT NULL);

WITH ItemGroups
AS
(SELECT journal_date, item_id, onhand_qty,
ROW_NUMBER() OVER(ORDER BY item_id, journal_date, onhand_qty)
- ROW_NUMBER() OVER(PARTITION BY item_id, onhand_qty
ORDER BY journal_date) AS item_grp_nbr
FROM Journal),

QtyByDateRanges
AS
(SELECT MIN(journal_date) AS start_date,
MAX(journal_date) AS end_date,
item_id, onhand_qty
FROM ItemGroups
GROUP BY item_id, onhand_qty, item_grp_nbr)

SELECT start_date, end_date, item_id, onhand_qty
FROM QtyByDateRanges;

This might be easier to see with some data and intermediate steps

INSERT INTO InventoryJournal
VALUES('2007-01-01', 'AA', 100),('2007-01-01', 'BB', 200),
('2007-01-02', 'AA', 100),('2007-01-02', 'BB', 200),
('2007-01-03', 'AA', 100),('2007-01-03', 'BB', 300);

start_date end_date item_id onhand_qty
==========================================
'2007-01-01' '2007-01-03' 'AA' 100
'2007-01-01' '2007-01-02' 'BB' 200
'2007-01-03' '2007-01-03' 'BB' 300

Now, download the Rick Snodgrass book on Temporal Queries in SQL from
the University of Arizona website (it is free). Tom Johnston's book
on temporal data in SQL just came out and you might want a copy.

From: Tony Rogerson on 15 Apr 2010 01:05

> and so forth before we worry about other fixes. You do know that
> IDENTITY is not ever a key by definition, that there is no such

In this is instance it is being used as an artificial key because there
isn't a real key - the term relational key is wrong in the way you are using
it - a surrogate, artificial or natural key are all relational keys in the
eyes of the model.

It is fine when used as a surrogate key - which we all know you completely
don't get what the definition of a surrogate key is, so here let me
elaborate for you:

From my blog:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2010/03/13/database-design-surrogate-keys-part-1-of-many-rules-for-surrogate-keys-e-f-codd-and-c-j-date-research-and-problems-they-solve.aspx

Surrogate keys
There are two factions in the world of Database Design that enact wars
against the use of surrogate keys, there is a lot of misunderstanding around
what a surrogate key actually is and the need for them.

This article intends to draw a line under that battle and defend once and
for all the use of surrogate keys, to do that we need to look at what
problems they solve, the theory and rationale behind them and how to
implement them in your chosen product.

I'm going to cover a lot:

•Rules for a Surrogate key
•Problems they solve
•The "Theory" - we will look at the research done primarily by E. F. Codd
and C. J. Date but will also look at other authors and their take on the
subject.
•How to implement a surrogate key
◦o Database Design (includes a discussion on Normalisation and when you'd
introduce a surrogate)
◦o Techniques for generating the surrogate key value (IDENTITY, NEWID(),
NEWSEQUENTIALID, Hashing and do it yourself)
◦o Considerations around Referential Integrity
◦o Considerations around External Verification of the surrogate key value
and why even the notion is invalid
◦o Locking Considerations - effect on concurrency
•Example Tricks using the surrogate key for example Table Partitioning
Rules for a Surrogate Key
1.They must never be related (directly or indirectly) to the physical
hardware or storage - to use the Relational phrase - they are not tuple-id's
(row/record locators).
2.Once a value has been used - it must never be reused by something else -
basically once the value "5" has been used, then the surrogate key generator
must never generate the number "5" again even if the original row has been
deleted.
3.They are read only - once generated that value must never be changed.
4.The value must be atomic, that means no sets - just a constant for
instance "5".
5.The value should never be exposed outside the limits of the application as
a whole (the database is not a black box) - I explain more later but
basically the User of your application for instance the Call Centre staff
entering client details, the external machine calling your web service
should never see the surrogate key value (they should see one of the
candidate key values instead) - that is because the surrogate key cannot be
used for verification purposes but the candidate key can.
6.A surrogate key can be the sole key for a table in a situation where there
are no natural occurring candidate keys for example a Message Board. When
used in this scenario arguably it's no longer a surrogate key but an
artificial one but I would still follow rules 1 – 4. Having said that you
need to consider verification for instance if the rows were re-inserted and
the artificial key re-generated then the same rows (entities) should get the
same artificial key value; there are hashing solutions to provide that
facility but the obvious problem of duplicate rows come into play – that
discussion is outside the scope of surrogate keys.
The problem they solve
Surrogate keys form an abstraction from the candidate keys on a table, it
can be argued that creating a surrogate key on a table where a single column
candidate key exists makes no sense and that their real use is for making
composite keys simpler to use throughout the application - a fair point but
there is more to consider.

I'd say the real benefit of a surrogate key is in the abstraction it
provides thus removing concurrency problems caused should the candidate key
values you've used in foreign key clauses in related tables change - a
surrogate key value once generated can never change and its value can never
be reused even if the rows have been deleted (Codd and Date - see later).

I've summarised below some of the main points I see surrogates provide us:

•Joins between tables are done on a single expression unlike say a join
using a composite key for instance instead of writing: ON
Manufacturers.Maker = Registrations.Maker AND Manufacturers.Model =
Registrations.Model you would write ON Manufacturers.ManufacturerID =
Registrations.ManufacturerID.
•Reduce concurrency problems caused when values of existing candidate key
being used in foreign key references change.
•Removes any possibility of inconsistencies in data type, length and
collation creeping into the schema, for instance in the Manufacturers table
the column Maker is defined as varchar(250) with a collation of Latin CI, in
the Registrations table the column Maker is defined varchar(200) with a
collation of Spanish.
•Join performance, this is really debatable; with today's hardware and
improved code efficiency within the Optimiser and Storage Engine I don't
think this really matters (that much) anymore.
•The surrogate key can provide a method for partitioning of your data - I'll
show that later with an example of using partitioned views (as opposed to
the Enterprise Edition feature Partitioning). The effect of partitioning
your data helps reduce locking contention both within the storage engine
(allocating new extents) and for ACID in terms of consistency - less
blocking!
The Theory
E. F. Codd and C. J. Date are seen as the parents of the Relational Model,
their initial and continued research are the underpinnings of Database
Design and why Relational Databases exist.

Codd on Surrogate Keys
In the paper ACM Transactions on Database Systems, Vol. 4, No. 4, December
1979 pages 409 - 410 Codd introduces the concept of a surrogate key, the key
points of his text are as follows:

{snip}

"There are three difficulties in employing user-controlled keys as permanent
surrogates for entities.

(1) The actual values of user-controlled keys are determined by users and
must therefore be subject to change by them (e.g., if two companies merge,
the two employee databases might be combined with the result that some or
all of the serial numbers might be changed).

(2) Two relations may have user-controlled keys defined on distinct domains
(e.g., one uses social security, while the other uses employee serial
number) and yet the entities denoted are the same.

(3) It may be necessary to carry information about an entity either before
it has been assigned a user-controlled key value or after it has ceased to
have one (e.g., an applicant for a job and a retiree).

These difficulties have the important consequence that an equi-join on
common key values may not yield the same result as a join on common
entities. A solution-proposed in part in [4] and more fully in [14]-is to
introduce entity domains which contain system-assigned surrogates. Database
users may cause the system to generate or delete a surrogate, but they have
no control over its value, nor is its value ever displayed to them.

{snip}

Users will often need entity identifiers (such as part serial numbers) that
are totally under their control, although they are no longer compelled to
invent a user-controlled key if they do not wish to.

They will have to remember, however, that it is now the surrogate that is
the primary key and provides truly permanent identification of each entity.
The capability of making equi-joins on surrogates implies that users see the
headings of such columns but not the specific values in those columns."

What Codd means by "User of the database" is open to debate in respect of
whether the surrogate value should be displayed to them, C J Date makes the
point that if the value is not displayed it breaks Codd's own Information
Principal. I believe the User of the database is not the Developer or
Database Administrators - they are not users of the Database - they build
systems that have users a user being a person using the application or a
machine using a web service for instance.

His point about "The capability of making equi-joins on surrogates implies
that users see the headings of such columns" means that we do see the
columns - we'd have to in order to join our tables together! Ok, some
products do have this ability - they internally have their own surrogate
keys that are not exposed and are used in joins under the covers, but that
is not a requirement by Codd just a nice feature of those products - in SQL
Server land we unfortunately do not have that feature.

Date on Surrogate Keys
From C. J. Date's book "An Introduction to Database Systems, C J Date, 8th
Edition":

Page 434, reference 14.21 - he refers to P. Hall, J. Owlett and S J P Todd
"Relations and Entities" and says:

"Surrogate Keys are keys in the usual Relational sense but have the
following specific properties:

They always involve exactly one attribute.

Their values serve solely as surrogate (hence the name) for the entities
they stand for ......

When a new entity is inserted into the database, it is given a surrogate key
value that has never been used before and will never be used again, even if
the entity in question is subsequently deleted.

Ideally surrogate keys value would be system-generated.

Surrogates must not be concealed from the user because of the Information
Principal - tuple ID's should be however."

He finishes:

"In a nutshell: Surrogates are a model concept; tuple IDs are an
implementation concept"

C J Date makes the distinction between a tuple ID (basically a row ID) and a
surrogate key; the row ID changes as rows get removed from the table - that
should never be exposed from the database even for developers or Database
Administrators, the SQL language has the ROW_NUMBER() function to provide us
with a numbering system for a result.

"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:f41a7dd9-babe-478b-ae44-1c99e2c4944e(a)u21g2000yqc.googlegroups.com...
> We might want to make the schema relational, follow ISO-11179 rules
> and so forth before we worry about other fixes. You do know that
> IDENTITY is not ever a key by definition, that there is no such
> creature as a magical universal “id” in RDBMS, etc.? Did you really
> need to use NCHAR() for Chinese or other non-Latin codes? The data
> element orders_id should be singular. Etc.
>
> The standard design pattern for orders and their details is like this:
>
> CREATE TABLE Orders
> (order_nbr INTEGER NOT NULL PRIMARY KEY,
> something_title NCHAR(10) NOT NULL,
> order_status CHAR(10) DEFAULT 'in progress' NOT NULL
> CHECK (order_status IN ('in progress', 'shipped'))
> );
>
>
> CREATE TABLE Order_Details
> (order_nbr INTEGER NOT NULL
> REFERENCES Orders(order_nbr)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> item_nbr INTEGER NOT NULL
> REFERENCES Inventory(item_nbr)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> PRIMARY KEY (order_nbr, item_nbr)
> item_status CHAR(10) DEFAULT 'in progress' NOT NULL
> CHECK (item_status IN ('in progress', 'shipped'))
> ..);
>
> Notice the use of a **REAL** relational key instead of a fake pointer
> chain, the use of DRI, etc. SQL Server has supported Standard syntax
> for years, so you should have written:
>
> INSERT INTO Orders (..) VALUES (..);
>
> thus avoiding dialect. Your whole approach is procedural, so you did
> not think of a declarative design and immediately jumped to
> proprietary, procedural UDFs.
>
> Can you give a clear statement of what business rule you want to
> enforce? Maybe something like: An order as a whole is 'in progress'
> if any item on order is 'in progress' or something else?
>

First | Prev | Next | Last
Pages: 1 2 3 4
Prev: T-SQL cursor - Variables in an email body
Next: sysobject