From: --CELKO-- on 14 Apr 2010 16:36 So order status is NOT computed! >> The primary business rule I am hoping to enforce is: inserts and updates to the Order_Details table should not occur if the Order has already shipped. << That one is easy; use an updatable VIEW and restrict access to the base table with DCL. All updates and inserts will be forced thru this view. CREATE VIEW Working_Orders (order_nbr, something_title, ..) AS SELECT order_nbr, something_title, .. FROM Orders WHERE order_status <> 'shipped'; I am not sure how you will want to use the WITH CHECK OPTION on this view. It would prevent an order status change via the Working_Orders view and force you to go to the base Orders table. >> The secondary business rule I am hoping to enforce is: items in the Order_Details table associated with an Order that has already shipped can be re-associated with an Order that is in progress. << Isn't that a contradiction? I have shipped order #1, so all of its items left the warehouse. The customer has them, not me. How can I put them on order #2? I can see if I have both orders #1 and #2 'in progress' and I want to re-group the items, so I can complete one of the orders.
From: 20060524 on 14 Apr 2010 17:26 I appreciate your response, Gert-Jan! The UDF implements the two business rules, so the following DML sequence is possible: declare @identity as int insert Orders select 'Title1' , 'InProgress' set @identity = @@identity insert Order_Details select 'Title' , @identity update Orders set Order_Status = 'Shipped' insert Orders select 'Title2' , 'InProgress' set @identity = @@identity update Order_Details set Orders_ID = @identity Crazy? Maybe, but it does work in the absence of assertions :) Humorously enough, I'm trying to solve this issue for a system which poorly enforced these business rules using stored procedures, which is why I'm going back to basic concepts such as constraints. Thanks for your input!
From: 20060524 on 14 Apr 2010 18:31 I appreciate your response, Joe! I may not have explained it very well, but I'm not concerned about inserts and updates to the Orders table. Rather, I'm concerned about controlling inserts and updates to the Order_Details table depending on the status of the order it is associated with in the Orders table. I would like to: 1. Allow inserts and updates to the Order_Details table as long as order has not shipped. 2. Reject inserts and updates to the Order_Details table if the order has shipped. 3. Allow a row in the Order_Details table to be associated with a different order that has not shipped (then we're back to step 1). 4. Repeat ad nauseam. As I wrote in response to Gert-Jan, the UDF allows for the above usage in the absence of assertions. Maybe I should not have used the Orders and Order_Details entities for my analogy, because this would indeed be a contradiction if Order_Details were physical items being shipped between a warehouse and customer. I should probably start a new thread, but let me forego the analogy and describe the business problem I am trying to solve. I have a temporal database implementation consisting of: 1. A normalized set of working tables which enforce data integrity for DML operations. 2. A log table for each working table that is populated by a trigger on the working table for inserts, updates, and deletes. 3. A Data_Dates table which allows a user to specify a point in time. 4. A view for each log table that returns the data based on the point in time specified by the user. The process for creating the log tables, triggers, and views are automated, so this implementation can be applied to any database I wish to add a temporal dimension to. What I'm trying to do is add a centralized approval mechanism, by creating a Changes table to store proposed changes. All the working tables would then be foreign-keyed to the Changes table so that DML operations on the working tables always require the entry of an associated row in the Changes table. I would like to: 1. Allow inserts and updates to the working tables as long as the associated row in the Changes table has a Status = 'Draft'. 2. Reject inserts and updates to the working tables if the associated row in the Changes table has a Status = 'Submitted for Approval'. 3. Allow rows in the working tables to be associated with a different row in the Changes table with a Status ='Draft if the row it is currently associated with has a Status = 'Approved' (then we're back to step 1). 4. Repeat ad nauseam. As of this moment, the UDF approach satisfies all of the above and is something that can be automated along with the creation of the log tables, triggers, and views, all the while being generally agnostic of the working tables. I would be more than happy to revisit this when SQL Server supports assertions, or if anyone can provide a better solution given my specific needs. Thanks for your input!
From: --CELKO-- on 15 Apr 2010 00:48 This is getting elaborate! A) We can still do this with updatable VIEWs on the order details CREATE VIEW Working_Order_Details (order_nbr, item_nbr, ..) AS SELECT D.order_nbr, D.item_nbr, .. FROM Order_Details AS D WHERE NOT EXISTS (SELECT * FROM Orders AS O WHERE D.order_nbr = O.order_nbr AND O.order_status = 'shipped') WITH CHECK OPTION; 1. Allow inserts and updates to the Order_Details table as long as order has not shipped. You never see shipped orders in this view, so this is done. 2. Reject inserts and updates to the Order_Details table if the order has shipped. You never see shipped orders in this view, so this is done. 3. Allow a row in the Order_Details table to be associated with a different order that has not shipped. CREATE PROCEDURE MoveItem (@source_order_nbr INTEGER, @source_order_nbr INTEGER, @dest_order_nbr INTEGER) AS UPDATE Working_Order_Details SET order_nbr = @dest_order_nbr WHERE order_nbr = @source_order_nbr AND item_number = @source_order_nbr; 2) Revised version -- probably a new thread is a good idea! Let me use a history table for price changes. The fact is that a price had duration. This is the nature of time and other continuum. So a basic history table looks like this in SQL/PSM CREATE TABLE PriceHistory (upc CHAR(13) NOT NULL -- industry standard REFERENCES Inventory(upc), price_prev_date DATE NOT NULL, price_start_date DATE DEFAULT CURRENT_DATE NOT NULL, price_end_date DATE, -- null means current price CHECK(price_start_date < price_end_date), CHECK (price_start_date = price_prev_date + INTERVAL 1 DAY), -- prevents gaps PRIMARY KEY (upc, price_start_date), item_price DECIMAL (12,4) NOT NULL CHECK (item_price > 0.0000), etc.); You use a BETWEEN predicate to get the appropriate price. You can enforce the "one null per item" with a trigger but techically this should work: CHECK (COUNT(*) OVER (PARTITION BY upc) = COUNT(price_end_date) OVER (PARTITION BY upc) +1) SELECT .. FROM PriceHistory AS H, Orders AS O WHERE O.sales_date BETWEEN H.price_start_date AND COALESCE (price_end_date, CURRENT_DATE); It is also a good idea to have a VIEW with the current data: CREATE VIEW CurrentPrices (..) AS SELECT .. FROM PriceHistory WHERE price_end_date IS NULL; Now your only problem is to write a stored procedure that will update the table and insert a new row. You can do this with a single MERGE statement, or with a short block of SQL/PSM code: CREATE PROCEDURE UpdateItemPrice (IN in_upc CHAR(13), IN new_item_price DECIMAL (12,4)) LANGUAGE SQL BEGIN ATOMIC UPDATE PriceHistory SET price_end_date = CURRENT_DATE WHERE upc = in_upc; INSERT INTO PriceHistory (upc, price_prev_date, price_start_date, price_end_date, item_price) VALUES (in_upc, CURRENT_DATE, CURRENT_DATE + INTERVAL '1' DAY, NULL, new_item_price); END; This will make the price change go into effect tomorrow. There is a common kludge to repair a failure to design a history table properly that you can put in a VIEW if you are not able to set things right. Assume that every day we take a short inventory and put it in a journal. The journal is a clip board paper form that has one line per item per day, perhaps with gaps in the data. We want to get this into the proper format, namely periods shown with a (start_date, end_date) pair for durations where each item had the same quantity on hand. This is due to Alejandro Mesa CREATE TABLE InventoryJournal (journal_date DATETIME NOT NULL, item_id CHAR(2) NOT NULL, PRIMARY KEY (journal_date, item_id), onhand_qty INTEGER NOT NULL); WITH ItemGroups AS (SELECT journal_date, item_id, onhand_qty, ROW_NUMBER() OVER(ORDER BY item_id, journal_date, onhand_qty) - ROW_NUMBER() OVER(PARTITION BY item_id, onhand_qty ORDER BY journal_date) AS item_grp_nbr FROM Journal), QtyByDateRanges AS (SELECT MIN(journal_date) AS start_date, MAX(journal_date) AS end_date, item_id, onhand_qty FROM ItemGroups GROUP BY item_id, onhand_qty, item_grp_nbr) SELECT start_date, end_date, item_id, onhand_qty FROM QtyByDateRanges; This might be easier to see with some data and intermediate steps INSERT INTO InventoryJournal VALUES('2007-01-01', 'AA', 100),('2007-01-01', 'BB', 200), ('2007-01-02', 'AA', 100),('2007-01-02', 'BB', 200), ('2007-01-03', 'AA', 100),('2007-01-03', 'BB', 300); start_date end_date item_id onhand_qty ========================================== '2007-01-01' '2007-01-03' 'AA' 100 '2007-01-01' '2007-01-02' 'BB' 200 '2007-01-03' '2007-01-03' 'BB' 300 Now, download the Rick Snodgrass book on Temporal Queries in SQL from the University of Arizona website (it is free). Tom Johnston's book on temporal data in SQL just came out and you might want a copy.
From: Tony Rogerson on 15 Apr 2010 01:05
> and so forth before we worry about other fixes. You do know that > IDENTITY is not ever a key by definition, that there is no such In this is instance it is being used as an artificial key because there isn't a real key - the term relational key is wrong in the way you are using it - a surrogate, artificial or natural key are all relational keys in the eyes of the model. It is fine when used as a surrogate key - which we all know you completely don't get what the definition of a surrogate key is, so here let me elaborate for you: From my blog: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2010/03/13/database-design-surrogate-keys-part-1-of-many-rules-for-surrogate-keys-e-f-codd-and-c-j-date-research-and-problems-they-solve.aspx Surrogate keys There are two factions in the world of Database Design that enact wars against the use of surrogate keys, there is a lot of misunderstanding around what a surrogate key actually is and the need for them. This article intends to draw a line under that battle and defend once and for all the use of surrogate keys, to do that we need to look at what problems they solve, the theory and rationale behind them and how to implement them in your chosen product. I'm going to cover a lot: •Rules for a Surrogate key •Problems they solve •The "Theory" - we will look at the research done primarily by E. F. Codd and C. J. Date but will also look at other authors and their take on the subject. •How to implement a surrogate key ◦o Database Design (includes a discussion on Normalisation and when you'd introduce a surrogate) ◦o Techniques for generating the surrogate key value (IDENTITY, NEWID(), NEWSEQUENTIALID, Hashing and do it yourself) ◦o Considerations around Referential Integrity ◦o Considerations around External Verification of the surrogate key value and why even the notion is invalid ◦o Locking Considerations - effect on concurrency •Example Tricks using the surrogate key for example Table Partitioning Rules for a Surrogate Key 1.They must never be related (directly or indirectly) to the physical hardware or storage - to use the Relational phrase - they are not tuple-id's (row/record locators). 2.Once a value has been used - it must never be reused by something else - basically once the value "5" has been used, then the surrogate key generator must never generate the number "5" again even if the original row has been deleted. 3.They are read only - once generated that value must never be changed. 4.The value must be atomic, that means no sets - just a constant for instance "5". 5.The value should never be exposed outside the limits of the application as a whole (the database is not a black box) - I explain more later but basically the User of your application for instance the Call Centre staff entering client details, the external machine calling your web service should never see the surrogate key value (they should see one of the candidate key values instead) - that is because the surrogate key cannot be used for verification purposes but the candidate key can. 6.A surrogate key can be the sole key for a table in a situation where there are no natural occurring candidate keys for example a Message Board. When used in this scenario arguably it's no longer a surrogate key but an artificial one but I would still follow rules 1 – 4. Having said that you need to consider verification for instance if the rows were re-inserted and the artificial key re-generated then the same rows (entities) should get the same artificial key value; there are hashing solutions to provide that facility but the obvious problem of duplicate rows come into play – that discussion is outside the scope of surrogate keys. The problem they solve Surrogate keys form an abstraction from the candidate keys on a table, it can be argued that creating a surrogate key on a table where a single column candidate key exists makes no sense and that their real use is for making composite keys simpler to use throughout the application - a fair point but there is more to consider. I'd say the real benefit of a surrogate key is in the abstraction it provides thus removing concurrency problems caused should the candidate key values you've used in foreign key clauses in related tables change - a surrogate key value once generated can never change and its value can never be reused even if the rows have been deleted (Codd and Date - see later). I've summarised below some of the main points I see surrogates provide us: •Joins between tables are done on a single expression unlike say a join using a composite key for instance instead of writing: ON Manufacturers.Maker = Registrations.Maker AND Manufacturers.Model = Registrations.Model you would write ON Manufacturers.ManufacturerID = Registrations.ManufacturerID. •Reduce concurrency problems caused when values of existing candidate key being used in foreign key references change. •Removes any possibility of inconsistencies in data type, length and collation creeping into the schema, for instance in the Manufacturers table the column Maker is defined as varchar(250) with a collation of Latin CI, in the Registrations table the column Maker is defined varchar(200) with a collation of Spanish. •Join performance, this is really debatable; with today's hardware and improved code efficiency within the Optimiser and Storage Engine I don't think this really matters (that much) anymore. •The surrogate key can provide a method for partitioning of your data - I'll show that later with an example of using partitioned views (as opposed to the Enterprise Edition feature Partitioning). The effect of partitioning your data helps reduce locking contention both within the storage engine (allocating new extents) and for ACID in terms of consistency - less blocking! The Theory E. F. Codd and C. J. Date are seen as the parents of the Relational Model, their initial and continued research are the underpinnings of Database Design and why Relational Databases exist. Codd on Surrogate Keys In the paper ACM Transactions on Database Systems, Vol. 4, No. 4, December 1979 pages 409 - 410 Codd introduces the concept of a surrogate key, the key points of his text are as follows: {snip} "There are three difficulties in employing user-controlled keys as permanent surrogates for entities. (1) The actual values of user-controlled keys are determined by users and must therefore be subject to change by them (e.g., if two companies merge, the two employee databases might be combined with the result that some or all of the serial numbers might be changed). (2) Two relations may have user-controlled keys defined on distinct domains (e.g., one uses social security, while the other uses employee serial number) and yet the entities denoted are the same. (3) It may be necessary to carry information about an entity either before it has been assigned a user-controlled key value or after it has ceased to have one (e.g., an applicant for a job and a retiree). These difficulties have the important consequence that an equi-join on common key values may not yield the same result as a join on common entities. A solution-proposed in part in [4] and more fully in [14]-is to introduce entity domains which contain system-assigned surrogates. Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them. {snip} Users will often need entity identifiers (such as part serial numbers) that are totally under their control, although they are no longer compelled to invent a user-controlled key if they do not wish to. They will have to remember, however, that it is now the surrogate that is the primary key and provides truly permanent identification of each entity. The capability of making equi-joins on surrogates implies that users see the headings of such columns but not the specific values in those columns." What Codd means by "User of the database" is open to debate in respect of whether the surrogate value should be displayed to them, C J Date makes the point that if the value is not displayed it breaks Codd's own Information Principal. I believe the User of the database is not the Developer or Database Administrators - they are not users of the Database - they build systems that have users a user being a person using the application or a machine using a web service for instance. His point about "The capability of making equi-joins on surrogates implies that users see the headings of such columns" means that we do see the columns - we'd have to in order to join our tables together! Ok, some products do have this ability - they internally have their own surrogate keys that are not exposed and are used in joins under the covers, but that is not a requirement by Codd just a nice feature of those products - in SQL Server land we unfortunately do not have that feature. Date on Surrogate Keys From C. J. Date's book "An Introduction to Database Systems, C J Date, 8th Edition": Page 434, reference 14.21 - he refers to P. Hall, J. Owlett and S J P Todd "Relations and Entities" and says: "Surrogate Keys are keys in the usual Relational sense but have the following specific properties: They always involve exactly one attribute. Their values serve solely as surrogate (hence the name) for the entities they stand for ...... When a new entity is inserted into the database, it is given a surrogate key value that has never been used before and will never be used again, even if the entity in question is subsequently deleted. Ideally surrogate keys value would be system-generated. Surrogates must not be concealed from the user because of the Information Principal - tuple ID's should be however." He finishes: "In a nutshell: Surrogates are a model concept; tuple IDs are an implementation concept" C J Date makes the distinction between a tuple ID (basically a row ID) and a surrogate key; the row ID changes as rows get removed from the table - that should never be exposed from the database even for developers or Database Administrators, the SQL language has the ROW_NUMBER() function to provide us with a numbering system for a result. "--CELKO--" <jcelko212(a)earthlink.net> wrote in message news:f41a7dd9-babe-478b-ae44-1c99e2c4944e(a)u21g2000yqc.googlegroups.com... > We might want to make the schema relational, follow ISO-11179 rules > and so forth before we worry about other fixes. You do know that > IDENTITY is not ever a key by definition, that there is no such > creature as a magical universal “id” in RDBMS, etc.? Did you really > need to use NCHAR() for Chinese or other non-Latin codes? The data > element orders_id should be singular. Etc. > > The standard design pattern for orders and their details is like this: > > CREATE TABLE Orders > (order_nbr INTEGER NOT NULL PRIMARY KEY, > something_title NCHAR(10) NOT NULL, > order_status CHAR(10) DEFAULT 'in progress' NOT NULL > CHECK (order_status IN ('in progress', 'shipped')) > ); > > > CREATE TABLE Order_Details > (order_nbr INTEGER NOT NULL > REFERENCES Orders(order_nbr) > ON UPDATE CASCADE > ON DELETE CASCADE, > item_nbr INTEGER NOT NULL > REFERENCES Inventory(item_nbr) > ON UPDATE CASCADE > ON DELETE CASCADE, > PRIMARY KEY (order_nbr, item_nbr) > item_status CHAR(10) DEFAULT 'in progress' NOT NULL > CHECK (item_status IN ('in progress', 'shipped')) > ..); > > Notice the use of a **REAL** relational key instead of a fake pointer > chain, the use of DRI, etc. SQL Server has supported Standard syntax > for years, so you should have written: > > INSERT INTO Orders (..) VALUES (..); > > thus avoiding dialect. Your whole approach is procedural, so you did > not think of a declarative design and immediately jumped to > proprietary, procedural UDFs. > > Can you give a clear statement of what business rule you want to > enforce? Maybe something like: An order as a whole is 'in progress' > if any item on order is 'in progress' or something else? > |