From: Tony Rogerson on 22 Apr 2010 18:20 Smoke and mirrors --CELKO-- It's well documented the nested sets model does not scale on SQL Server. You also mention you do not need to use UPDATE to remove sub-trees yet in your example you use errr.. UPDATE! The other problem with your implementation of nested sets is that there is no history - you have to copy the entire data to form history - for each change!!! Horrendous. --ROGGIE-- "--CELKO--" <jcelko212(a)earthlink.net> wrote in message news:86d96022-b12c-47d2-85f7-195513f026a1(a)u9g2000prm.googlegroups.com... >>> The nested sets model does not scale compared to other solutions.<< > > Next month I will be consulting in Chicago for a company that makes a > product which extracts data from SAP into SQL Server for reporting. > The emails with the negations specifically thanked me for the Nested > Sets model. Several years ago, I re-designed a portal control package > with the Nested Sets model. The schema was reduced from about sixty > tables to six. We tested it on tree with seven levels and about 100K > nodes. The performance improvement 2-3 orders of magnitude. In > fairness, the old code had gotten a bit out of hand. People had added > data integrity checking in triggers and code and we were able to do it > in the DDL instead. > >>> To remove nodes not only do you have to delete the node rows but you >>> also have to update the left/right columns to "fix" the order that the >>> nested sets model relies on (hardly relational!). << > > No, you don't have to updated (lft, rgt) values if all you want to use > is the containment property. If you want to use the "algebraic" > property to get the size of subtrees, then you close gaps and that is > easily done. > > The nice part is that the tree structure is in a table with very short > rows (two integers and the foreign key to the nodes are, say, 16 to 20 > bytes per row), so a lot of them fit into a data page. Use a > clustered index and most operations use a simple table scan that > starts at lft value and stops at the matching rgt. > > But why be abstract? Bill, here is the code: > > /* remove subtree rooted at given node */ > CREATE PROCEDURE DeleteSubtree (@my_node_id <type>) > AS > BEGIN > -- remove subtree > DELETE FROM NestTree > WHERE node_id > IN (SELECT Subtree.node_id > FROM NestTree AS Root, > NestTree AS Subtree > WHERE Root.node_id = @my_node_id > AND Root.lft BETWEEN Subtree.lft > AND Subtree.rgt); > -- close gaps option > UPDATE NestTree > SET lft = (SELECT COUNT(*) > FROM LftRgt.seq_nbr > WHERE seq_nbr <= lft), > rgt = (SELECT COUNT(*) > FROM LftRgt.seq_nbr > WHERE seq_nbr <= rgt); > END; > > I am assuming you have a view defined this way, but these days, you > might use a CTE on the UPDATE statement instead. > > CREATE VIEW LftRgt (seq_nbr) > AS SELECT lft FROM NestTree > UNION ALL > SELECT rgt FROM NestTree; > > I left a WHERE clause off of the UPDATE statement so that the entire > tree will be compressed. If you are sure that this procedure is > working on a tree without any prior gaps in it, then add it and get > another boost in performance. > >>> Concurrency problems are abound with a nested sets implementation that >>> has even a slightly volatile hierarchy. << > > One of the projects I know of that was done with a Nested Sets model > was a newsgroup, like this one. They like the performance since it was > better than their previous Adjacency List model and made no mention of > concurrency problems. > > Since subtrees are contiguous in physical storage with a clustered > index, page locking works fairly well. The trick is to use a higher > percentage of free space when you have a volatile hierarchy. But that > applies to other models as well. |