From: Tony Rogerson on
Smoke and mirrors --CELKO--

It's well documented the nested sets model does not scale on SQL Server.

You also mention you do not need to use UPDATE to remove sub-trees yet in
your example you use errr.. UPDATE!

The other problem with your implementation of nested sets is that there is
no history - you have to copy the entire data to form history - for each
change!!! Horrendous.

--ROGGIE--

"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:86d96022-b12c-47d2-85f7-195513f026a1(a)u9g2000prm.googlegroups.com...
>>> The nested sets model does not scale compared to other solutions.<<
>
> Next month I will be consulting in Chicago for a company that makes a
> product which extracts data from SAP into SQL Server for reporting.
> The emails with the negations specifically thanked me for the Nested
> Sets model. Several years ago, I re-designed a portal control package
> with the Nested Sets model. The schema was reduced from about sixty
> tables to six. We tested it on tree with seven levels and about 100K
> nodes. The performance improvement 2-3 orders of magnitude. In
> fairness, the old code had gotten a bit out of hand. People had added
> data integrity checking in triggers and code and we were able to do it
> in the DDL instead.
>
>>> To remove nodes not only do you have to delete the node rows but you
>>> also have to update the left/right columns to "fix" the order that the
>>> nested sets model relies on (hardly relational!). <<
>
> No, you don't have to updated (lft, rgt) values if all you want to use
> is the containment property. If you want to use the "algebraic"
> property to get the size of subtrees, then you close gaps and that is
> easily done.
>
> The nice part is that the tree structure is in a table with very short
> rows (two integers and the foreign key to the nodes are, say, 16 to 20
> bytes per row), so a lot of them fit into a data page. Use a
> clustered index and most operations use a simple table scan that
> starts at lft value and stops at the matching rgt.
>
> But why be abstract? Bill, here is the code:
>
> /* remove subtree rooted at given node */
> CREATE PROCEDURE DeleteSubtree (@my_node_id <type>)
> AS
> BEGIN
> -- remove subtree
> DELETE FROM NestTree
> WHERE node_id
> IN (SELECT Subtree.node_id
> FROM NestTree AS Root,
> NestTree AS Subtree
> WHERE Root.node_id = @my_node_id
> AND Root.lft BETWEEN Subtree.lft
> AND Subtree.rgt);
> -- close gaps option
> UPDATE NestTree
> SET lft = (SELECT COUNT(*)
> FROM LftRgt.seq_nbr
> WHERE seq_nbr <= lft),
> rgt = (SELECT COUNT(*)
> FROM LftRgt.seq_nbr
> WHERE seq_nbr <= rgt);
> END;
>
> I am assuming you have a view defined this way, but these days, you
> might use a CTE on the UPDATE statement instead.
>
> CREATE VIEW LftRgt (seq_nbr)
> AS SELECT lft FROM NestTree
> UNION ALL
> SELECT rgt FROM NestTree;
>
> I left a WHERE clause off of the UPDATE statement so that the entire
> tree will be compressed. If you are sure that this procedure is
> working on a tree without any prior gaps in it, then add it and get
> another boost in performance.
>
>>> Concurrency problems are abound with a nested sets implementation that
>>> has even a slightly volatile hierarchy. <<
>
> One of the projects I know of that was done with a Nested Sets model
> was a newsgroup, like this one. They like the performance since it was
> better than their previous Adjacency List model and made no mention of
> concurrency problems.
>
> Since subtrees are contiguous in physical storage with a clustered
> index, page locking works fairly well. The trick is to use a higher
> percentage of free space when you have a volatile hierarchy. But that
> applies to other models as well.