From: Strav on 22 May 2010 18:12 He. I'm somewhat informally reading lectures on algorithm and stumbling upon disjoint sets, I decided to try a quick and most probably dirty implementation in common lisp. (in a matter of getting acquainted to CL) - with union by depth and path compression. Any comment concerned with bettering this implementation is most welcomed but more importantly, a few questions arose when I was thinking about / writing the code, if any of you can shed some light on them, it would be most appreciated. (defstruct node (value nil) (parent nil) (root nil) (rank 0)) (defun makeSet (setName) (make-node :value setName)) (defun link (item) (cons item nil)) (defun addItem (node item) (let ((root-node (findRootNode node))) (incf (node-rank root-node)) (make-node :value item :parent (link root-node)))) (defun root-node-p (node) (null (node-parent node))) (defun findRootNode (node) (if (root-node-p node) node (if (null (node-root node)) (let ((root-node (findRootNode (car (node-parent node))))) (setf (node-root node) (link root-node)) root-node) (findRootNode (car (node-root node)))))) (defun unionSet (set-A set-B) (let ((root-A (findRootNode set-A)) (root-B (findRootNode set-B))) (flet ((mergeSets (root-from root-to) (setf (node-parent root-from) (link root-to)))) (if (> (node-rank root-A) (node-rank root-B)) (mergeSets root-B root-A) (mergeSets root-A root-B))))) Questions: 1. a) Is it a convention that the set's representative has a pointer to itself, if yes, can't my solution be acceptable too. b) In common lisp, How do you define a data structure having a pointer pointing to that same data structure without falling into a circular reference? 2. In the usual path compression scheme, why updating the pointer to the representative only upon a call to find? couldn't it be more efficient to do so when inserting a node as in my code above? (or is it that we insert nodes way more often than we search) 3. Given that disjoint sets are implemented using trees where the parent has no pointer to it's children, if one wishes to list all or some of the elements member of a set, wouldn't it be better to store the nodes in a vector. Now, if we store the set's representative at (elt v 0), the worst case for getting to a node will take O(n - 1) getting to the representative will take constant time, what's then the use of keeping a tree structure? Thanks (you guy be cool, this is my first post here ;)
From: vanekl on 23 May 2010 01:11 Strav wrote: > He. I'm somewhat informally reading lectures on algorithm and > stumbling upon disjoint sets, I decided to try a quick and most > probably dirty implementation in common lisp. (in a matter of getting > acquainted to CL) - with union by depth and path compression. > > Any comment concerned with bettering this implementation is most > welcomed but more importantly, a few questions arose when I was > thinking about / writing the code, if any of you can shed some light > on them, it would be most appreciated. > > (defstruct node (value nil) (parent nil) (root nil) (rank 0)) If you are implementing the classical disjoint-set algorithm implemented as a tree you don't need both parent and root slots. A parent slot is sufficient. > (defun makeSet (setName) > (make-node :value setName)) Please don't use camel-case when writing CL. CL traditionally separates words with hyphens. > (defun link (item) > (cons item nil)) This 'link' function isn't needed. > (defun addItem (node item) > (let ((root-node (findRootNode node))) > (incf (node-rank root-node)) > (make-node :value item :parent (link root-node)))) This last line could have been written, (make-node :value item :parent root-node)))) > (defun root-node-p (node) > (null (node-parent node))) > > (defun findRootNode (node) > (if (root-node-p node) > node > (if (null (node-root node)) > (let ((root-node (findRootNode (car (node-parent node))))) The 'car' isn't needed when 'link' isn't used. > (setf (node-root node) (link root-node)) The 'link' function isn't needed. > root-node) > (findRootNode (car (node-root node)))))) > > (defun unionSet (set-A set-B) > (let ((root-A (findRootNode set-A)) > (root-B (findRootNode set-B))) > (flet ((mergeSets (root-from root-to) > (setf (node-parent root-from) (link root-to)))) 'link' not needed here, either. > (if (> (node-rank root-A) (node-rank root-B)) > (mergeSets root-B root-A) > (mergeSets root-A root-B))))) > > > Questions: > > 1. > > a) Is it a convention that the set's representative has a pointer to > itself, if yes, can't my solution be acceptable too. The way you implemented it (using nil) is more common than a self-referential pointer, but either could be used. > b) In common lisp, How do you define a data structure having a pointer > pointing to that same data structure without falling into a circular > reference? I show an example of a self-referential pointer in code below. You test with the "eql" function to make sure you haven't circled back. > 2. > In the usual path compression scheme, why updating the pointer to the > representative only upon a call to find? Because it's usually more efficient to delay processing until it becomes absolutely necessary to perform the computation. > couldn't it be more efficient to do so when inserting a node as in my > code above? Your algorithm is slightly less efficient because you perform find-root-node sooner than when it is absolutely needed and thus you may be performing more work than some algorithms require. > (or is it that we insert nodes way more often than we > search) That depends on the algorithm that this data structure is used with. > 3. > Given that disjoint sets are implemented using trees where the parent > has no pointer to it's children, if one wishes to list all or some > of the elements member of a set, wouldn't it be better to store the > nodes in a vector. The classic vector doesn't grow. Unless you know the maximum size of the vector you require, a data structure that can adjust is preferable. Some vectors are adjustable but they are harder to algorithmically analyze, which I suspect may be one reason why vectors are typically not used when pedagogically discussing this data structure. Also, this disjoint data structure has been proven to be extremely efficient when used with some graph algorithms (where your additional requirements are not needed). > Now, if we store the set's representative at (elt v > 0), the worst case for getting to a node will take O(n - 1) getting Typically this is written O(n). > to the representative will take constant time, what's then the use of > keeping a tree structure? Easy to analyze; easy to grow; much easier to dynamically adjust lists at run-time than vectors. Full-set traversal is not a requirement in many graph algorithms. > > Thanks > > (you guy be cool, this is my first post here ;) We be cool. Here's an alternate implementation that allows for traversing the elements in the set. This uses a linked-list implementation instead of trees but is nearly as fast. One of the methods requires the "iterate" package. (defstruct node (value) (next) (root) (rank 0)) (defun make-set (set-item) (make-node :value set-item)) (defun add-item (node item) (let* ((root (find-root-node node)) (new-node (make-node :value item :root root :next (node-next root)))) (incf (node-rank root)) (setf (node-next root) new-node))) (defun last-node (node) (if (null (node-next node)) node (last-node (node-next node)))) (defun list-set (node) (format t "~a " (node-value node)) (if (node-next node) (list-set (node-next node)))) (defun root-node-p (node) (null (node-root node))) (defmethod find-root-node ((node node)) (if (root-node-p node) node (setf (node-root node) (find-root-node (node-root node))))) (defmethod find-node ((sym symbol) (node node)) (if (equalp sym (node-value node)) node (iter (for next-node initially (node-next node) then (node-next next-node)) (when (or (null next-node) (eql next-node node)) (leave)) (finding next-node such-that (equalp sym (node-value next-node)))))) (defun union-set (set-A set-B) (let ((root-A (find-root-node set-A)) (root-B (find-root-node set-B))) (flet ((merge-sets (root-from root-to) (incf (node-rank root-to) (node-rank root-from)) (setf (node-root root-from) root-to (node-next (last-node root-to)) root-from))) (if (> (node-rank root-A) (node-rank root-B)) (merge-sets root-B root-A) (merge-sets root-A root-B))))) ;; This example comes out of Cormen, Leiserson, & Rivest, Intro. to Algorithms, Chap. 22. (let ((c (make-set 'c)) (f (make-set 'f))) ;; in cases where there is a loop, stop from printing in circles. (setf *print-circle* t *print-length* 256 *print-level* 2) ;; you can make a node self-referential, but it adds no value to the algo. (setf (node-root c) c) ;; and test for it with 'eql' (format t "self-referential root: ~s~%" (eql (node-root c) c)) (setf (node-root c) nil) ;; reset back to normal (add-item c 'h) (add-item c 'e) (add-item c 'b) (add-item f 'g) (add-item f 'd) (union-set c f) (format t "~%LIST SET F~%") (list-set f) (format t "~%LIST SET F/ROOT~%") (list-set (find-root-node f)) (format t "~%LIST SET C~%") (list-set (find-root-node c)) (terpri) (format t "~%find-node 'f: ~a~%" (find-node 'f f)) (format t "~%find-node 'e: ~a~%" (find-node 'e c))) OUTPUT: self-referential root: T LIST SET F F D G LIST SET F/ROOT C B E H F D G LIST SET C C B E H F D G find-node 'f: #1=#S(NODE :VALUE F :NEXT #S(NODE :VALUE D :NEXT # :ROOT #1# :RANK 0) :ROOT #S(NODE :VALUE C :NEXT # :ROOT NIL :RANK 5) :RANK 2) find-node 'e: #S(NODE :VALUE E :NEXT #S(NODE :VALUE H :NEXT # :ROOT #1=# :RANK 0) :ROOT #1# :RANK 0) If you want to see another cool way to handle sets, check out, http://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html This method uses dual vectors instead of trees or lists.
From: Strav on 23 May 2010 12:32 He. Thanks for the quality of your comments on either the code or my questions. I now know that structs store values by reference and that consing isn't needed. I should have tested that. As for my question, it's pretty much what I suspected. Btw, this sparse array scheme you linked at the end is indeed a nice trick.
From: Thomas A. Russ on 24 May 2010 14:48 Strav <essorcreations(a)gmail.com> writes: > b) In common lisp, How do you define a data structure having a pointer > pointing to that same data structure without falling into a circular > reference? Normally you would just have it point to itself. Circular data structures are not, in general, a problem in Common Lisp. If you end up printing out circular items, though, it is a good idea to make sure you have the printer variable *PRINT-CIRCLE* set to T, so that the reader will print things properly. Consider: (setq *print-circle* t) (defstruct circular self (id (gentemp))) (make-circular) => #S(CIRCULAR :SELF NIL :ID T0) (let ((c (make-circular))) (setf (circular-self c) c) c) => #1=#S(CIRCULAR :SELF #1# :ID T1) #1=#S(circular :self #1#) => #1=#S(CIRCULAR :SELF #1# :ID T2) #1=#S(circular :self #1# :id MY-ID) => #1=#S(circular :self #1# :id MY-ID) -- Thomas A. Russ, USC/Information Sciences Institute
|
Pages: 1 Prev: Number types in Lisp Next: Source Tidying via Pretty Printing |