From: Greg Heath on 22 Apr 2010 18:23 On Apr 22, 3:13 pm, Doug Schwarz <s...(a)sig.for.address.edu> wrote: > Greg Heath wrote: > > On Apr 22, 12:13 pm, Doug Schwarz <s...(a)sig.for.address.edu> wrote: > >> Greg Heath wrote: > >> > On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote: > >> >> If i have a n unknown points and n unknown clusters, is there any algorithm which will try to group the points into the most appropriate groups. > > >> >> I could use K-Means clustering algorithm but with that you have to give the predicted number of groups. For my problem, I do not know how many groups there will be - it changes each with each new run of the algorithm. > > >> > For k-means you give the maximum possible number of groups. > >> > Some may be empty. > > >> > Greg > > >> It is possible for kmeans to return empty clusters, but it is not the > >> normal behavior. Try this: > > >> s = randn(10000,2); > >> idx = kmeans(s,2); > >> gplotmatrix(s,[],idx) > > >> and you will see that kmeans has split the single cluster into two, as > >> requested. > > > ... and I have used MATLABs kmeans with empty cluster results. > > And I agreed that that can happen, but you implied that one could simply > enter the maximum number of clusters expected and kmeans will "do the > right thing" and return the "true" number of clusters. That is not > usually true and I illustrated it with a trivial example. > > >> As to the OP's original question: yes, but it's not an easy problem. > >> I'm no expert, but I have recently implemented something called > >> Superparamagnetic Clustering. It's too complicated to explain here, > > > Is it anything like the popular SuperDuperparaelectric algorithm? > > Are you trying to discredit my opinion by poking fun at the name? No, just curious as to what your reply would be. Excellent answer. Greg
From: Greg Heath on 22 Apr 2010 18:44 On Apr 22, 10:44 am, Greg Heath <he...(a)alumni.brown.edu> wrote: > On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote: > > > If i have a n unknown points and n unknown clusters, is there any algorithm which will try to group the points into the most appropriate groups. > > > I could use K-Means clustering algorithm but with that you have to give the predicted number of groups. For my problem, I do not know how many groups there will be - it changes each with each new run of the algorithm. > > For k-means you give the maximum possible number of groups. > Some may be empty. > > Greg You may want to use another kind of clustering algorithm I searched the group archives using number of clusters and found this http://groups.google.com/group/comp.soft-sys.matlab/browse_thread/thread/acf0842703ca338b/a7208630551ba4da?hl=en&lnk=gst&q=Number+of+clusters#a7208630551ba4da It was not an exhaustive search. However, you may wish to search the net yourself. Hope this helps. Greg P.S. I cannot guarantee the effectiveness of the algorithm.
From: Ashish Uthama on 23 Apr 2010 07:54
On Thu, 22 Apr 2010 22:23:40 -0300, ImageAnalyst <imageanalyst(a)mailinator.com> wrote: > On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote: >> If i have a n unknown points and n unknown clusters, is there any >> algorithm which will try to group the points into the most appropriate >> groups. >> >> I could use K-Means clustering algorithm but with that you have to give >> the predicted number of groups. For my problem, I do not know how many >> groups there will be - it changes each with each new run of the >> algorithm. >> >> Thanks > > ----------------------------------------------------------- > Aardarsh: > The QT clustering method (http://en.wikipedia.org/wiki/ > Cluster_analysis#K-means_and_derivatives) is pretty simple and it > doesn't require specifying in advance how many clusters there should > be. > > "The algorithm is: > * The user chooses a maximum diameter for clusters. > * Build a candidate cluster for each point by including the > closest point, the next closest, and so on, until the diameter of the > cluster surpasses the threshold. > * Save the candidate cluster with the most points as the first > true cluster, and remove all points in the cluster from further > consideration. Must clarify what happens if more than 1 cluster has > the maximum number of points ? > * Recurse with the reduced set of points." > > > That Wikipedia page also has a pretty good discussion of cluster > analysis in general. > -ImageAnalyst http://www.mathworks.com/matlabcentral/fileexchange/26431-qt-clustering-using-euclidean-distance |