From: Greg Heath on
On Apr 22, 3:13 pm, Doug Schwarz <s...(a)sig.for.address.edu> wrote:
> Greg Heath wrote:
> > On Apr 22, 12:13 pm, Doug Schwarz <s...(a)sig.for.address.edu> wrote:
> >> Greg Heath wrote:
> >> > On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote:
> >> >> If i have a n unknown points and n unknown clusters, is there any algorithm which will try to group the points into the most appropriate groups.
>
> >> >> I could use K-Means clustering algorithm but with that you have to give the predicted number of groups. For my problem, I do not know how many groups there will be - it changes each with each new run of the algorithm.
>
> >> > For k-means you give the maximum possible number of groups.
> >> > Some may be empty.
>
> >> > Greg
>
> >> It is possible for kmeans to return empty clusters, but it is not the
> >> normal behavior.  Try this:
>
> >>    s = randn(10000,2);
> >>    idx = kmeans(s,2);
> >>    gplotmatrix(s,[],idx)
>
> >> and you will see that kmeans has split the single cluster into two, as
> >> requested.
>
> > ... and I have used MATLABs kmeans with empty cluster results.
>
> And I agreed that that can happen, but you implied that one could simply
> enter the maximum number of clusters expected and kmeans will "do the
> right thing" and return the "true" number of clusters.  That is not
> usually true and I illustrated it with a trivial example.
>
> >> As to the OP's original question: yes, but it's not an easy problem.
> >> I'm no expert, but I have recently implemented something called
> >> Superparamagnetic Clustering.  It's too complicated to explain here,
>
> > Is it anything like the popular SuperDuperparaelectric algorithm?
>
> Are you trying to discredit my opinion by poking fun at the name?  

No, just curious as to what your reply would be.

Excellent answer.

Greg
From: Greg Heath on
On Apr 22, 10:44 am, Greg Heath <he...(a)alumni.brown.edu> wrote:
> On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote:
>
> > If i have a n unknown points and n unknown clusters, is there any algorithm which will try to group the points into the most appropriate groups.
>
> > I could use K-Means clustering algorithm but with that you have to give the predicted number of groups. For my problem, I do not know how many groups there will be - it changes each with each new run of the algorithm.
>
> For k-means you give the maximum possible number of groups.
> Some may be empty.
>
> Greg

You may want to use another kind of clustering algorithm
I searched the group archives using

number of clusters

and found this

http://groups.google.com/group/comp.soft-sys.matlab/browse_thread/thread/acf0842703ca338b/a7208630551ba4da?hl=en&lnk=gst&q=Number+of+clusters#a7208630551ba4da

It was not an exhaustive search. However, you may wish
to search the net yourself.

Hope this helps.

Greg

P.S. I cannot guarantee the effectiveness of the algorithm.
From: Ashish Uthama on
On Thu, 22 Apr 2010 22:23:40 -0300, ImageAnalyst
<imageanalyst(a)mailinator.com> wrote:

> On Apr 22, 9:01 am, "Aadarsh " <imfromwa...(a)hotmail.com> wrote:
>> If i have a n unknown points and n unknown clusters, is there any
>> algorithm which will try to group the points into the most appropriate
>> groups.
>>
>> I could use K-Means clustering algorithm but with that you have to give
>> the predicted number of groups. For my problem, I do not know how many
>> groups there will be - it changes each with each new run of the
>> algorithm.
>>
>> Thanks
>
> -----------------------------------------------------------
> Aardarsh:
> The QT clustering method (http://en.wikipedia.org/wiki/
> Cluster_analysis#K-means_and_derivatives) is pretty simple and it
> doesn't require specifying in advance how many clusters there should
> be.
>
> "The algorithm is:
> * The user chooses a maximum diameter for clusters.
> * Build a candidate cluster for each point by including the
> closest point, the next closest, and so on, until the diameter of the
> cluster surpasses the threshold.
> * Save the candidate cluster with the most points as the first
> true cluster, and remove all points in the cluster from further
> consideration. Must clarify what happens if more than 1 cluster has
> the maximum number of points ?
> * Recurse with the reduced set of points."
>
>
> That Wikipedia page also has a pretty good discussion of cluster
> analysis in general.
> -ImageAnalyst

http://www.mathworks.com/matlabcentral/fileexchange/26431-qt-clustering-using-euclidean-distance