From: Hans-Friedrich on
Hi there,

perhaps, someone can sort me out?

Assume you have a set of J objects, and you want to generate all possible (permuted) subsets of size j = 2, ..., J-1 selected from the set of J objects (the subsets should at least contain 2 objects, and the set containing all J objects is trivial) --- I used the "permpos.m" function (downloadable at http://www.mathworks.com/matlabcentral/fileexchange/11216-permpos). For example, for j = 2, we obtain

1 1 0 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0 0 0
...
0 0 0 0 0 0 0 1 0 1
0 0 0 0 0 0 0 0 1 1

(where 0 = object is disregarded; 1 = object is in the subset), with a total of 45 different (permuted) subsets for j=2.

Now, for J=10, the number of all possible (permuted) subsets for j = 2, ..., J-1 equals 1012, which can be collected into a binary matrix, say M (i.e., each row represents a specific permuted subset).

On each specific subset (i.e., row of M) the same operations are performed (mostly vectorized --- hence, number of 1's in a given row of M should not --- and, as it looks, does not --- make a difference). MATLAB processed all sets in M within the proverbial "blink of an eye".

Larger J (e.g., 40 plus) cause a memory problem. So, I decided to "chop up" the problem and loop the generation of subsets such that for each j a specific M is generated (followed by said operations performed on each row of M).

To my utmost surprise, when I compared the CPU times for the two versions of the algorithm for the same J=10, the "chopped-up" version took almost 20 minutes (multiply/divide by X, depending on the machine you're using). I'm absolutely clueless about this immense increase in CPU time. Any suggestions, hints, etc. would be greatly appreciated!

Cheers,

Hans-Friedrich Koehn
From: Steven Lord on

"Hans-Friedrich " <koehnh(a)REMOVETHISmissouri.edu> wrote in message
news:hul5vk$qlc$1(a)fred.mathworks.com...
> Hi there,
>
> perhaps, someone can sort me out?
>
> Assume you have a set of J objects, and you want to generate all possible
> (permuted) subsets of size j = 2, ..., J-1 selected from the set of J
> objects (the subsets should at least contain 2 objects, and the set
> containing all J objects is trivial) --- I used the "permpos.m" function
> (downloadable at
> http://www.mathworks.com/matlabcentral/fileexchange/11216-permpos). For
> example, for j = 2, we obtain
>
> 1 1 0 0 0 0 0 0 0 0
> 1 0 1 0 0 0 0 0 0 0
> ...
> 0 0 0 0 0 0 0 1 0 1
> 0 0 0 0 0 0 0 0 1 1
>
> (where 0 = object is disregarded; 1 = object is in the subset), with a
> total of 45 different (permuted) subsets for j=2.
>
> Now, for J=10, the number of all possible (permuted) subsets for j = 2,
> ..., J-1 equals 1012, which can be collected into a binary matrix, say M
> (i.e., each row represents a specific permuted subset).
> On each specific subset (i.e., row of M) the same operations are performed
> (mostly vectorized --- hence, number of 1's in a given row of M should
> not --- and, as it looks, does not --- make a difference). MATLAB
> processed all sets in M within the proverbial "blink of an eye".
> Larger J (e.g., 40 plus) cause a memory problem. So, I decided to "chop
> up" the problem and loop the generation of subsets such that for each j a
> specific M is generated (followed by said operations performed on each row
> of M).

There are 2^J subsets of a set with J elements. As such, you could
represent each subset as a binary number between 0 and (2^J)-1 inclusive.
Of those 2^J subsets, 1 contains 0 elements, J contain 1 element, and 1
contains J elements. Since you want to exclude the 0-element, 1-element,
and J-element subsets the number of subsets you want to work with in the
case of J = 40 is 2^40-40-2 or about 1e12 subsets.

Processing each of those subsets at a rate of 100 subsets per second would
require just under 350 years.

Unless you have a very large cluster of machines at your disposal or you can
process the subsets at a much faster rate than 100 per second, I would
rethink this approach to solving the problem for which you're using this
subset approach as a solution technique.

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on
http://www.mathworks.com