Parallel quicksort [Java Programming]

Prev: adding a new library .jar to your path (netbeans)
Next: Turning off JIT Optimisation

From: Tom Anderson on 16 May 2010 06:28

On Sat, 15 May 2010, Arne Vajh?j wrote:

> On 15-05-2010 19:47, markspace wrote:
>> Arne Vajh?j wrote:
>>> } else {
>>> ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1, tdepth);
>>> ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1, tdepth);
>>
>> Just a thought: Why spawn two threads here? You already have one thread
>> running (the current one). Spawn one thread (proabaly on the larger span
>> between pivot and start/end) and sort the other bit recursively in the
>> current thread. Then just wait on the other thread after you're done
>> with the recursive sort.
>
> It may be more efficient. But I like the symmetric approach of
> this code.

This would be an ideal use for the JSR-166y fork-join framework, which
lets you express the two branches symmetrically, but have the executed
without unnecessary thread creations. If you can't wait for java 7, get it
from Doug Lea:

http://gee.cs.oswego.edu/dl/concurrency-interest/index.html

>> However, I played around with this myself a while ago, and I'm not sure
>> the whole idea is a great one. Quicksorts are god-awful at sorting many
>> real world types of data. If a list is already sorted, then quicksort
>> degenerates into a bubble sort. That's N^2 performance. And pre-sorted
>> data is pretty common so you're likely to hit this issue often.
>
> Not quite correct.
>
> Quicksort degenerate into O(n^2) for sorted data *if* the first
> or last element is used as pivot.
>
> My code used the middle element.
>
> For the very same reason.
>
> That works super for sorted data.
>
> Ofcourse there still exists some orders that degenerate
> into O(n^2), but they are not very likely (unlike the already
> sorted order).

All correct. But for real-world data, even tweakedd quicksorts are
routinely beaten by adaptive sorts like Timsort. Of course, what's
relevant right here is that quicksort can be parallelised - i'm not aware
that Timsort can be, and i have no idea about any other adaptive sorts.

tom

--
eviscerated by obfuscation

From: Mike Amling on 16 May 2010 10:01

markspace wrote:
> Arne Vajh�j wrote:
>> } else {
>> ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1,
>> tdepth);
>> ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1,
>> tdepth);
>
>
> Just a thought: Why spawn two threads here? You already have one
> thread running (the current one). Spawn one thread (proabaly on the
> larger span between pivot and start/end) and sort the other bit
> recursively in the current thread. Then just wait on the other thread
> after you're done with the recursive sort.
>
>
> However, I played around with this myself a while ago, and I'm not sure
> the whole idea is a great one. Quicksorts are god-awful at sorting many
> real world types of data. If a list is already sorted, then quicksort
> degenerates into a bubble sort. That's N^2 performance. And pre-sorted
> data is pretty common so you're likely to hit this issue often.

While the original QuickSort used the first element as the pivot, an
improved variant swaps the first, last and middle elements until they're
in order, then uses the resulting middle element as the pivot. Those
swaps are over long distances, and hence are consistent with the spirit
of QuickSort. This improvement was published shortly after the original
QuickSort, back in the '60s. The improved algorithm is robust over lots
of original orders of the data, probably over all data that hasn't been
deliberately arranged to foil QS.

--Mike Amling

From: Mike Amling on 16 May 2010 10:15

Arne Vajh�j wrote:
> On 15-05-2010 11:35, Jon Harrop wrote:
>> I cannot find an implementation of parallel quicksort in Java. Does
>> anyone have one or know where I can get one?
>
> public static void tqsint_help(int n1, int n2, int[] ia, int depth,
> int tdepth) {
....
>
> if (depth >= tdepth) {
> if (n1 < r)
> tqsint_help(n1, r, ia, depth + 1, tdepth);
> if (l < n2)
> tqsint_help(l, n2, ia, depth + 1, tdepth);
> } else {
> ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1,
> tdepth);
> ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1,
> tdepth);
> ...

Rather than make the caller guess a good value for tdepth, I think
you'd be better off with
if (r-n1>WORTH_THREADING) {
...new TheadSortHelp(...
} else if (n1<r) {
tqsint_help(...
}
and similar for l, n2.
The partition sizes can vary, and you could find in some partitions
that n1>=r before depth>=tdepth.

--Mike Amling

From: Eric Sosman on 16 May 2010 10:44

On 5/16/2010 10:01 AM, Mike Amling wrote:
>
> While the original QuickSort used the first element as the pivot, an
> improved variant swaps the first, last and middle elements until they're
> in order, then uses the resulting middle element as the pivot. Those
> swaps are over long distances, and hence are consistent with the spirit
> of QuickSort. This improvement was published shortly after the original
> QuickSort, back in the '60s. The improved algorithm is robust over lots
> of original orders of the data, probably over all data that hasn't been
> deliberately arranged to foil QS.

Look up "Engineering a Sort Function" by Bentley and McIlroy
for a real-world data set that drove a widely-used real-world
qsort() implementation completely over the edge. While studying
the issue, Bentley and McIlroy learned

"In fact, among a dozen different Unix libraries we found
no qsort that could not easily be driven to quadratic
behavior [...]"

After a good deal of work, Bentley and McIlroy settled on a
threefold scheme, with the breakpoints determined emprically:

- For "small" arrays or sub-arrays, use the middle element

- For "medium" arrays, use the median of the first, middle,
and last elements

- For "large" arrays, choose nine evenly-spread elements,
take them in groups of three and find the three medians,
and use the median of those medians

They also lay a good deal of stress on the practical importance
of using the "Dutch National Flag" partitioning scheme. Equal keys,
they found, turn out to be quite common in real-world data sets.

--
Eric Sosman
esosman(a)ieee-dot-org.invalid

From: blmblm on 29 May 2010 15:11

In article <85ad3bFkk5U1(a)mid.individual.net>,
Mike Amling <mamling(a)rmcis.com> wrote:
> Arne Vajh�j wrote:
> > On 15-05-2010 11:35, Jon Harrop wrote:
> >> I cannot find an implementation of parallel quicksort in Java. Does
> >> anyone have one or know where I can get one?
> >
> > public static void tqsint_help(int n1, int n2, int[] ia, int depth,
> > int tdepth) {
> ...
> >
> > if (depth >= tdepth) {
> > if (n1 < r)
> > tqsint_help(n1, r, ia, depth + 1, tdepth);
> > if (l < n2)
> > tqsint_help(l, n2, ia, depth + 1, tdepth);
> > } else {
> > ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1,
> > tdepth);
> > ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1,
> > tdepth);
> > ...
>
> Rather than make the caller guess a good value for tdepth, I think
> you'd be better off with
> if (r-n1>WORTH_THREADING) {
> ...new TheadSortHelp(...
> } else if (n1<r) {
> tqsint_help(...
> }
> and similar for l, n2.
> The partition sizes can vary, and you could find in some partitions
> that n1>=r before depth>=tdepth.

(Okay, oldish thread, but -- yeah well. I've been playing lately with
various parallel sorts, and am at a good break point to comment here,
maybe.)

A hearty "indeed" on Mike's closing remark. A somewhat related
point is that limiting the number of threads to the number of
available processors/cores, as is somewhat standard when adding
multithreading for performance, may lead to very poor load balance
(and hence poor performance), since there's no guarantee even
for random data that quicksort partitioning produces two pieces
of roughly equal size. This can lead to startling results,
such as situations in which going from one thread to two yields
no benefit, but going from two threads to four does.

It does help a bit to take Mike's advice about only creating
new threads if the amount of work is "enough". I've had more
success, though, with setting up a thread pool (using the classes
in java.util.concurrent) and creating new tasks to be assigned to
the thread pool rather than starting new threads -- and again,
only doing so if the amount of work is past some threshold.
I'm still tinkering a bit with how to specify the threshold, but
preliminary results are that with this approach you get somewhat
more predictable behavior, in that increasing the number of
threads improves better performance in a fairly consistent way.

Just sayin', maybe. (And I'd post my code, but it's part of a
larger experiment and hence maybe more complicated than it needs
to be, and I'm too lazy to extract a postable version.)

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

First | Prev | Next | Last
Pages: 1 2 3
Prev: adding a new library .jar to your path (netbeans)
Next: Turning off JIT Optimisation