From: Tom Anderson on 16 May 2010 06:28 On Sat, 15 May 2010, Arne Vajh?j wrote: > On 15-05-2010 19:47, markspace wrote: >> Arne Vajh?j wrote: >>> } else { >>> ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1, tdepth); >>> ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1, tdepth); >> >> Just a thought: Why spawn two threads here? You already have one thread >> running (the current one). Spawn one thread (proabaly on the larger span >> between pivot and start/end) and sort the other bit recursively in the >> current thread. Then just wait on the other thread after you're done >> with the recursive sort. > > It may be more efficient. But I like the symmetric approach of > this code. This would be an ideal use for the JSR-166y fork-join framework, which lets you express the two branches symmetrically, but have the executed without unnecessary thread creations. If you can't wait for java 7, get it from Doug Lea: http://gee.cs.oswego.edu/dl/concurrency-interest/index.html >> However, I played around with this myself a while ago, and I'm not sure >> the whole idea is a great one. Quicksorts are god-awful at sorting many >> real world types of data. If a list is already sorted, then quicksort >> degenerates into a bubble sort. That's N^2 performance. And pre-sorted >> data is pretty common so you're likely to hit this issue often. > > Not quite correct. > > Quicksort degenerate into O(n^2) for sorted data *if* the first > or last element is used as pivot. > > My code used the middle element. > > For the very same reason. > > That works super for sorted data. > > Ofcourse there still exists some orders that degenerate > into O(n^2), but they are not very likely (unlike the already > sorted order). All correct. But for real-world data, even tweakedd quicksorts are routinely beaten by adaptive sorts like Timsort. Of course, what's relevant right here is that quicksort can be parallelised - i'm not aware that Timsort can be, and i have no idea about any other adaptive sorts. tom -- eviscerated by obfuscation
From: Mike Amling on 16 May 2010 10:01 markspace wrote: > Arne Vajh�j wrote: >> } else { >> ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1, >> tdepth); >> ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1, >> tdepth); > > > Just a thought: Why spawn two threads here? You already have one > thread running (the current one). Spawn one thread (proabaly on the > larger span between pivot and start/end) and sort the other bit > recursively in the current thread. Then just wait on the other thread > after you're done with the recursive sort. > > > However, I played around with this myself a while ago, and I'm not sure > the whole idea is a great one. Quicksorts are god-awful at sorting many > real world types of data. If a list is already sorted, then quicksort > degenerates into a bubble sort. That's N^2 performance. And pre-sorted > data is pretty common so you're likely to hit this issue often. While the original QuickSort used the first element as the pivot, an improved variant swaps the first, last and middle elements until they're in order, then uses the resulting middle element as the pivot. Those swaps are over long distances, and hence are consistent with the spirit of QuickSort. This improvement was published shortly after the original QuickSort, back in the '60s. The improved algorithm is robust over lots of original orders of the data, probably over all data that hasn't been deliberately arranged to foil QS. --Mike Amling
From: Mike Amling on 16 May 2010 10:15 Arne Vajh�j wrote: > On 15-05-2010 11:35, Jon Harrop wrote: >> I cannot find an implementation of parallel quicksort in Java. Does >> anyone have one or know where I can get one? > > public static void tqsint_help(int n1, int n2, int[] ia, int depth, > int tdepth) { .... > > if (depth >= tdepth) { > if (n1 < r) > tqsint_help(n1, r, ia, depth + 1, tdepth); > if (l < n2) > tqsint_help(l, n2, ia, depth + 1, tdepth); > } else { > ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1, > tdepth); > ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1, > tdepth); > ... Rather than make the caller guess a good value for tdepth, I think you'd be better off with if (r-n1>WORTH_THREADING) { ...new TheadSortHelp(... } else if (n1<r) { tqsint_help(... } and similar for l, n2. The partition sizes can vary, and you could find in some partitions that n1>=r before depth>=tdepth. --Mike Amling
From: Eric Sosman on 16 May 2010 10:44 On 5/16/2010 10:01 AM, Mike Amling wrote: > > While the original QuickSort used the first element as the pivot, an > improved variant swaps the first, last and middle elements until they're > in order, then uses the resulting middle element as the pivot. Those > swaps are over long distances, and hence are consistent with the spirit > of QuickSort. This improvement was published shortly after the original > QuickSort, back in the '60s. The improved algorithm is robust over lots > of original orders of the data, probably over all data that hasn't been > deliberately arranged to foil QS. Look up "Engineering a Sort Function" by Bentley and McIlroy for a real-world data set that drove a widely-used real-world qsort() implementation completely over the edge. While studying the issue, Bentley and McIlroy learned "In fact, among a dozen different Unix libraries we found no qsort that could not easily be driven to quadratic behavior [...]" After a good deal of work, Bentley and McIlroy settled on a threefold scheme, with the breakpoints determined emprically: - For "small" arrays or sub-arrays, use the middle element - For "medium" arrays, use the median of the first, middle, and last elements - For "large" arrays, choose nine evenly-spread elements, take them in groups of three and find the three medians, and use the median of those medians They also lay a good deal of stress on the practical importance of using the "Dutch National Flag" partitioning scheme. Equal keys, they found, turn out to be quite common in real-world data sets. -- Eric Sosman esosman(a)ieee-dot-org.invalid
From: blmblm on 29 May 2010 15:11 In article <85ad3bFkk5U1(a)mid.individual.net>, Mike Amling <mamling(a)rmcis.com> wrote: > Arne Vajh�j wrote: > > On 15-05-2010 11:35, Jon Harrop wrote: > >> I cannot find an implementation of parallel quicksort in Java. Does > >> anyone have one or know where I can get one? > > > > public static void tqsint_help(int n1, int n2, int[] ia, int depth, > > int tdepth) { > ... > > > > if (depth >= tdepth) { > > if (n1 < r) > > tqsint_help(n1, r, ia, depth + 1, tdepth); > > if (l < n2) > > tqsint_help(l, n2, ia, depth + 1, tdepth); > > } else { > > ThreadSortHelp h1 = new ThreadSortHelp(n1, r, ia, depth + 1, > > tdepth); > > ThreadSortHelp h2 = new ThreadSortHelp(l, n2, ia, depth + 1, > > tdepth); > > ... > > Rather than make the caller guess a good value for tdepth, I think > you'd be better off with > if (r-n1>WORTH_THREADING) { > ...new TheadSortHelp(... > } else if (n1<r) { > tqsint_help(... > } > and similar for l, n2. > The partition sizes can vary, and you could find in some partitions > that n1>=r before depth>=tdepth. (Okay, oldish thread, but -- yeah well. I've been playing lately with various parallel sorts, and am at a good break point to comment here, maybe.) A hearty "indeed" on Mike's closing remark. A somewhat related point is that limiting the number of threads to the number of available processors/cores, as is somewhat standard when adding multithreading for performance, may lead to very poor load balance (and hence poor performance), since there's no guarantee even for random data that quicksort partitioning produces two pieces of roughly equal size. This can lead to startling results, such as situations in which going from one thread to two yields no benefit, but going from two threads to four does. It does help a bit to take Mike's advice about only creating new threads if the amount of work is "enough". I've had more success, though, with setting up a thread pool (using the classes in java.util.concurrent) and creating new tasks to be assigned to the thread pool rather than starting new threads -- and again, only doing so if the amount of work is past some threshold. I'm still tinkering a bit with how to specify the threshold, but preliminary results are that with this approach you get somewhat more predictable behavior, in that increasing the number of threads improves better performance in a fairly consistent way. Just sayin', maybe. (And I'd post my code, but it's part of a larger experiment and hence maybe more complicated than it needs to be, and I'm too lazy to extract a postable version.) -- B. L. Massingill ObDisclaimer: I don't speak for my employers; they return the favor.
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 Prev: adding a new library .jar to your path (netbeans) Next: Turning off JIT Optimisation |