From: mohangupta13 on 28 Apr 2010 06:29 Hello all , I am afraid that most of the times books on algorithms just explain their operations and not their practical applicability much. Now as i read we have largely 3 linear time sorting algorithms Name Input number range Space Time 1) counting sort 0- k 0(k) O(k+n) 2)Radix Sort 0- (n^d -1) O(n) O(dn) 3)Bucket sort 0-k (uniform distribution) O(n) O(n) Now i wanted to know in practice which algorithm is used where ( any example cases ?? ), as all of them largely look the same in terms of space , time and input range trade off. And adding to the confusion is that the world says "quicksort" in practice is much faster because of small constant factors , cache performance etc. Please Guide, Thanks Mohan
From: mohangupta13 on 28 Apr 2010 06:38 On Apr 28, 3:29 pm, mohangupta13 <mohangupt...(a)gmail.com> wrote: > Hello all , > > I am afraid that most of the times books on algorithms just explain > their operations and not their practical applicability much. > Now as i read we have largely 3 linear time sorting algorithms > > Name Input number > range Space Time > 1) counting sort 0- > k > 0(k) O(k+n) > 2)Radix Sort 0- (n^d > -1) > O(n) O(dn) > 3)Bucket sort 0-k (uniform > distribution) O(n) O(n) > > Now i wanted to know in practice which algorithm is used where ( any > example cases ?? ), as all of them largely look the same in terms of > space , time and input range trade off. > > And adding to the confusion is that the world says "quicksort" in > practice is much faster because of small constant factors , cache > performance etc. > > Please Guide, > Thanks > Mohan Sorry the format of the table got altered when i posted the message .. Mohan
From: robertwessel2 on 28 Apr 2010 07:17 On Apr 28, 5:29 am, mohangupta13 <mohangupt...(a)gmail.com> wrote: > Hello all , > > I am afraid that most of the times books on algorithms just explain > their operations and not their practical applicability much. > Now as i read we have largely 3 linear time sorting algorithms > > Name Input number > range Space Time > 1) counting sort 0- > k > 0(k) O(k+n) > 2)Radix Sort 0- (n^d > -1) > O(n) O(dn) > 3)Bucket sort 0-k (uniform > distribution) O(n) O(n) > > Now i wanted to know in practice which algorithm is used where ( any > example cases ?? ), as all of them largely look the same in terms of > space , time and input range trade off. > > And adding to the confusion is that the world says "quicksort" in > practice is much faster because of small constant factors , cache > performance etc. The three sorts you listed work well only with significant constraints on the input keys. For example, a two or three stage bucket or radix sort might be a good way to sort by zip code (the US postal code - a five digit number). But none of those will work well if the key is large. For example, if you wanted to sort by the name of a person. Without knowing that the range of values is severely restricted, you cannot use a counting sort to sort a 32 bit key on a 32 bit machine, since you will not have enough address space to store the counting array. The problem with almost all non-comparison based sorts is that they do not work well as a general purpose algorithm. Consider doing a 256- way bucket or radix sort on a 25 character name field (and algorithmically those will be very similar). IOW, do a 256 way distribution 25 times. You'll end up doing N*25 units of work. An O(n*log(n)) comparison sort, like Quicksort or Merge sort, will do less units of work unless there are at least 32 million input records. If the sort key is 50 characters long, Quicksort will win up to about 1E+15 records. In addition, a non-binary radix sort, as well as bucket and counting sorts usually require substantial amounts of extra storage. And if you can afford that amount of extra storage, a simple merge sort will give good performance in both the average and worst cases, no matter the size, content and distribution of the keys. And Quicksort and Heapsort both require only minimal amounts of extra storage. Which is not to say that for a given sorting task one of the above might not be ideal, it's just that they're not good choices *unless* the input datasets match the specific properties of the algorithm.
From: Dann Corbit on 28 Apr 2010 15:33 In article <d497933c-f579-4a52-8df2-854e27a94e74 @g11g2000yqe.googlegroups.com>, robertwessel2(a)yahoo.com says... > > On Apr 28, 5:29�am, mohangupta13 <mohangupt...(a)gmail.com> wrote: > > Hello all , > > > > I am afraid that most of the times books on algorithms just explain > > their operations and not their practical applicability much. > > Now as i read we have largely 3 linear time sorting algorithms > > > > Name � � � � � � � � � � � � � � � � � Input number > > range � � � � � � � � � � � � � � �Space � � � � � � � � � � � �Time > > 1) counting sort � � � � � � � � � � � 0- > > k > > 0(k) � � � � � � � � � � � � � O(k+n) > > 2)Radix Sort � � � � � � � � � � � � � �0- (n^d > > -1) > > O(n) � � � � � � � � � � � � �O(dn) > > 3)Bucket sort � � � � � � � � � � � � �0-k (uniform > > distribution) � � � � � � � � � � � O(n) � � � � � � � � � � � � �O(n) > > > > Now i wanted to know in practice which algorithm is used where ( any > > example cases ?? ), as all of them largely look the same in terms of > > space , time and input range trade off. > > > > And adding to the confusion is that the world says "quicksort" in > > practice is much faster because of small constant factors , cache > > performance etc. To the O.P.: For sorting literature, this is a very, very good place to perform searches: http://citeseerx.ist.psu.edu/ See, for example: http://citeseerx.ist.psu.edu/search?q=title%3Asorting+AND+keyword% 3Acache&sort=cite > The three sorts you listed work well only with significant constraints > on the input keys. For example, a two or three stage bucket or radix > sort might be a good way to sort by zip code (the US postal code - a > five digit number). But none of those will work well if the key is > large. For example, if you wanted to sort by the name of a person. > Without knowing that the range of values is severely restricted, you > cannot use a counting sort to sort a 32 bit key on a 32 bit machine, > since you will not have enough address space to store the counting > array. > > The problem with almost all non-comparison based sorts is that they do > not work well as a general purpose algorithm. Consider doing a 256- > way bucket or radix sort on a 25 character name field (and > algorithmically those will be very similar). IOW, do a 256 way > distribution 25 times. You'll end up doing N*25 units of work. An > O(n*log(n)) comparison sort, like Quicksort or Merge sort, will do > less units of work unless there are at least 32 million input > records. If the sort key is 50 characters long, Quicksort will win up > to about 1E+15 records. A solution to the general purpose Radix sort algorithm is to pass a "bucket" function instead of a comparison function for comparison based sorts. The bucket function returns the radix bucket[i] requested from the data. So, for 8-bit unsigned char, it would return key[i] and for unsigned long it would be key >> (i*8) & (255) {assuming radix chosen of 256}. Often, even with long data (e.g. the 50 char example) the data field is not often fully populated. So something like MSD radix sort can do quite well. If the keys are very long and also extremely similar, the comparison based sort will also have to examine a great volume of key data to differentiate. It is also possible to use Radix sort ideas as a top level sort (or a sort on the first field of a multi-key index) and then pass the sort off to another O(N*log*N)) algorithm for smaller subsets. [snip]
From: robin on 30 Apr 2010 20:48
"mohangupta13" <mohangupta13(a)gmail.com> wrote in message news:76f0b4d1-8880-48ee-84a4-5c946f6c423c(a)z13g2000prh.googlegroups.com... | Hello all , | | I am afraid that most of the times books on algorithms just explain | their operations and not their practical applicability much. | Now as i read we have largely 3 linear time sorting algorithms | | Name Input number | range Space Time | 1) counting sort 0- | k | 0(k) O(k+n) | 2)Radix Sort 0- (n^d | -1) | O(n) O(dn) | 3)Bucket sort 0-k (uniform | distribution) O(n) O(n) | | Now i wanted to know in practice which algorithm is used where ( any | example cases ?? ), as all of them largely look the same in terms of | space , time and input range trade off. | | And adding to the confusion is that the world says "quicksort" in | practice is much faster because of small constant factors , cache | performance etc. | | Please Guide, | Thanks | Mohan |