quick sort and O(nlgn) worst case bound [General Programming]

Prev: hi its a very useful blog visit this blog
Next: How to find which system call the process in?

From: pete on 21 Dec 2009 07:55

Stephen Howe wrote:
> On Sat, 28 Nov 2009 07:23:35 -0800 (PST), mohangupta13 <mohangupta13(a)gmail.com> wrote:
>
>
>>hello all ,
>>Using median of median or just median to choose the pivot in quick
>>sort it can be deterministically proved that the worst case running
>>time of quicksort is O(nlgn) and not O(n^2) .
>>
>>So why is that quick sort is still blamed to be having a worst case of
>>O(n^2).??
>>Is there some problem with the above mentioned methods of choosing the
>>pivot ?
>
>
> Yes, they are O(N). All those O(N) calls add up.
> Most pivots for Quicksort are chosen in O(1)
>
> But this is academic. If you use Musser's hybrid Introsort, it wil lbe Quicksort with O(1) pivots and Heapsort where it blows
> up. That guarantees O(N * lg N) performance in worst case and close to Quicksort in the average case.

Outside of introsort, I'm unfamiliar with any quicksort
which is truly O(N * log(N)) for worst case.

A qsort type quicksort function called mqsort,
has a worst case (reverse ordered array) that takes
1.450 times as many comparison calls as its average case
when N equals 1024 and
1.468 times as many comparison calls as its average case
when N equals 32768.
That seems pretty close to O(N * log(N)) worst case, but not exactly there.

A very interesting sort is leapfrogging sample sort,
which has the amazing property of making the same number of compar
calls on a randomized array as mergesort does.
However it is O(N * log(N) * log(N)) worst case.
http://www.springerlink.com/content/p70564506802n575/
mergesort does theoretical lower information bound comparisons.
sfsort, which is a qsort type leapfrogging sample sort function
taken from the code here:
http://www.adnu.edu.ph/ccs/NCITE08/KeynoteLecture_Albacea_NewSortingLowerBound.pdf

Here in sfsort_test.c are:
mqsort,
sfsort,
and also included for contrast are:
m_sort, (a mergesort )
q_sort, (a very simple quicksort)

/* BEGIN sfsort_test.c */

#include <stdio.h>
#include <stdlib.h>
/*
** This program compares various qsort type
** array sorting functions, by counting compar calls.
*/
#define ARRAY_ELEM (32768 / 32)
#define TRIALS 10
/*
** Make sure that SEQUENCE and SEQUENCE_STR match.
*/
#define SEQUENCE RANDOM
#define SEQUENCE_STR "RANDOM"
/*
** #define SEQUENCE RANDOM
** #define SEQUENCE FORWARD
** #define SEQUENCE REVERSE
*/
/*
** Decrease KEY_RANGE, if more equal keys are wanted.
*/
#define KEY_RANGE 0XFFFFFFFFLU /* 0X1LU, 0XFFFFFFFFLU */
/* sfsort */ /*
** Leap frog sample quicksort
*/
/* m_sort */ /*
** stable mergesort,
** Number of comparisons:
** best case: N * log2(N) / 2
** worst case: N * log2(N) - N + 1
*/
/* q_sort */ /*
** Simple Lomuto qsort
*/
/* qsort */ /*
** Standard library qsort
** defined by your C implementation
*/
#define FUNCTIONS_AND_STRINGS { \
{sfsort, "sfsort"}, \
{mqsort, "mqsort"}, \
{m_sort, "m_sort"}, \
{q_sort, "q_sort"}, \
{qsort, " qsort"} \
}
#define BYTE_SWAP(A, B, S) \
{ \
p1 = (A); \
p2 = (B); \
end = p2 + (S); \
do { \
const unsigned char swap = *p1; \
\
*p1++ = *p2; \
*p2++ = swap; \
} while (p2 != end); \
}
#define NUM_LEN 12
#define SEED_RAND 123456789
#define RANDOM (lus_seed = LU_RAND(lus_seed) % KEY_RANGE)
#define FORWARD element
#define REVERSE (0LU - 1 - element)
#define LU_RAND(S) ((S) * 69069LU + 362437 & 0XFFFFFFFFLU)
#define NMEMB(A) (sizeof A / sizeof *A)
#define str(s) # s
#define xstr(s) str(s)

void
sfsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
m_sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
q_sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
mqsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
free_arrays(long unsigned **array, size_t nmemb);
int
comparison(const void *arg1, const void *arg2);
int
comparray(long unsigned *s1, long unsigned *s2, size_t nmemb,
int (*compar)(const void *, const void *));

static void
sfrog(size_t s1, size_t ss,
void *base, size_t u,
size_t size, int (*compar)(const void *, const void *));
static void
merg(unsigned char *buff, unsigned char *base,
size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
static void
m2sort(size_t sorted, void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
static void
block_swap(void *first, void *middle,
void *last, size_t size);
static void
reverse(unsigned char *first, unsigned char *last, size_t size);

long unsigned Comps;

int main(void)
{
size_t sort, element, n;
struct {
void (*func)(void *, size_t, size_t,
int (*)(const void *, const void *));
char *name;
} F[] = FUNCTIONS_AND_STRINGS;
long unsigned *array[NMEMB(F)];
long unsigned trial,
totalC[NMEMB(F)], CompCounter[NMEMB(F)];
long unsigned lus_seed = SEED_RAND;

for (n = 0; n != NMEMB(array); ++n) {
array[n] = malloc(ARRAY_ELEM * sizeof *array[n]);
if (array[n] == NULL) {
free_arrays(array, n);
fputs("\nmalloc() array failure\n", stderr);
exit(EXIT_FAILURE);
}
totalC[n] = CompCounter[n] = 0;
}
puts("/* BEGIN sfsort_test.c output */\n");
puts("Counting compar calls "
"for different qsort type functions,");
printf("sorting arrays of %lu keys.\n\n"
, (long unsigned)ARRAY_ELEM);
puts("SEQUENCE is " SEQUENCE_STR "\n");
printf(" ");
for (n = 0; n != NMEMB(array); ++n) {
printf(" %" xstr(NUM_LEN) "s", F[n].name);
}
printf("\n Trial:\n");
for (trial = 0; trial != TRIALS; ++trial) {
element = ARRAY_ELEM;
while (element-- != 0) {
array[0][element] = SEQUENCE;
for (sort = 1; NMEMB(F) > sort; ++sort) {
array[sort][element] = array[0][element];
}
}
for (n = 0; n != NMEMB(array); ++n) {
Comps = 0;
F[n].func(array[n], ARRAY_ELEM
, sizeof *array[n], comparison);
CompCounter[n] = Comps;
totalC[n] += CompCounter[n];
}
printf("%6lu:", trial + 1);
for (n = 0; n != NMEMB(array); ++n) {
printf(" %" xstr(NUM_LEN) "lu", CompCounter[n]);
}
putchar('\n');
for (sort = 1; NMEMB(F) > sort; ++sort) {
if (comparray(array[0], array[sort]
, ARRAY_ELEM, comparison))
{
fputs("\nSort discrepancy!!!\n", stderr);
free_arrays(array, n);
exit(EXIT_SUCCESS);
}
}
}
free_arrays(array, NMEMB(array));
printf("\n%lu Trial%s, %lu keys, "
"%lu possible different key value%s.\n\n"
, trial, trial == 1 ? "" : "s"
, (long unsigned)ARRAY_ELEM, KEY_RANGE
, KEY_RANGE == 1 ? "" : "s");
for (n = 0; n != NMEMB(array); ++n) {
printf("total %s comparisons: %" xstr(NUM_LEN) "lu\n"
, F[n].name, totalC[n]);
}
puts("\n/* END sfsort_test.c output */");
return 0;
}

void
sfsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
size_t s;

if (nmemb > 1) {
for (s = 1; nmemb - s > s; s += s + 1) {
sfrog(0, s - 1, base, s + s, size, compar);
}
sfrog(0, s - 1, base, nmemb - 1, size, compar);
}
}

static void
sfrog(size_t s1, size_t ss,
void *base, size_t u, size_t size,
int (*compar)(const void *, const void *))
{
unsigned char *p1, *p2, *end;
unsigned char *const A = base;

if ((size_t)(s1 - ss) != 1) {
if (u > ss) {
const size_t sm = (ss + s1) / 2;
const size_t sms = sm * size;
size_t j = (u + 1) * size;
size_t i = ss * size;

do {
i += size;
} while (j > i && compar(A + sms, A + i) > 0);
do {
j -= size;
} while (j > i && compar(A + j, A + sms) > 0);
while (j > i) {
BYTE_SWAP(A + j, A + i, size);
do {
i += size;
} while (j > i && compar(A + sms, A + i) > 0);
do {
j -= size;
} while (j > i && compar(A + j, A + sms) > 0);
}
if (j == i) {
j -= size;
}
i = ss * size;
if (j > i) {
size_t k = j;

while (i + size > sms) {
BYTE_SWAP(A + i, A + k, size);
k -= size;
i -= size;
}
j /= size;
} else {
j = ss;
}
sfrog(s1, sm - 1, A, sm + j - ss - 1, size, compar);
sfrog(sm + j - ss + 1, j, A, u, size, compar);
}
} else {
sfsort(A + s1 * size, u - ss, size, compar);
}
}

void
m_sort(void *base,
size_t nmemb,
size_t size,
int (*compar)(const void *, const void *))
{
if (nmemb > 1) {
void *const buff = malloc(nmemb / 2 * size);

if (buff != NULL) {
merg(buff, base, nmemb, size, compar);
free(buff);
} else {
puts("\nmalloc failure in m_sort function.\n");
}
}
}

static void
merg(unsigned char *buff,
unsigned char *base,
size_t nmemb,
size_t size,
int (*compar)(const void *, const void *))
{
if (nmemb > 1) {
const void *const after = nmemb * size + base;
const size_t half_nmemb = nmemb / 2;
const size_t half_bytes = half_nmemb * size;
unsigned char *midd = base + half_bytes;

merg(buff, midd, nmemb - half_nmemb, size, compar);
merg(buff, base, half_nmemb, size, compar);
do {
if (compar(base, midd) > 0) {
const unsigned char *end = midd;

buff += half_bytes;
do {
*--buff = *--end;
} while (end != base);
end += size;
do {
*base++ = *midd++;
} while (base != end);
do {
if (midd != after) {
end += size;
if (compar(buff, midd) > 0) {
do {
*base++ = *midd++;
} while (base != end);
} else {
do {
*base++ = *buff++;
} while (base != end);
}
} else {
do {
*base++ = *buff++;
} while (base != midd);
}
} while (base != midd);
} else {
base += size;
}
} while (base != midd);
}
}

void
q_sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
unsigned char *p1, *p2, *end;
unsigned char *left;
size_t nmemb_right, middle, last, right;

left = base;
while (nmemb-- > 1) {
right = nmemb * size;
last = middle = 0;
do {
middle += size;
if (compar(left, left + middle) > 0) {
last += size;
BYTE_SWAP(left + middle, left + last, size);
}
} while (middle != right);
BYTE_SWAP(left, left + last, size);
nmemb = last / size;
nmemb_right = (right - last) / size;
if (nmemb_right > nmemb) {
q_sort(left, nmemb, size, compar);
left += last + size;
nmemb = nmemb_right;
} else {
q_sort(left + last + size, nmemb_right, size, compar);
}
}
}

void
mqsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
m2sort(1, base, nmemb, size, compar);
}

static void
m2sort(size_t sorted, void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
if (sorted == 0) {
sorted = 1;
}
if (nmemb > sorted) {
size_t pivot;
size_t sorted_bytes;
size_t midd;
unsigned char *const left = base;
const size_t right = (nmemb - 1) * size;
size_t last = right;

while (nmemb - sorted > sorted) {
m2sort(sorted, base, sorted * 2, size, compar);
sorted *= 2;
}
pivot = sorted / 2 * size;
sorted_bytes = sorted * size;
midd = sorted_bytes;
while (last > midd && compar(left + pivot, left + midd) > 0) {
midd += size;
}
while (last >= midd && compar(left + last, left + pivot) > 0) {
last -= size;
}
while (last > midd) {
unsigned char *p1, *p2, *end;

BYTE_SWAP(left + midd, left + last, size);
do {
midd += size;
} while (midd != last && compar(left + pivot, left + midd)
> 0);
do {
last -= size;
} while (last >= midd && compar(left + last, left + pivot)
> 0);
}
if (last >= sorted_bytes) {
size_t nmemb_right;

block_swap(left + pivot, left + sorted_bytes, left + last,
size);
pivot += last - sorted_bytes + size;
nmemb_right = (right - pivot) / size;
nmemb = pivot / size;
m2sort(sorted / 2,
left, nmemb, size, compar);
m2sort((sorted - 1) / 2,
left + pivot + size, nmemb_right, size, compar);
} else {
m2sort((sorted - 1) / 2,
left + pivot + size, nmemb - sorted / 2 - 1, size, compar);
}
}
}

static void
block_swap(void *first, void *middle,
void *last, size_t size)
{
unsigned char *const m = middle;

reverse(first, m - size, size);
reverse(middle, last, size);
reverse(first, last, size);
}

static void
reverse(unsigned char *first, unsigned char *last, size_t size)
{
unsigned char *p1, *p2, *end;

while (last > first) {
BYTE_SWAP(first, last, size);
first += size;
last -= size;
}
}

int
comparison(const void *arg1, const void *arg2)
{
++Comps;
return *(const long unsigned *)arg2
> *(const long unsigned *)arg1 ? -1
: *(const long unsigned *)arg2
!= *(const long unsigned *)arg1;
}

void
free_arrays(long unsigned **array, size_t nmemb)
{
while (nmemb-- != 0) {
free(array[nmemb]);
}
}

int
comparray(long unsigned *s1, long unsigned *s2, size_t nmemb,
int (*compar)(const void *, const void *))
{
int rc = 0;

while (nmemb--) {
rc = compar(s1, s2);
if (rc != 0) {
break;
}
++s1;
++s2;
}
return rc;
}

/* END sfsort_test.c */

--
pete

From: Willem on 21 Dec 2009 13:29

pete wrote:
) Stephen Howe wrote:
)> Yes, they are O(N). All those O(N) calls add up.
)> Most pivots for Quicksort are chosen in O(1)
)>
)> But this is academic. If you use Musser's hybrid Introsort, it wil lbe Quicksort with O(1) pivots and Heapsort where it blows
)> up. That guarantees O(N * lg N) performance in worst case and close to Quicksort in the average case.
)
) Outside of introsort, I'm unfamiliar with any quicksort
) which is truly O(N * log(N)) for worst case.

There is an O(N) algorithm to find the median of an array.
Using that, you can get O(N * log(N)) quicksort.
But that is purely academical, because with that algorithm, the
constant factor becomes much much worse than other O(NlgN) sorts.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

From: John Slimick on 21 Dec 2009 17:20

On 2009-12-21, Willem <willem(a)stack.nl> wrote:
> pete wrote:
>
> There is an O(N) algorithm to find the median of an array.
> Using that, you can get O(N * log(N)) quicksort.
> But that is purely academical, because with that algorithm, the
> constant factor becomes much much worse than other O(NlgN) sorts.
>
>
> SaSW, Willem

My recollection is that older implementations of QS
were O(n*lg n) but that worst cases (sorting a constant
array) were more like O(n ^ 2). This is why for years
people depended on binary merge since the time was
almost always the same -- also O(n*lg n) -- for
the same size arrays.

john slimick
university of pittsburgh at bradford
slimick(a)pitt.edu

From: Stephen Howe on 3 Jan 2010 13:55

On Mon, 21 Dec 2009 07:55:22 -0500, pete <pfiland(a)mindspring.com> wrote:

>> But this is academic. If you use Musser's hybrid Introsort, it wil lbe Quicksort with O(1) pivots and Heapsort where it blows
>> up. That guarantees O(N * lg N) performance in worst case and close to Quicksort in the average case.
>
>Outside of introsort, I'm unfamiliar with any quicksort
>which is truly O(N * log(N)) for worst case.

Then feed your sort functions the adversary from
http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

I think you will find they blow up (anything that is using a O(1) method to get a pivot).

Stephen Howe

From: Richard Harter on 3 Jan 2010 14:10

On Sun, 03 Jan 2010 18:55:14 +0000, Stephen Howe
<sjhoweATdialDOTpipexDOTcom> wrote:

>On Mon, 21 Dec 2009 07:55:22 -0500, pete <pfiland(a)mindspring.com> wrote:
>
>>> But this is academic. If you use Musser's hybrid Introsort, it wil lbe Quicksort with O(1) pivots and Heapsort where it blows
>>> up. That guarantees O(N * lg N) performance in worst case and close to Quicksort in the average case.
>>
>>Outside of introsort, I'm unfamiliar with any quicksort
>>which is truly O(N * log(N)) for worst case.
>
>Then feed your sort functions the adversary from
>http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
>
>I think you will find they blow up (anything that is using a O(1) method to get a pivot).

I hope that if you reread the material you commented on you will
see the true value of your comment.

Richard Harter, cri(a)tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10
Prev: hi its a very useful blog visit this blog
Next: How to find which system call the process in?