quick sort and O(nlgn) worst case bound [General Programming]

Prev: hi its a very useful blog visit this blog
Next: How to find which system call the process in?

From: pete on 6 Jan 2010 21:09

Moi wrote:
> On Thu, 07 Jan 2010 01:17:38 +0100, Alf P. Steinbach wrote:
>
>
>>* Jon Harrop:
>>
>>>Ben Bacarisse wrote:
>
>
>>>The paper does not apply to all quicksorts.
>>>
>>>
>>
>>There was a paper once by I think it was Doug McIlroy, showing how to
>>dynamically foil any quicksort, forcing it to qudratic time. :-)
>>
>
>
> Yes. A link was posted in this thread a few days ago.
> (Steven Howe, Sunday 18:55 UTC)

Yes.
It also happens to be the paper under discussion.
And as Jon Harrop has stated,
the paper does not apply to all quicksorts.

--
pete

From: pete on 12 Jan 2010 08:02

Pascal J. Bourguignon wrote:
> Daniel Pitts <newsgroup.spamfilter(a)virtualinfinity.net> writes:
>
>
>>If the pivot is chosen in O(1) time, then a data-set could cause a
>>pathological condition and quicksort's performance would degrade to
>>near n^2 time complexity, instead of n log n.
>
>
> The pivot could be choose randomly (in O(1)). However, the input set
> could still be (2 2 2 2 2 2 ... 2), so that a blind quick sort could
> still be O(n�).

That would depend on the looping style
with which quicksort was implemented.

It would be O(N*N) for Lomuto style loops, as in q_sort,
but less than N*log2(N) for Sedgewick style loops,
as in q1sort.

/* BEGIN q1sort_test.c output */

Counting compar calls for different qsort type functions,
sorting arrays of 16384 keys.

SEQUENCE is RANDOM

q_sort q1sort
Trial:
1: 134209536 212993

1 Trial, 16384 keys, 1 possible key value.

total q_sort comparisons: 134209536
total q1sort comparisons: 212993

/* END q1sort_test.c output */

/* BEGIN q1sort_test.c */

#include <stdio.h>
#include <stdlib.h>
/*
** This program compares various qsort type
** array sorting functions, by counting compar calls.
*/
#define ARRAY_ELEM (32768 / 2)
#define TRIALS 1
/*
** Make sure that SEQUENCE and SEQUENCE_STR match.
*/
#define SEQUENCE RANDOM
#define SEQUENCE_STR "RANDOM"
/*
** #define SEQUENCE RANDOM
** #define SEQUENCE FORWARD
** #define SEQUENCE REVERSE
*/
/*
** Decrease KEY_RANGE, if more equal keys are wanted.
*/
#define KEY_RANGE 0X1LU /* 0X1LU, 0XFFFFFFFFLU */
/* q_sort */ /*
** Simple Lomuto qsort
*/
/* q1sort */ /*
** Sedgewick style qsort
*/
#define FUNCTIONS_AND_STRINGS { \
{q_sort, "q_sort"}, \
{q1sort, "q1sort"} \
}
#define BYTE_SWAP(A, B, S) \
{ \
p1 = (A); \
p2 = (B); \
end = p2 + (S); \
do { \
const unsigned char swap = *p1; \
\
*p1++ = *p2; \
*p2++ = swap; \
} while (p2 != end); \
}
#define NUM_LEN 12
#define SEED_RAND 123456789
#define RANDOM (lus_seed = LU_RAND(lus_seed) % KEY_RANGE)
#define FORWARD element
#define REVERSE (0LU - 1 - element)
#define LU_RAND(S) ((S) * 69069LU + 362437 & 0XFFFFFFFFLU)
#define NMEMB(A) (sizeof A / sizeof *A)
#define str(s) # s
#define xstr(s) str(s)

void
q_sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
q1sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
void
free_arrays(long unsigned **array, size_t nmemb);
int
comparison(const void *arg1, const void *arg2);
int
comparray(long unsigned *s1, long unsigned *s2, size_t nmemb,
int (*compar)(const void *, const void *));

long unsigned Comps;

int main(void)
{
size_t sort, element, n;
long unsigned lus_seed = SEED_RAND;
const long unsigned arrar_elem = ARRAY_ELEM;
struct {
void (*func)(void *, size_t, size_t,
int (*)(const void *, const void *));
char *name;
} F[] = FUNCTIONS_AND_STRINGS;
long unsigned *array[NMEMB(F)];
long unsigned trial,
totalC[NMEMB(F)], CompCounter[NMEMB(F)];

for (n = 0; n != NMEMB(array); ++n) {
array[n] = malloc(arrar_elem * sizeof *array[n]);
if (array[n] == NULL) {
free_arrays(array, n);
fputs("\nmalloc() array failure\n", stderr);
exit(EXIT_FAILURE);
}
totalC[n] = CompCounter[n] = 0;
}
puts("/* BEGIN q1sort_test.c output */\n");
puts("Counting compar calls "
"for different qsort type functions,");
printf("sorting arrays of %lu keys.\n\n", arrar_elem);
puts("SEQUENCE is " SEQUENCE_STR "\n");
printf(" ");
for (n = 0; n != NMEMB(array); ++n) {
printf(" %" xstr(NUM_LEN) "s", F[n].name);
}
printf("\n Trial:\n");
for (trial = 0; trial != TRIALS; ++trial) {
element = arrar_elem;
while (element-- != 0) {
array[0][element] = SEQUENCE;
for (sort = 1; NMEMB(F) > sort; ++sort) {
array[sort][element] = array[0][element];
}
}
for (n = 0; n != NMEMB(array); ++n) {
Comps = 0;
F[n].func(array[n], arrar_elem
, sizeof *array[n], comparison);
CompCounter[n] = Comps;
totalC[n] += CompCounter[n];
}
printf("%6lu:", trial + 1);
for (n = 0; n != NMEMB(array); ++n) {
printf(" %" xstr(NUM_LEN) "lu", CompCounter[n]);
}
putchar('\n');
for (sort = 1; NMEMB(F) > sort; ++sort) {
if (comparray(array[0], array[sort]
, arrar_elem, comparison))
{
fputs("\nSort discrepancy!!!\n", stderr);
free_arrays(array, n);
exit(EXIT_SUCCESS);
}
}
}
free_arrays(array, NMEMB(array));
printf("\n%lu Trial%s, %lu keys, "
"%lu possible key value%s.\n\n"
, trial, trial == 1 ? "" : "s"
, arrar_elem, KEY_RANGE
, KEY_RANGE == 1 ? "" : "s");
for (n = 0; n != NMEMB(array); ++n) {
printf("total %s comparisons: %" xstr(NUM_LEN) "lu\n"
, F[n].name, totalC[n]);
}
puts("\n/* END q1sort_test.c output */");
return 0;
}

void
q_sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *))
{
unsigned char *p1, *p2, *end;
unsigned char *left;
size_t nmemb_right, middle, last, right;

left = base;
while (nmemb-- > 1) {
right = nmemb * size;
last = middle = 0;
do {
middle += size;
if (compar(left, left + middle) > 0) {
last += size;
BYTE_SWAP(left + middle, left + last, size);
}
} while (middle != right);
BYTE_SWAP(left, left + last, size);
nmemb = last / size;
nmemb_right = (right - last) / size;
if (nmemb_right > nmemb) {
q_sort(left, nmemb, size, compar);
left += last + size;
nmemb = nmemb_right;
} else {
q_sort(left + last + size, nmemb_right, size, compar);
}
}
}

void q1sort(void *base, size_t nmemb, size_t size,
int (*compar)(const void*, const void*))
{
unsigned char *left;
size_t nmemb_right, middle, last, right;
unsigned char *p1, *p2, *end;

left = base;
while (nmemb-- > 1) {
last = right = nmemb * size;
middle = size;
while (middle != last && compar(left, left + middle) > 0) {
middle += size;
}
while (compar(left + last, left) > 0) {
last -= size;
}
while (last > middle) {
BYTE_SWAP(left + middle, left + last, size);
do {
middle += size;
} while (compar(left, left + middle) > 0);
do {
last -= size;
} while (compar(left + last, left) > 0);
}
BYTE_SWAP(left, left + last, size);
nmemb = last / size;
nmemb_right = (right - last) / size;
if (nmemb_right > nmemb) {
q1sort(left, nmemb, size, compar);
left += last + size;
nmemb = nmemb_right;
} else {
q1sort(left + last + size, nmemb_right, size, compar);
}
}
}

int
comparison(const void *arg1, const void *arg2)
{
const long unsigned a = *(const long unsigned *)arg1;
const long unsigned b = *(const long unsigned *)arg2;

++Comps;
return b > a ? -1 : b != a;
}

void
free_arrays(long unsigned **array, size_t nmemb)
{
while (nmemb-- != 0) {
free(array[nmemb]);
}
}

int
comparray(long unsigned *s1, long unsigned *s2, size_t nmemb,
int (*compar)(const void *, const void *))
{
int rc = 0;

while (nmemb--) {
rc = compar(s1, s2);
if (rc != 0) {
break;
}
++s1;
++s2;
}
return rc;
}

/* END q1sort_test.c */

--
pete

From: Jon Harrop on 20 Jan 2010 12:13

Pascal J. Bourguignon wrote:
> Daniel Pitts <newsgroup.spamfilter(a)virtualinfinity.net> writes:
>> If the pivot is chosen in O(1) time, then a data-set could cause a
>> pathological condition and quicksort's performance would degrade to
>> near n^2 time complexity, instead of n log n.
>
> The pivot could be choose randomly (in O(1)). However, the input set
> could still be (2 2 2 2 2 2 ... 2), so that a blind quick sort could
> still be O(n²).

That kind of "blind" quicksort is always going to be O(n^2) if a proportion
of the input values are the same but the solution is easy: just subdivide
into xs<ys<zs and only quicksort the xs and zs.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u

From: pete on 21 Jan 2010 18:54

Jon Harrop wrote:
> Pascal J. Bourguignon wrote:
>> Daniel Pitts <newsgroup.spamfilter(a)virtualinfinity.net> writes:
>>> If the pivot is chosen in O(1) time, then a data-set could cause a
>>> pathological condition and quicksort's performance would degrade to
>>> near n^2 time complexity, instead of n log n.
>> The pivot could be choose randomly (in O(1)). However, the input set
>> could still be (2 2 2 2 2 2 ... 2), so that a blind quick sort could
>> still be O(n²).
>
> That kind of "blind" quicksort is always going to be O(n^2) if a proportion
> of the input values are the same but the solution is easy: just subdivide
> into xs<ys<zs and only quicksort the xs and zs.

I think it's much easier just to use Sedgewick style loops
and to swap array elements which compare equal to the pivot.

--
pete

From: Stephen Howe on 29 Jan 2010 10:55

>It also happens to be the paper under discussion.
>And as Jon Harrop has stated,
>the paper does not apply to all quicksorts.

Yes does the paper apply to quicksorts where the pivot selection is O(1) ?
If the pivot selection is O(N) you can gurantee to get the median every time.

I think I am saying that all quicksorts will have at least 1 worst case if the pivot selection is O(1)

Pardon me for misunderstanding you :-)

Cheers

Stephen Howe

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10
Prev: hi its a very useful blog visit this blog
Next: How to find which system call the process in?