From: Alex Fraser on
On 10/02/2010 16:29, Richard Harter wrote:
> On Tue, 9 Feb 2010 21:19:34 -0800 (PST), mike3
> <mike4ty4(a)yahoo.com> wrote:
>
>> Hi.
>>
>> I was wondering: Suppose we use a binary heap to keep a list of
>> objects sorted by associated "keys". How can one enforce a consistent
>> LIFO or FIFO behavior of this when dealing with multiple objects with
>> the same "key" (i.e. tie breaking)?
>
> The obvious and simple way to do this is to add a field that has
> the entry index, i.e., the first item entered has index 1, the
> second index 2, and so on. The index is your tie breaker.

I'd recommend this, though a 32-bit counter could easily be exhausted
during a run of a program. An unsigned 64-bit one should outlast the
software - I make it over 500 years at a billion insertions per second.

All the alternatives seem likely to have similar or greater space
overhead, equal or worse time complexity, and greater code complexity.

> If this won't work [...]

Am I missing a situation where it won't, or are you just not sure there
aren't any?

Alex
From: Calum on
On Feb 10, 4:29 pm, c...(a)tiac.net (Richard Harter) wrote:
> On Tue, 9 Feb 2010 21:19:34 -0800 (PST), mike3
>
> <mike4...(a)yahoo.com> wrote:
> >Hi.
>
> >I was wondering: Suppose we use a binary heap to keep a list of
> >objects sorted by associated "keys". How can one enforce a consistent
> >LIFO or FIFO behavior of this when dealing with multiple objects with
> >the same "key" (i.e. tie breaking)?
>
> The obvious and simple way to do this is to add a field that has
> the entry index, i.e., the first item entered has index 1, the
> second index 2, and so on.  The index is your tie breaker.
>
> If this won't work there is a data structure that I call a heap
> structured heap tree.  (I may have invented it - I can't find any
> references.)  A binary tree is a heap structure search tree if
>
> Definition: A min heap structured search tree (min HSST) is a
> binary tree with the following properties:
>
> 1. For each node all elements in the left subtree are less than
>    the elements in the right subtree.
> 2. The parent node is less than any of its children.
> Adopting the convention that a newly inserted element is greater
> that existing elements with the same key gives your heap.
>
> There is an article that describes them:http://home.tiac.net/~cri/2008/traversal.html

I think the crucial issue is whether the heap is stored in a binary
tree or in an array. I would by default assume that heaps are stored
in arrays.

The problem as I see it with your min HSST is that they can't be
stored in arrays efficiently, as inserting an element into the heap is
O(n), not O(ln n)

This problem is related to the heapsort algorithm, which is
"unstable", and I don't know of any way in which the heapsort
algorithm can be made stable without adding an additional index as you
describe above.

>
> The article is incorrect about performing rotations.  It is true
> that they cannot be done in the same way as in binary search
> trees but there is an equivalent operation.
>
> HTH
>
> Richard Harter, c...(a)tiac.nethttp://home.tiac.net/~cri,http://www.varinoma.com
> Infinity is one of those things that keep philosophers busy when they
> could be more profitably spending their time weeding their garden.

From: Richard Harter on
On Wed, 10 Feb 2010 21:07:21 +0000, Alex Fraser <me(a)privacy.net>
wrote:

>On 10/02/2010 16:29, Richard Harter wrote:
>> On Tue, 9 Feb 2010 21:19:34 -0800 (PST), mike3
>> <mike4ty4(a)yahoo.com> wrote:
>>
>>> Hi.
>>>
>>> I was wondering: Suppose we use a binary heap to keep a list of
>>> objects sorted by associated "keys". How can one enforce a consistent
>>> LIFO or FIFO behavior of this when dealing with multiple objects with
>>> the same "key" (i.e. tie breaking)?
>>
>> The obvious and simple way to do this is to add a field that has
>> the entry index, i.e., the first item entered has index 1, the
>> second index 2, and so on. The index is your tie breaker.
>
>I'd recommend this, though a 32-bit counter could easily be exhausted
>during a run of a program. An unsigned 64-bit one should outlast the
>software - I make it over 500 years at a billion insertions per second.

The standard trick for binary trees is to use a separate count
for each key. You don't need to maintain actual counters. The
search through the equal keys reveals the highest current counter
value for that key. However this doesn't work for heaps because
the same-key records can be scattered between branches.

What you can do is check for counter overflow and reassign the
existing count fields when there is an overflow. THe cost of
this is nominal.
>
>All the alternatives seem likely to have similar or greater space
>overhead, equal or worse time complexity, and greater code complexity.

Agreed.

>
> > If this won't work [...]
>
>Am I missing a situation where it won't, or are you just not sure there
>aren't any?

"Won't work" means not a solution given the constraints. We
don't know what the constraints are but the OP does. Frex he
might not want to provide space for a count field.


Richard Harter, cri(a)tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.
From: Richard Harter on
On Thu, 11 Feb 2010 02:57:40 -0800 (PST), Calum
<calum74(a)googlemail.com> wrote:

>On Feb 10, 4:29=A0pm, c...(a)tiac.net (Richard Harter) wrote:
>> On Tue, 9 Feb 2010 21:19:34 -0800 (PST), mike3
[snip description]

>> There is an article that describes them:http://home.tiac.net/~cri/2008/tr=
>aversal.html
>
>I think the crucial issue is whether the heap is stored in a binary
>tree or in an array. I would by default assume that heaps are stored
>in arrays.
>
>The problem as I see it with your min HSST is that they can't be
>stored in arrays efficiently, as inserting an element into the heap is
>O(n), not O(ln n)

But that isn't true. Insertion into an HSST in an array can be
done in O(ln n) time. The algorithms are similar to sifting up
and sifting down.

>
>This problem is related to the heapsort algorithm, which is
>"unstable", and I don't know of any way in which the heapsort
>algorithm can be made stable without adding an additional index as you
>describe above.

Interestingly enough, an HSST can be used for a stable inplace
worst case O(n*log(n)) sort. The plan is to begin the HSST at
the beginning of the array and move elements into it one at a
time. At the end you permute the array to change it from HSST
order to linear order. The final permutation is well defined and
is O(n). This may be a previously unpublished sort. I wouldn't
expect the performance to be great - it would have the same cache
issues as heapsort does.

I suppose one can do something similar with an ordinary BST to
keep it in an array, but it strikes me as a messier problem.

In any event, if it really matters one can maintain a stable heap
without using an index field.


>
>>
>> The article is incorrect about performing rotations. =A0It is true
>> that they cannot be done in the same way as in binary search
>> trees but there is an equivalent operation.

Hmmm, I should go back and add some more material to correct
that.


Richard Harter, cri(a)tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.
From: Alex Fraser on
On 11/02/2010 17:14, Richard Harter wrote:
> On Wed, 10 Feb 2010 21:07:21 +0000, Alex Fraser<me(a)privacy.net>
> wrote:
>> On 10/02/2010 16:29, Richard Harter wrote:
>>> On Tue, 9 Feb 2010 21:19:34 -0800 (PST), mike3
>>> <mike4ty4(a)yahoo.com> wrote:
>>>> I was wondering: Suppose we use a binary heap to keep a list of
>>>> objects sorted by associated "keys". How can one enforce a consistent
>>>> LIFO or FIFO behavior of this when dealing with multiple objects with
>>>> the same "key" (i.e. tie breaking)?
>>>
>>> The obvious and simple way to do this is to add a field that has
>>> the entry index, i.e., the first item entered has index 1, the
>>> second index 2, and so on. The index is your tie breaker.
>>
>> I'd recommend this, though a 32-bit counter could easily be exhausted
>> during a run of a program. An unsigned 64-bit one should outlast the
>> software - I make it over 500 years at a billion insertions per second.
[snip]
> What you can do is check for counter overflow and reassign the
> existing count fields when there is an overflow. THe cost of
> this is nominal.

Sorry if I'm being dense, but I'm not sure what you mean. Please can you
explain?

>> All the alternatives seem likely to have similar or greater space
>> overhead, equal or worse time complexity, and greater code complexity.
>
> Agreed.
>
>>> If this won't work [...]
>>
>> Am I missing a situation where it won't, or are you just not sure there
>> aren't any?
>
> "Won't work" means not a solution given the constraints. We
> don't know what the constraints are but the OP does. Frex he
> might not want to provide space for a count field.

The applicable set of constraints is what I meant by "situation". The
only relevant constraint I could (and still the only one I can) think of
was the additional space due to the count field, but per the comment
from my previous post, if that is the case then there may be no solution
without sacrificing time complexity.

Alex