testing HS/SR - 1 vs 2 performance [PgSql]

Prev: pg_ctl stop -m immediate on the primary server inflatessequences
Next: [HACKERS] non-reproducible failure of random test on HEAD

From: Simon Riggs on 17 Apr 2010 10:16

On Fri, 2010-04-16 at 11:10 -0400, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > On Fri, 2010-04-16 at 10:39 -0400, Tom Lane wrote:
> >> I think you're outsmarting yourself there. A binary search will in fact
> >> *not work* with circular xid comparison (this is exactly why there's no
> >> btree opclass for XID).
>
> > I don't understand the exact, please explain more.
> > I'm not using bsearch() just a quick chop based upon xid comparison,
> > which looks to me like it will work.
>
> Implementing it yourself doesn't get you out of the fact that it won't
> work. Consider
>
> 1 2 3 ... 3000000000 ... 3999999999
>
> and suppose that 3000000000 is the array's middle element. If you
> search for 100, your first probe will conclude that it is to the right
> of 3000000000, which is the wrong direction. Binary search, or indeed
> the entire concept that the array is "sorted" in the first place,
> depends on a transitive comparison operator. TransactionIdFollows does
> not satisfy the transitive law.

AFAICS the example you give isn't correct.

We would lay out the values like this

W-3 W-2 W-1 W 3 4 5

where W is the wrap value and in this example we have 7 values in the
current array, with tail at W-3 and head at 5. Note the gap between W
and 3 where we would skip the values 1 and 2 because they are special.
Each element's xid is TransactionIdAdvanced(previous element).

So when we search for value 3 we would start from W, then decide it is
to the right, which is correct and continue from there.

The values are laid out in TransactionIdFollows order, not in numeric
order, hence we need to use TransactionIdFollows to decide which way to
branch.

As long as it works I'm not worried if the array is not technically
"sorted".

--
Simon Riggs www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 17 Apr 2010 11:13

Simon Riggs <simon(a)2ndQuadrant.com> writes:
> AFAICS the example you give isn't correct.

> We would lay out the values like this

> W-3 W-2 W-1 W 3 4 5

> where W is the wrap value

Stop right there, you're already failing to think clearly. There is no
unique "wrap value", all values act the same in circular XID space.

The fundamental problem here is that if the set of XIDs involved spans a
sufficiently long range, your array will have a[0] < a[1] < ... < a[n]
but it will fail to be true that a[0] < a[n]. If that's also true for,
say, a[0] vs the midpoint element, then a binary search for a[0] will
fail because it will make the wrong decision while probing the midpoint.

> The values are laid out in TransactionIdFollows order,

They *cannot* be "laid out in TransactionIdFollows order". It's not
possible, because that relationship isn't a total ordering.

Now it might be the case that this is OK for HS purposes because the set
of XIDs that are relevant at any one instant should never span more than
half of the XID space. But I'd just as soon not use that assumption
here if it's unnecessary. It'd be cheaper anyway to sort and search the
array using plain <, so why are you so eager to use
TransactionIdFollows?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 17 Apr 2010 11:25

On Sat, Apr 17, 2010 at 11:13 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
>> AFAICS the example you give isn't correct.
>
>> We would lay out the values like this
>
>> W-3 W-2 W-1 W 3 4 5
>
>> where W is the wrap value
>
> Stop right there, you're already failing to think clearly. There is no
> unique "wrap value", all values act the same in circular XID space.

Or to put it a different way, what happens when the "wrap value"
changes? An array that was in order under one wrap value can cease to
be in order under another.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on 17 Apr 2010 15:20

On Sat, 2010-04-17 at 11:13 -0400, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > AFAICS the example you give isn't correct.
>
> > We would lay out the values like this
>
> > W-3 W-2 W-1 W 3 4 5
>
> > where W is the wrap value
>
> Stop right there, you're already failing to think clearly. There is no
> unique "wrap value", all values act the same in circular XID space.
>
> The fundamental problem here is that if the set of XIDs involved spans a
> sufficiently long range, your array will have a[0] < a[1] < ... < a[n]
> but it will fail to be true that a[0] < a[n]. If that's also true for,
> say, a[0] vs the midpoint element, then a binary search for a[0] will
> fail because it will make the wrong decision while probing the midpoint.

I understand that the xids are circular, but there is only one point
where the next xid has a lower integer value but is still "later" from
the perspective of TransactionIdFollows. Apart from that single
discontinuity all other values are monotonic. From your insistence I
presume I must be missing something, but I just don't see it. Perhaps we
are just misunderstanding each other about the "sufficiently long
range".

> > The values are laid out in TransactionIdFollows order,
>
> They *cannot* be "laid out in TransactionIdFollows order". It's not
> possible, because that relationship isn't a total ordering.
>
> Now it might be the case that this is OK for HS purposes because the set
> of XIDs that are relevant at any one instant should never span more than
> half of the XID space.

Agree that is true.

> But I'd just as soon not use that assumption
> here if it's unnecessary.

I understand what your saying. I think it is necessary here.

> It'd be cheaper anyway to sort and search the
> array using plain <, so why are you so eager to use
> TransactionIdFollows?

The array grows to the right and is laid out one xid per element,
resulting in a sequence of values that are transactionid-monotonic. As
the values increase there will eventually reach the discontinuity where
they cease being normally monotonic. Doing it this way means that we can
add rows past the head of the array and then move the head atomically,
so that we can make adding xids lock-free.

If we try to actually sort the values then the algorithm is both more
complex and requires locking. It would be easier to just remember where
the discontinuity is and act accordingly.

So I'm not eager to use either way, but I only currently see one way
that would work.

If there's a different way, that gives the same or better algorithmic
characteristics, I will be happy to code it.

--
Simon Riggs www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 17 Apr 2010 15:45

Simon Riggs <simon(a)2ndQuadrant.com> writes:
> On Sat, 2010-04-17 at 11:13 -0400, Tom Lane wrote:
>> It'd be cheaper anyway to sort and search the
>> array using plain <, so why are you so eager to use
>> TransactionIdFollows?

> The array grows to the right and is laid out one xid per element,
> resulting in a sequence of values that are transactionid-monotonic.

How do you know that just adding items at the right will produce a
sorted array? It seems quite unlikely to me that transactions can be
guaranteed to arrive at this code in XID order. I think you need to do
an explicitly sorted insertion.

> ... Doing it this way means that we can
> add rows past the head of the array and then move the head atomically,
> so that we can make adding xids lock-free.

.... and even without that issue, this seems like utter fantasy. How
are you going to do that "atomically"? Have you considered what will
happen on weak-memory-ordering machines like PPC, in particular?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Prev: pg_ctl stop -m immediate on the primary server inflatessequences
Next: [HACKERS] non-reproducible failure of random test on HEAD