testing HS/SR - 1 vs 2 performance [PgSql]

From: Tom Lane on 27 Apr 2010 14:53

Hmm ... there's another point here, which is that the array size creates
a hard maximum on the number of entries, whereas the hash table was a
bit more forgiving. What is the proof that the array won't overflow?
The fact that the equivalent data structure on the master can't hold
more than this many entries doesn't seem to me to prove that, because
we will add intermediate not-observed XIDs to the array.
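
To illustrate the concern, here is a minimal C sketch of how filling in intermediate not-observed XIDs can exhaust a fixed-size array. It is not PostgreSQL source: the `KnownXids` struct and `record_known_xid` function are hypothetical names invented for the example, and the reserved special XIDs are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical fixed-capacity array standing in for the standby's
 * known-assigned-XIDs structure. */
typedef struct
{
    TransactionId *xids;            /* caller-provided storage */
    size_t         count;           /* entries currently in use */
    size_t         capacity;        /* hard maximum number of entries */
    TransactionId  latest_observed; /* newest XID seen so far */
} KnownXids;

/*
 * Record xid together with every XID between latest_observed and xid that
 * was never seen in the WAL stream. Returns false, leaving the array
 * unchanged, if the whole gap does not fit in the remaining space.
 */
bool
record_known_xid(KnownXids *k, TransactionId xid)
{
    assert(k != NULL && k->count <= k->capacity);

    /* Modular comparison: nothing to do if xid is not newer. */
    if ((int32_t) (xid - k->latest_observed) <= 0)
        return true;

    uint32_t gap = xid - k->latest_observed;

    /* One observed XID can require many slots at once. */
    if (gap > k->capacity - k->count)
        return false;

    for (uint32_t i = 1; i <= gap; i++)
        k->xids[k->count++] = k->latest_observed + i;

    k->latest_observed = xid;
    return true;
}
```

With a capacity of 4 and `latest_observed` at 100, observing XID 103 consumes three slots (101, 102, 103), and a later XID 105 no longer fits even though only two XIDs were ever observed directly.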

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on 27 Apr 2010 15:29

On Tue, 2010-04-27 at 14:53 -0400, Tom Lane wrote:
> Hmm ... there's another point here, which is that the array size creates
> a hard maximum on the number of entries, whereas the hash table was a
> bit more forgiving. What is the proof that the array won't overflow?
> The fact that the equivalent data structure on the master can't hold
> more than this many entries doesn't seem to me to prove that, because
> we will add intermediate not-observed XIDs to the array.

We know that not-observed xids have actually been allocated on the
primary. We log an assignment record every 64 subtransactions, so that
the peak size of the array is 65 xids per connection.

It's possible for xids to stay in the array for longer, in the event of
a FATAL error that doesn't log an abort record. We clean those up every
checkpoint, if they exist. The potential number of them is unbounded, so
making special allowance for them doesn't remove the theoretical risk.
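
The arithmetic behind that bound can be sketched in C as follows. This is only an illustration of the reasoning in this message: the function names are hypothetical, and the assumption that the array is sized as exactly 65 entries per connection is taken from the figures above rather than from the source code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An assignment record is logged every 64 subtransactions (per the message). */
#define SUBXIDS_PER_ASSIGNMENT_RECORD 64
/* Peak number of XIDs one connection can have in the array (per the message). */
#define PEAK_XIDS_PER_CONNECTION (SUBXIDS_PER_ASSIGNMENT_RECORD + 1)

/* Assumed array capacity: peak per-connection usage times connection count. */
size_t
known_xids_capacity(size_t max_connections)
{
    assert(max_connections <= SIZE_MAX / PEAK_XIDS_PER_CONNECTION);
    return (size_t) PEAK_XIDS_PER_CONNECTION * max_connections;
}

/*
 * Would the array overflow, given live_xids entries from running
 * transactions plus leaked_xids entries left behind by FATAL errors that
 * logged no abort record (and have not yet been cleaned up at a checkpoint)?
 */
bool
known_xids_would_overflow(size_t max_connections,
                          size_t live_xids,
                          size_t leaked_xids)
{
    size_t capacity = known_xids_capacity(max_connections);

    if (live_xids > capacity)
        return true;
    return leaked_xids > capacity - live_xids;
}
```

For 100 connections the capacity comes to 6500 entries. If every connection is at its peak, a single leaked XID is enough to overflow, which is the theoretical risk described above.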

--
Simon Riggs www.2ndQuadrant.com

From: Tom Lane on 27 Apr 2010 17:24

Isn't the snapshotOldestActiveXid filter in
RecordKnownAssignedTransactionIds completely wrong/useless/bogus?

AFAICS, snapshotOldestActiveXid is only set once at the start of
recovery. This means it will soon be too old to provide any useful
filtering. But what's far worse is that the XID space will eventually
wrap around, and that test will start filtering *everything*.

I think we should just lose that test, as well as the variable.
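
The wraparound hazard can be shown with a small C sketch. It assumes the filter compares XIDs with the usual modulo-2^32 "precedes" test; `xid_precedes` and `filtered_out` are hypothetical stand-ins written for this example, not the actual code in RecordKnownAssignedTransactionIds, and special XIDs are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Modular "a is older than b" test: the signed 32-bit difference decides.
 * Relies on the usual two's-complement conversion to int32_t. */
bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical filter: ignore XIDs older than a reference value that was
 * captured once at the start of recovery and never advanced. */
bool
filtered_out(TransactionId xid, TransactionId oldest_active_at_start)
{
    return xid_precedes(xid, oldest_active_at_start);
}

/* Self-check showing how a stale reference misbehaves over time. */
void
wraparound_demo(void)
{
    TransactionId ref = 1000;   /* reference fixed at recovery start */

    assert(filtered_out(999, ref));                /* genuinely old: filtered */
    assert(!filtered_out(1001, ref));              /* new XID: kept */
    assert(!filtered_out(ref + 0x7FFFFFFFu, ref)); /* just under 2^31 later: kept */
    assert(filtered_out(ref + 0x80000001u, ref));  /* past 2^31: new XID looks old */
}
```

Once more than 2^31 XIDs have been consumed since the reference was captured, every newly assigned XID compares as older than the reference and is filtered, until the counter comes back around.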

regards, tom lane

From: "Erik Rijkers" on 4 May 2010 12:10

Hi Simon,

In another thread you mentioned you were lacking information from me:

On Tue, May 4, 2010 17:10, Simon Riggs wrote:
>
> There is no evidence that Erik's strange performance has anything to do
> with HS; it hasn't been seen elsewhere and he didn't respond to
> questions about the test setup to provide background. The profile didn't
> fit any software problem I can see.
>

I'm sorry if I missed requests for things that were not already mentioned.

Let me repeat:
OS: CentOS 5.4
2 quadcores: Intel(R) Xeon(R) CPU X5482 @ 3.20GHz
Areca 1280ML
primary and standby db both on a 12-disk array (SATA 7200rpm, Seagate Barracuda ES.2)

It goes without saying (I hope) that apart from the pgbench tests
and a few ssh sessions (myself), the machine was idle.

It would be interesting if anyone repeated these simple tests and produced
evidence that these non-HS.

(Unfortunately, I have at the moment not much time for more testing)

thanks,

Erik Rijkers

On Sun, April 25, 2010 21:07, Simon Riggs wrote:
> On Sun, 2010-04-25 at 20:25 +0200, Erik Rijkers wrote:
>
>> Sorry if it's too much data, but to me at least it was illuminating;
>> I now understand the effects of the different parameters better.
>
> That's great, many thanks.
>
> A few observations
>
> * Standby performance is actually slightly above normal running. This is
> credible because of the way snapshots are now taken. We don't need to
> scan the procarray looking for write transactions, since we know
> everything is read only. So we scan just the knownassignedxids, which if
> no activity from primary will be zero-length, so snapshots will actually
> get taken much faster in this case on standby. The snapshot performance
> on standby is O(n) where n is the number of write transactions
> "currently" on primary (transfer delays blur the word "currently").
>
> * The results for scale factor < 100 are fine, and the results for >100
> with few connections get thrown out by long transaction times. With
> larger numbers of connections the wait problems seem to go away. Looks
> like Erik (and possibly Hot Standby in general) has an I/O problem,
> though "from what" is not yet determined. It could be just hardware, or
> might be hardware plus other factors.
>
> --
> Simon Riggs www.2ndQuadrant.com
>
>

From: Simon Riggs on 4 May 2010 12:19

On Tue, 2010-05-04 at 18:10 +0200, Erik Rijkers wrote:
> It would be interesting if anyone repeated these simple tests and produced
> evidence that these non-HS.
>
> (Unfortunately, I have at the moment not much time for more testing)

Would you be able to make those systems available for further testing?

First, I'd perform the same test with the systems swapped, so we know
more about the symmetry of the issue. After that, I would like to look
more into the internals.

Is it possible to set up SystemTap and DTrace on these systems?

Thanks, either way.

--
Simon Riggs www.2ndQuadrant.com
