From: Simon Riggs on
On Fri, 2010-04-23 at 19:07 -0400, Robert Haas wrote:
> On Fri, Apr 23, 2010 at 6:39 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> > On Fri, 2010-04-23 at 11:32 -0400, Robert Haas wrote:
> >> >
> >> > 99% of transactions complete in similar times on primary and standby,
> >> > with everything dragged down by rare but severe spikes.
> >> >
> >> > We're looking for something that can delay an operation that normally
> >> > takes <0.1ms until it takes >100ms, yet does eventually
> >> > return. That looks like a severe resource contention issue.
> >>
> >> Wow. Good detective work.
> >
> > While we haven't fully established the source of those problems, I am
> > now happy that these test results don't present any reason to avoid
> > committing the main patch tested by Erik (not the smaller additional one
> > I sent). I expect to commit that on Sunday.
>
> Both Heikki and I objected to that patch.

Please explain your objection, based upon the patch and my explanations.

> And apparently it doesn't
> fix the problem, either. So, -1 from me.

There is an issue observed in Erik's later tests, but my interpretation
of the results so far is that the sorted array patch successfully
removes the initially reported loss of performance.
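
For anyone who hasn't read the patch: the essence of the sorted-array
approach is that checks against the known-assigned XIDs become a binary
search rather than a scan. A minimal sketch of that idea follows
(illustrative only, with hypothetical names; not the patch itself):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* as in PostgreSQL's c.h */

/*
 * Illustrative sketch, not the patch: lookups against a sorted array
 * of XIDs become a binary search, O(log n) per check instead of a
 * linear scan.  Assumes the array is kept in ascending order and that
 * XID wraparound is handled elsewhere.
 */
static bool
xid_in_sorted_array(const TransactionId *xids, int nxids, TransactionId xid)
{
    int         low = 0;
    int         high = nxids - 1;

    while (low <= high)
    {
        int         mid = low + (high - low) / 2;

        if (xids[mid] == xid)
            return true;
        else if (xids[mid] < xid)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return false;
}

The point is O(log n) per lookup on a structure consulted for every
snapshot taken during recovery.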

--
Simon Riggs www.2ndQuadrant.com



From: Simon Riggs on
On Thu, 2010-04-22 at 23:45 +0100, Simon Riggs wrote:
> On Thu, 2010-04-22 at 20:39 +0200, Erik Rijkers wrote:
> > On Sun, April 18, 2010 13:01, Simon Riggs wrote:
>
> > any comment is welcome...
>
> Please can you re-run with -l and post me the file of times

Erik has sent me details of a test run. My analysis of that is:

I'm seeing the response time profile on the standby as
99% <110us
99.9% <639us
99.99% <615ms

0.052% (52 samples) take >5ms and account for 24s, which is about 45%
of the total elapsed time.

Of the 52 samples >5ms, 50 are >100ms and 2 are >1s.

99% of transactions complete in similar times on primary and standby,
with everything dragged down by rare but severe spikes.

We're looking for something that can delay an operation that normally
takes <0.1ms until it takes >100ms, yet does eventually return. That
looks like a severe resource contention issue.
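
For reference, the profile above can be reproduced from the -l log with
something along these lines. This is a sketch: it assumes the
per-transaction elapsed time in microseconds is the third
whitespace-separated field of each log line, which is what pgbench of
this vintage writes; adjust the sscanf if your format differs.

#include <stdio.h>
#include <stdlib.h>

/* Comparator for qsort over microsecond latencies. */
static int
cmp_long(const void *a, const void *b)
{
    long        la = *(const long *) a;
    long        lb = *(const long *) b;

    return (la > lb) - (la < lb);
}

int
main(void)
{
    long       *lat = NULL;
    size_t      n = 0,
                cap = 0;
    char        line[256];

    /* Assumes field 3 is the per-transaction elapsed time in usec. */
    while (fgets(line, sizeof(line), stdin))
    {
        int         client,
                    xact;
        long        usec;

        if (sscanf(line, "%d %d %ld", &client, &xact, &usec) != 3)
            continue;
        if (n == cap)
        {
            cap = cap ? cap * 2 : 1024;
            lat = realloc(lat, cap * sizeof(long));
            if (lat == NULL)
                return 1;
        }
        lat[n++] = usec;
    }
    if (n == 0)
        return 1;

    qsort(lat, n, sizeof(long), cmp_long);

    /* Rough nearest-rank percentiles. */
    const double pcts[] = {0.99, 0.999, 0.9999};

    for (int i = 0; i < 3; i++)
    {
        size_t      idx = (size_t) (pcts[i] * (n - 1));

        printf("%.2f%% < %ld us\n", pcts[i] * 100.0, lat[idx]);
    }
    free(lat);
    return 0;
}

Run it as e.g. ./latprof < pgbench_log.<pid>, the file pgbench -l
writes.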

This effect happens when running just a single read-only pgbench session
on the standby. No confirmation yet as to whether recovery is active or
dormant, or what other activity, if any, occurs on the standby server at
the same time. So no other clues yet as to what the contention might be,
except that we note the standby is writing data and the database is
large.

> Please also rebuild using --enable-profile so we can see what's
> happening.
>
> Can you also try the enclosed patch which implements prefetching during
> replay of btree delete records. (Need to set effective_io_concurrency)

As yet, no confirmation that the attached patch is even relevant. It was
just a wild guess at some tuning, while we wait for further info.
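
For those without the patch in front of them: the technique it rests on
is the standard posix_fadvise() hint, the same mechanism
effective_io_concurrency drives elsewhere in the backend. The idea is
to issue WILLNEED advice for blocks the upcoming WAL records will
touch, so the reads overlap with replay instead of stalling it. A
bare-bones sketch of the syscall-level idea (not the patch; real
backend code goes through smgr/bufmgr, not raw file descriptors):

#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>

#define BLCKSZ 8192                 /* PostgreSQL's default block size */

typedef uint32_t BlockNumber;

/*
 * Sketch of the prefetch technique: tell the kernel we will need this
 * block soon, so the read is started asynchronously instead of
 * stalling the startup process when the buffer is actually requested.
 */
static void
prefetch_block(int fd, BlockNumber blkno)
{
    (void) posix_fadvise(fd,
                         (off_t) blkno * BLCKSZ,
                         BLCKSZ,
                         POSIX_FADV_WILLNEED);
}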

> Thanks for your further help.

"Some kind of contention" is best we can say at present.

--
Simon Riggs www.2ndQuadrant.com



From: Robert Haas on
On Fri, Apr 23, 2010 at 11:14 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Thu, 2010-04-22 at 23:45 +0100, Simon Riggs wrote:
>> On Thu, 2010-04-22 at 20:39 +0200, Erik Rijkers wrote:
>> > On Sun, April 18, 2010 13:01, Simon Riggs wrote:
>>
>> > any comment is welcome...
>>
>> Please can you re-run with -l and post me the file of times
>
> Erik has sent me details of a test run. My analysis of that is:
>
> I'm seeing the response time profile on the standby as
> 99% <110us
> 99.9% <639us
> 99.99% <615ms
>
> 0.052% (52 samples) take >5ms and account for 24s, which is about 45%
> of the total elapsed time.
>
> Of the 52 samples >5ms, 50 are >100ms and 2 are >1s.
>
> 99% of transactions complete in similar times on primary and standby,
> with everything dragged down by rare but severe spikes.
>
> We're looking for something that can delay an operation that normally
> takes <0.1ms until it takes >100ms, yet does eventually return. That
> looks like a severe resource contention issue.

Wow. Good detective work.

....Robert


From: Marko Kreen on
On 4/18/10, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Sat, 2010-04-17 at 16:48 -0400, Tom Lane wrote:
> > There are some places where we suppose that a *single* write into shared
> > memory can safely be done without a lock, if we're not too concerned
> > about how soon other transactions will see the effects. But what you
> > are proposing here requires more than one related write.
> >
> > I've been burnt by this myself:
> > http://archives.postgresql.org/pgsql-committers/2008-06/msg00228.php
>
>
> W O W - thank you for sharing.
>
> What I'm not clear on is why you've used a spinlock everywhere when only
> weak-memory thang CPUs are a problem. Why not have a weak-memory-protect
> macro that does nada when the hardware already protects us? (i.e. a
> spinlock only for the hardware that needs it).

Um, you have been burned by exactly this on x86 also:

http://archives.postgresql.org/pgsql-hackers/2009-03/msg01265.php
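
The macro Simon describes would amount to a write barrier that is a
real fence only on weakly ordered hardware. But note that even on x86
the compiler half of the barrier is still required, which is exactly
the trap. A sketch (names and the architecture list are illustrative,
not exhaustive):

/*
 * Store-store barrier: a real fence only on weakly ordered hardware.
 * In the else-branch, even though x86 keeps ordinary stores in program
 * order, the *compiler* must still be told not to reorder them.
 */
#if defined(__powerpc__) || defined(__powerpc64__)
#define write_barrier() __asm__ __volatile__ ("lwsync" ::: "memory")
#elif defined(__sparc__)
#define write_barrier() __asm__ __volatile__ ("membar #StoreStore" ::: "memory")
#else
/* x86 and friends: compiler barrier only */
#define write_barrier() __asm__ __volatile__ ("" ::: "memory")
#endif

/*
 * Intended usage: publish the element, then advance the count.
 *
 *      arr[n] = item;
 *      write_barrier();
 *      n++;
 */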

--
marko


From: Tom Lane on
Marko Kreen <markokr(a)gmail.com> writes:
> Um, you have been burned by exactly this on x86 also:
> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01265.php

Yeah, we never did figure out exactly how come you were observing that
failure on Intel-ish hardware. I was under the impression that Intel
machines didn't have weak-memory-ordering behavior.

I wonder whether your compiler had rearranged the code in ProcArrayAdd
so that the increment happened before the array element store at the
machine-code level. I think it would be entitled to do that under
standard C semantics, since that ProcArrayStruct pointer isn't marked
volatile.
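
To make that concrete, the pattern reduces to two related stores (a
reduced sketch, not the actual ProcArrayAdd code):

typedef struct
{
    int     numProcs;
    void   *procs[64];          /* size is illustrative */
} ProcArrayStructSketch;

/*
 * Without 'volatile' (or an explicit barrier) the compiler may emit
 * the numProcs increment before the element store; a reader that sees
 * the new count could then fetch garbage from procs[numProcs - 1].
 */
static void
proc_array_add_unsafe(ProcArrayStructSketch *arrayP, void *proc)
{
    arrayP->procs[arrayP->numProcs] = proc;     /* store 1 */
    arrayP->numProcs++;                         /* store 2: reorderable */
}

static void
proc_array_add_ordered(ProcArrayStructSketch *ap, void *proc)
{
    volatile ProcArrayStructSketch *arrayP = ap;

    arrayP->procs[arrayP->numProcs] = proc;
    /* volatile accesses may not be reordered against each other by the
     * compiler; on weak-memory hardware you would still need a write
     * barrier, or the lock the real code holds. */
    arrayP->numProcs++;
}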

regards, tom lane
