From: Simon Riggs on
On Wed, 2010-01-20 at 11:04 +0100, Andres Freund wrote:
> On Wednesday 20 January 2010 10:52:24 Simon Riggs wrote:
> > On Wed, 2010-01-20 at 10:45 +0100, Andres Freund wrote:
> > > LWLockAcquire
> >
> > I'm using spinlocks, not lwlocks.
> CancelDBBackends which is used in SendRecoveryConflictWithBufferPin which in
> turn used by CheckStandbyTimeout triggered by SIGALRM acquires the lwlock.

Those are used in similar ways to deadlock detection.

> Now that case is a bit less dangerous because you would have to interrupt
> yourself to trigger a deadlock there because the code sleeps soon after
> setting up the handler.
> If ever two SIGALRM occur consecutive there is a problem.

I'll protect against subsequent calls.

--
Simon Riggs www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andres Freund on
On Wednesday 20 January 2010 11:33:05 Simon Riggs wrote:
> On Wed, 2010-01-20 at 11:04 +0100, Andres Freund wrote:
> > On Wednesday 20 January 2010 10:52:24 Simon Riggs wrote:
> > > On Wed, 2010-01-20 at 10:45 +0100, Andres Freund wrote:
> > > > LWLockAcquire
> > >
> > > I'm using spinlocks, not lwlocks.
> >
> > CancelDBBackends which is used in SendRecoveryConflictWithBufferPin which
> > in turn used by CheckStandbyTimeout triggered by SIGALRM acquires the
> > lwlock.
>
> Those are used in similar ways to deadlock detection.
But only if
ImmediateInterruptOK && InterruptHoldoffCount == 0 && CritSectionCount == 0 -
which is not the case with HoldingBufferPinThatDelaysRecovery.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Andres Freund <andres(a)anarazel.de> writes:
> On Wednesday 20 January 2010 06:30:28 Tom Lane wrote:
>> Er ... what? I believe there are live platforms with sig_atomic_t = char.
>> If we're assuming more that's a must-fix.

> The reason I have asked is that the code is doing things like:
> [ grabbing a spinlock to read a single integer ]

Yes, I think we probably actually need that. The problem is not so
much whether the read is an atomic operation as whether you can rely
on getting an up-to-date value. On multiprocessors with weak memory
ordering you need some type of "sync" instruction to be sure you will
see a value that was recently written by another processor. Currently,
we embed such instructions in the spinlock acquire/release code.
There's been some discussion of exposing memory sync independently
of lock acquisition; perhaps that would be enough here, but I haven't
looked at the surrounding logic enough to say.

My complaint at the top was responding to the idea that someone might
be supposing the specific type sig_atomic_t was at least as wide as
int. That's a different matter altogether. We do assume in some places
that we can read or write the specific type TransactionId indivisibly,
but we don't try to declare it as sig_atomic_t.

> or similar things with LWLockAcquire in a signal handler

[ grows visibly pale ] *Please* tell me we are not trying to take
locks in a signal handler. What happens if it interrupts code that
is already holding that lock?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andres Freund on
On Wednesday 20 January 2010 17:30:04 Tom Lane wrote:
> Andres Freund <andres(a)anarazel.de> writes:
> > On Wednesday 20 January 2010 06:30:28 Tom Lane wrote:
> >> Er ... what? I believe there are live platforms with sig_atomic_t =
> >> char. If we're assuming more that's a must-fix.
> >
> > The reason I have asked is that the code is doing things like:
> > [ grabbing a spinlock to read a single integer ]
>
> Yes, I think we probably actually need that. The problem is not so
> much whether the read is an atomic operation as whether you can rely
> on getting an up-to-date value. On multiprocessors with weak memory
> ordering you need some type of "sync" instruction to be sure you will
> see a value that was recently written by another processor. Currently,
> we embed such instructions in the spinlock acquire/release code.
> There's been some discussion of exposing memory sync independently
> of lock acquisition; perhaps that would be enough here, but I haven't
> looked at the surrounding logic enough to say.
I think it should be enough.

I realize its way too late in the cycle for that, but why dont we start using
some library for easy cross platform atomic ops? I think libatomic or such
should support the required platforms.

> My complaint at the top was responding to the idea that someone might
> be supposing the specific type sig_atomic_t was at least as wide as
> int. That's a different matter altogether. We do assume in some places
> that we can read or write the specific type TransactionId indivisibly,
> but we don't try to declare it as sig_atomic_t.
So we already assume that? Fine.

(yes, the sig_atomic_t was a sidetrack - I had memorized it wrongly as "the
biggest value that can be read/written atomically which is *clearly* wrong)

> > or similar things with LWLockAcquire in a signal handler
>
> [ grows visibly pale ] *Please* tell me we are not trying to take
> locks in a signal handler. What happens if it interrupts code that
> is already holding that lock?
Yes the patch does that at two places. Thats what I was complaining about and
what triggered my sig_atomic_t question because of the above explained
misunderstanding.


Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Andres Freund <andres(a)anarazel.de> writes:
> I realize its way too late in the cycle for that, but why dont we start using
> some library for easy cross platform atomic ops?

(1) there probably isn't one that does exactly what we want, works
everywhere, and has the right license;
(2) what actual gain would we get? We've already done the work.

>> [ grows visibly pale ] *Please* tell me we are not trying to take
>> locks in a signal handler. What happens if it interrupts code that
>> is already holding that lock?

> Yes the patch does that at two places.

That's a must-fix.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers