lock_timeout GUC patch - Review [PgSql]

Prev: Query results differ depending on operating system (using GIN)
Next: review: psql: edit function, show function commands patch

From: Marc Cousin on 2 Aug 2010 11:27

The Monday 02 August 2010 13:59:59, Boszormenyi Zoltan wrote :
> >
> > Also, I made sure that only one or two timeout causes (one of
> > deadlock_timeout
> > and lock_timeout in the first case or statement_timeout plus one of the
> > other two)
> > can be active at a time.
>
> A little clarification is needed. The above statement is not entirely true.
> There can be a case when all three timeout causes can be active, but only
> when deadlock_timeout is the smallest of the three. If the fin_time value
> for the another timeout cause is smaller than for deadlock_timeout then
> there's no point to make deadlock_timeout active. And in case
> deadlock_timeout
> triggers and CheckDeadLock() finds that there really is a deadlock then the
> possibly active lock_timeout gets disabled to avoid calling
> RemoveFromWaitQueue() twice. I hope the comments in the code explain it
> well.
>
> > Previously I was able to trigger a segfault
> > with the default
> > 1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
> > system's
> > clock resolution makes the lock_timeout and deadlock_timeout equal and
> > RemoveFromWaitQueue() was called twice. This way it's a lot more robust.
> >
>
Ok, I've tested this new version:

This time, it's this case that doesn't work :

Session 1 : lock a table

Session 2 : set lock_timeout to a large value, just to make it obvious (10 seconds).
It times out after 1 s (the deadlock timeout), returning 'could not obtain lock'.
Of course, it should wait 10 seconds.

I really feel that the timeout framework is the way to go here. I know what
you said about it before, but I think that the current code is getting
too complicated, with too many special cases all over.

As this is only my second review and I'm sure I am missing things here, could
someone with more experience have a look and give advice ?

Cheers

Marc

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 2 Aug 2010 15:09

Marc Cousin <cousinmarc(a)gmail.com> wrote:

> This time, it's this case that doesn't work :

> I really feel that the timeout framework is the way to go here.

Since Zolt�n also seems to feel this way:

http://archives.postgresql.org/message-id/4C516C3A.6090102(a)cybertec.at

I wonder whether this patch shouldn't be rejected with a request
that the timeout framework be submitted to the next CF. Does anyone
feel this approach (without the framework) should be pursued
further?

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Boszormenyi Zoltan on 2 Aug 2010 07:59

Boszormenyi Zoltan írta:
> Marc Cousin írta:
>
>> The Thursday 29 July 2010 13:55:38, Boszormenyi Zoltan wrote :
>>
>>
>>> I fixed this by adding CheckLockTimeout() function that works like
>>> CheckStatementTimeout() and ensuring that the same start time is
>>> used for both deadlock_timeout and lock_timeout if both are active.
>>> The preference of errors if their timeout values are equal is:
>>> statement_timeout > lock_timeout > deadlock_timeout
>>>
>>>
>> As soon as lock_timeout is bigger than deadlock_timeout, it doesn't
>> work, with this new version.
>>
>> Keeping the deadlock_timeout to 1s, when lock_timeout >= 1001,
>> lock_timeout doesn't trigger anymore.
>>
>>
>
> I missed one case when the lock_timeout_active should have been set
> but the timer must not have been re-set, this caused the problem.
> I blame the hot weather and having no air conditioning. The second is
> now fixed. :-)
>
> I also added one line in autovacuum.c to disable lock_timeout,
> in case it's globally set in postgresq.conf as per Alvaro's comment.
>
> Also, I made sure that only one or two timeout causes (one of
> deadlock_timeout
> and lock_timeout in the first case or statement_timeout plus one of the
> other two)
> can be active at a time.

A little clarification is needed. The above statement is not entirely true.
There can be a case when all three timeout causes can be active, but only
when deadlock_timeout is the smallest of the three. If the fin_time value
for the another timeout cause is smaller than for deadlock_timeout then
there's no point to make deadlock_timeout active. And in case
deadlock_timeout
triggers and CheckDeadLock() finds that there really is a deadlock then the
possibly active lock_timeout gets disabled to avoid calling
RemoveFromWaitQueue() twice. I hope the comments in the code explain it
well.

> Previously I was able to trigger a segfault
> with the default
> 1sec deadlock_timeout and lock_timeout = 999 or 1001. Effectively, the
> system's
> clock resolution makes the lock_timeout and deadlock_timeout equal and
> RemoveFromWaitQueue() was called twice. This way it's a lot more robust.
>

Best regards,
Zoltán Böszörményi

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 2 Aug 2010 16:11

Robert Haas <robertmhaas(a)gmail.com> wrote:

>> I wonder whether this patch shouldn't be rejected with a request
>> that the timeout framework be submitted to the next CF.

> I think "Returned with Feedback" would be more appropriate than
> "Rejected", since we're asking for a rework, rather than saying -
> we just don't want this. But otherwise, +1.

Done.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 2 Aug 2010 16:00

Boszormenyi Zoltan <zb(a)cybertec.at> wrote:
> Kevin Grittner �rta:

>> I wonder whether this patch shouldn't be rejected with a request
>> that the timeout framework be submitted to the next CF. Does
>> anyone feel this approach (without the framework) should be
>> pursued further?
>
> I certainly think so, the current scheme seems to be very fragile
> and doesn't want to be extended.

Sorry, I phrased that question in a rather confusing way; I'm not
sure, but it sounds like you're in favor of dropping this approach
and pursuing the timeout framework in the next CF -- is that right?

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3
Prev: Query results differ depending on operating system (using GIN)
Next: review: psql: edit function, show function commands patch