Concurrent MERGE [PgSql]

Prev: [HACKERS] Concurrent MERGE
Next: ECPG dynamic cursor fix for UPDATE/DELETE... WHERE CURRENT OF :curname

From: "Kevin Grittner" on 5 Aug 2010 16:07

Josh Berkus <josh(a)agliodbs.com> wrote:

> Anyway, here's some of the uses I'm thinking of:
>
> (1) Pre-insert lock: you know that you're going to insert a record
> with PK="X" later in a long-running SP, so you want to lock out
> other inserts of PK="X" at the beginning of the procedure.

Well, if we added a listener, you could SELECT the desired key, and
be notified of a conflicting insert, but that's not really what
you're looking for. It does seem to me that you could solve this
one by inserting the tuple and then updating it at the end, but I
suppose you're looking to avoid the resulting dead tuple. Perhaps a
listener could be fed to a "cancel the conflicting query" routine?
In any event, the only resolution to such a conflict is to kill
something, right? And right now, a write/write conflict would occur
which would resolve it, you just want to be able to "reserve" the
slot up front, so your transaction isn't canceled after doing a
bunch of work, right?

> (2) FK Locking: you plan to modify or delete a parent FK record in
> this transaction, so you want to prevent any updates or inserts on
> its related child records. (in my experience, FK-releated
> sharelocks are the #1 cause of deadlocking).

I don't see how that can be resolved without killing something, do
you? You would just have to replace the current deadlock with some
other form of serialization failure. (And no, I will never give up
the position that a deadlock *is* one of many forms of serialization
failure.)

> (3) No-duplicate queueing: you want to create a queue table which
> doesn't accept duplicate events, but you don't want it to be a
> source of deadlocks. This is a variant of (1), but a common case.

I must be missing something. Please explain how this would work
*without* serialization failures. As far as I can see, you can
replace deadlocks with some other form, but I don't see the point.
Basically, I think we should change the deadlock SQLSTATE to '40001'
and any code which needs to deal with such things treats that
SQLSTATE as meaning "that wasn't a good time to try that
transaction, try again in a bit."

Or, if you just want it to do nothing if the row already exists,
perhaps the new MERGE code would work?

> (4) Blackouts: records of type "x" aren't supposed to be created
> during period "y to y1" or while procedure "z" is running.
> Predicate locking can be used to prevent this more easily than
> adding and removing a trigger.

I would have thought that advisory locks covered this. In what way
do they fall short for this use case?

> (5) Debugging: (variant of 4) records of type "x" keep getting
> inserted in the table, and you don't know where they're coming
> from. You can predicate lock to force an error and debug it.

Hmmmm.... Assuming fine enough granularity (like from an index for
which a range could be locked to detect the conflict) adding a
listener to the SIRead lock handling would be good for this. Well,
as long as the transactions were serializable.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Josh Berkus on 5 Aug 2010 16:37

Kevin,

Overall, you're missing the point: there are workarounds for all of
these things now. However, they are *workarounds*, which means that
they are awkward, expensive, and/or hard to administrate; having
predicate locks would make things much easier.

> I don't see how that can be resolved without killing something, do
> you? You would just have to replace the current deadlock with some
> other form of serialization failure. (And no, I will never give up
> the position that a deadlock *is* one of many forms of serialization
> failure.)

If you're in lock nowait mode, you could get back a "can't lock" error
message immediately rather than waiting for the procedure to time out.
There's certainly going to be an error regardless; it's a question of
how expensive it is for the application and the database server.
Deadlocks are *very* expensive, especially since our deadlock detector
doesn't always figure them out successfully (which means the deadlock
has to be resolved by the DBA). So any other type of serialization
failure or error is better than deadlocking.

> I must be missing something. Please explain how this would work
> *without* serialization failures. As far as I can see, you can
> replace deadlocks with some other form, but I don't see the point.

See above.

>> (4) Blackouts: records of type "x" aren't supposed to be created
>> during period "y to y1" or while procedure "z" is running.
>> Predicate locking can be used to prevent this more easily than
>> adding and removing a trigger.
>
> I would have thought that advisory locks covered this. In what way
> do they fall short for this use case?

Currently, I do use advisory locks for this case. However, they require
a fair amount of administrative design and monitoring overhead.

> Hmmmm.... Assuming fine enough granularity (like from an index for
> which a range could be locked to detect the conflict) adding a
> listener to the SIRead lock handling would be good for this. Well,
> as long as the transactions were serializable.

Yeah, it's that last caveat which makes SIRead locks not as flexible as
the theoretical predicate lock. Of course, any eventual actual
implemenation of predicate locks might be equally inflexible.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 5 Aug 2010 16:59

Josh Berkus <josh(a)agliodbs.com> wrote:

> Overall, you're missing the point: there are workarounds for all
> of these things now. However, they are *workarounds*, which means
> that they are awkward, expensive, and/or hard to administrate;
> having predicate locks would make things much easier.

Well, if some form of the SSI patch goes in most of your use cases
can be solved just by making the transactions serializable and
letting the chips fall where they may. That's the whole point of
it. I'll say it again: with true serializable transactions, if you
can show that your transaction will do the right thing if there are
no concurrent transactions, it will do the right thing in any mix of
serializable transactions or be rolled back with a serialization
failure. Full stop. No need to explicitly lock anything (with or
without NOWAIT), no need to SELECT FOR UPDATE/SHARE, no need to
"reserve" anything -- I consider all of those to be awkward
workarounds. You just systematically retry transactions which fail
with SQLSTATE '40001'. If your software isn't set up so that this
can be done once, in one place, you need to rethink your design.

I'm not at all clear how any form of predicate locking can help with
the "blackouts" example. Perhaps if you explained how you see that
working I might get it.

Oh, and if deadlocks are that broken, it's a bit scary that we have
let that go. Is it the problem that technically intractable?

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Josh Berkus on 5 Aug 2010 18:16

On 8/5/10 1:59 PM, Kevin Grittner wrote:
> Oh, and if deadlocks are that broken, it's a bit scary that we have
> let that go. Is it the problem that technically intractable?

Yes; it's a major project. Our detector works pretty well for deadlocks
which are 2-process locks or even several processes all locking against
the same first process. However, triangular and quadralateral deadlocks
(which I've seen more than once) it completely cannot handle, and some
types of activity which can cause deadlocks (like autovacuum or DDL
activity) also seem to be outside its purview. The latter is probably
fixable if I can create some good test cases.

However, the "circular" deadlock problem has an n! issue with detecting it.

Also, even where the deadlock detector does its job, it's still the most
expensive type of serialization failure:

1. the detector will wait at least 1 second to check, so we're usually
looking at a couple seconds to resolve the deadlock;
2. since deadlocks don't happen in testing, most applicaiton error
handling isn't set up for them;
3. deadlocks can, and do, result in cancelling several transactions
instead of just one; there is no "winner" which is allowed to complete.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 5 Aug 2010 18:41

Josh Berkus <josh(a)agliodbs.com> writes:
> Yes; it's a major project. Our detector works pretty well for deadlocks
> which are 2-process locks or even several processes all locking against
> the same first process. However, triangular and quadralateral deadlocks
> (which I've seen more than once) it completely cannot handle,

Hm? Please explain what you're talking about.

> and some
> types of activity which can cause deadlocks (like autovacuum or DDL
> activity) also seem to be outside its purview.

There's some known issues with deadlocks involving LWLocks as well as
regular locks, which I agree aren't fixable without some significant
rework. But I don't believe there's anything fundamentally wrong with
the deadlock detector --- the real problem there is stretching LWLocks
beyond their design intention, namely to be used only for situations
where deadlock is impossible.

> Also, even where the deadlock detector does its job, it's still the most
> expensive type of serialization failure:

Well, that's certainly true --- you don't want deadlock recovery to be
part of any high-performance path.

> 3. deadlocks can, and do, result in cancelling several transactions
> instead of just one; there is no "winner" which is allowed to complete.

Not sure I believe this either; one deadlock kills one transaction.
If you lose multiple transactions I think you had multiple deadlocks.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5
Prev: [HACKERS] Concurrent MERGE
Next: ECPG dynamic cursor fix for UPDATE/DELETE... WHERE CURRENT OF :curname