From: Tom Lane on
I wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> Could the desired behavior be obtained using a CTE?

> Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
> deal with this without some sort of semantic changes.

.... although on reflection, I'm not sure *why* we push FOR UPDATE into
WITHs. That seems a bit antithetical to the position we've evolved that
WITH queries are executed independently of the outer query.

If we removed that bit of behavior, which hopefully is too new for much
code to depend on, then the old FOR-UPDATE-last behavior could be
attained via a WITH. And we'd not have to risk touching the interaction
between plain subqueries and FOR UPDATE, which is something that seems
much more likely to break existing apps.

That seems like a reasonable compromise to me ... any objections?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Stark on
On Sun, Oct 25, 2009 at 7:34 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> All that we have to do to fix the first one is to put the LockRows node
> below the Limit node instead of above it.  The solution for the second
> one is to also put LockRows underneath the Sort node, and to regard its
> output as unsorted so that a Sort node will certainly be generated.
> (This in turn implies that we should prefer the cheapest-total plan
> for the rest of the query.)

I'm not following how this would work. Would it mean that every record
would end up being locked?


--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Greg Stark <gsstark(a)mit.edu> writes:
> On Sun, Oct 25, 2009 at 7:34 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> All that we have to do to fix the first one is to put the LockRows node
>> below the Limit node instead of above it. �The solution for the second
>> one is to also put LockRows underneath the Sort node, and to regard its
>> output as unsorted so that a Sort node will certainly be generated.
>> (This in turn implies that we should prefer the cheapest-total plan
>> for the rest of the query.)

> I'm not following how this would work. Would it mean that every record
> would end up being locked?

In the case of LIMIT, no, because LIMIT doesn't fetch any more rows than
it needs from its input node. In the case of ORDER BY, yes,
potentially. So we might conceivably decide we should fix the first
issue and not the second. However, I'd prefer to have a solution
whereby the query does what it appears to mean and you have to write
something more complicated if you want performance over correctness.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
I wrote:
>> Robert Haas <robertmhaas(a)gmail.com> writes:
>>> Could the desired behavior be obtained using a CTE?

>> Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
>> deal with this without some sort of semantic changes.

> ... although on reflection, I'm not sure *why* we push FOR UPDATE into
> WITHs. That seems a bit antithetical to the position we've evolved that
> WITH queries are executed independently of the outer query.

> If we removed that bit of behavior, which hopefully is too new for much
> code to depend on, then the old FOR-UPDATE-last behavior could be
> attained via a WITH. And we'd not have to risk touching the interaction
> between plain subqueries and FOR UPDATE, which is something that seems
> much more likely to break existing apps.

On further investigation, this still seems like a good change to make,
but it doesn't solve the problem at hand. If we make FOR UPDATE not
propagate into WITH then the effect of

with w as (select ... order by x limit n) select * from w for update

is not going to be to lock just the rows pulled from the WITH; it's
going to be to not lock any rows at all. So we're still up against a
performance problem if we make these otherwise-desirable changes in plan
node order.

I realized just now that the backwards-compatibility problem is not
nearly as bad as I thought it was, because a lot of the syntaxes
we might want to change the behavior of will fail outright in 8.4
and before. We had this little bit in preptlist.c:

/*
* Currently the executor only supports FOR UPDATE/SHARE at top level
*/
if (root->query_level > 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("SELECT FOR UPDATE/SHARE is not allowed in subqueries")));

and that is the error you will get if you have FOR UPDATE in or applying
to a sub-select that does not get flattened into the calling query.
So in particular, a sub-select using ORDER BY or LIMIT has never been
compatible with FOR UPDATE at all, and that means we can define the
behavior of that combination freely.

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT. The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance. This also seems reasonably non-astonishing
from a semantic viewpoint.

Actually implementing this will be more than a one-line change, but it
doesn't seem too terribly difficult --- we'll just need to modify the
query representation of FOR UPDATE enough that we can remember which
case we had.

Comments?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Tue, Oct 27, 2009 at 11:22 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> I wrote:
>>> Robert Haas <robertmhaas(a)gmail.com> writes:
>>>> Could the desired behavior be obtained using a CTE?
>
>>> Nope, we push FOR UPDATE into WITHs too.  I don't really see any way to
>>> deal with this without some sort of semantic changes.
>
>> ... although on reflection, I'm not sure *why* we push FOR UPDATE into
>> WITHs.  That seems a bit antithetical to the position we've evolved that
>> WITH queries are executed independently of the outer query.
>
>> If we removed that bit of behavior, which hopefully is too new for much
>> code to depend on, then the old FOR-UPDATE-last behavior could be
>> attained via a WITH.  And we'd not have to risk touching the interaction
>> between plain subqueries and FOR UPDATE, which is something that seems
>> much more likely to break existing apps.
>
> On further investigation, this still seems like a good change to make,
> but it doesn't solve the problem at hand.  If we make FOR UPDATE not
> propagate into WITH then the effect of
>
> with w as (select ... order by x limit n) select * from w for update
>
> is not going to be to lock just the rows pulled from the WITH; it's
> going to be to not lock any rows at all.  So we're still up against a
> performance problem if we make these otherwise-desirable changes in plan
> node order.
>
> I realized just now that the backwards-compatibility problem is not
> nearly as bad as I thought it was, because a lot of the syntaxes
> we might want to change the behavior of will fail outright in 8.4
> and before.  We had this little bit in preptlist.c:
>
>        /*
>         * Currently the executor only supports FOR UPDATE/SHARE at top level
>         */
>        if (root->query_level > 1)
>            ereport(ERROR,
>                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>            errmsg("SELECT FOR UPDATE/SHARE is not allowed in subqueries")));
>
> and that is the error you will get if you have FOR UPDATE in or applying
> to a sub-select that does not get flattened into the calling query.
> So in particular, a sub-select using ORDER BY or LIMIT has never been
> compatible with FOR UPDATE at all, and that means we can define the
> behavior of that combination freely.
>
> What I am thinking we should do is define that FOR UPDATE happens before
> ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
> an outer query level, it happens after the sub-select's ORDER BY or
> LIMIT.  The first provision fixes the bugs noted in our documentation,
> and the second one allows people to get back the old behavior if they
> need it for performance.  This also seems reasonably non-astonishing
> from a semantic viewpoint.
>
> Actually implementing this will be more than a one-line change, but it
> doesn't seem too terribly difficult --- we'll just need to modify the
> query representation of FOR UPDATE enough that we can remember which
> case we had.

When you refer to an "outer query level", is that the same thing as a
sub-select? If so, I think I agree that the behavior is
non-astonishing.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers