From: Greg Stark on
On Thu, May 6, 2010 at 2:36 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> One reason I believe this isn't so critical as all that is that it only
> matters for cases where the operation on the master took an exclusive
> lock.

Uhm, or a vacuum ran. Or a HOT page cleanup occurred, or a btree page
split deleted old tuples.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on
Greg Stark wrote:
> On Thu, May 6, 2010 at 2:36 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>
>> One reason I believe this isn't so critical as all that is that it only
>> matters for cases where the operation on the master took an exclusive
>> lock.
>>
>
> Uhm, or a vacuum ran. Or a HOT page cleanup occurred, or a btree page
> split deleted old tuples.
>

Right; because there are so many regularly expected causes for query
cancellation, the proposed boolean setup really hurts the ability of a
server whose primary goal is high-availability to run queries of any
useful duration. For years I've been hearing "my HA standby is idle,
how can I put it to use?"; those are the users I thought everyone knew
were the audience waiting for this feature.

If the UI for vacuum_defer_cleanup_age that prevented these things was
good, I would agree that the cases where max_standby_delay does
something useful are marginal. That's why I tried to get someone
working on SR to provide a hook for that purpose months ago. But since
the vacuum adjustment we have is in completely obtuse xid units, that
leaves max_standby_delay as the only tunable here that you can even
think about in terms of human time.
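
To illustrate why xid units are hard to reason about: turning a time
window into a transaction count requires knowing how fast the master
burns through xids. A minimal sketch, assuming a steady xid consumption
rate (real workloads are bursty, so the answer drifts); the function
name and its parameters are hypothetical and not part of PostgreSQL:

```python
import math


def xids_for_window(window_seconds: float, xids_per_second: float) -> int:
    """Estimate how many xids the master consumes during a time window.

    Both inputs are assumptions supplied by the caller; the rate in
    particular is workload-dependent and has to be measured.
    """
    if window_seconds < 0 or xids_per_second < 0:
        raise ValueError("inputs must be non-negative")
    # Round up so the estimate never covers less than the requested window.
    return math.ceil(window_seconds * xids_per_second)


# The same five-minute window maps to very different xid counts
# depending on how busy the master is.
busy = xids_for_window(300, 200)
quiet = xids_for_window(300, 20)
```

The same five minutes is ten times as many xids on the busy system as
on the quiet one, which is the problem with a tunable expressed in xids
rather than time.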

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us


From: Robert Haas on
On Wed, May 5, 2010 at 9:36 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Greg Smith <greg(a)2ndquadrant.com> writes:
>> Heikki Linnakangas wrote:
>>> Let's rip out the concept of a delay altogether, and make it a boolean.
>
>> So the only user options would be "allow long-running queries to block
>> WAL application forever" and "always cancel queries on conflict?"
>
> Got it in one.
>
> Obviously, this is something that would be high priority to improve in
> some fashion in 9.1.  That doesn't mean that it's reasonable to drop in
> a half-baked redesign now, nor to put in the amount of work that would
> be required to have a really well-designed implementation, and most
> certainly not to uncritically ship what we've got.

If you had a genuinely better idea for how this should work, I would
be the first to endorse it, but it's becoming clear that you don't,
which makes me also skeptical of your contention that we will be
better off with no knob at all. I find that position not very
plausible. Nor do I really see how this is backing us into any kind
of a corner. If we're really concerned that we're going to suddenly
come up with a much better method of controlling this behavior (and so
far nobody seems close to having such a brilliant insight), then let's
just put a note in the documentation saying that the setting has
problems X, Y, and Z and that if we develop a better method for
controlling this behavior, the GUC may be modified or removed in a
future release. Ripping it out seems like a drastic overreaction,
particularly considering that we're already in beta.

This feature has been in the tree since December 19th when the initial
Hot Standby patch was committed, and the last significant code change
was on February 13th. It is now May 5th. The fact that you didn't
read the patch sooner is not a reason why we should rip it out now.
Yes, the current implementation is a little crufty and has some
limitations. See also work_mem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Bruce Momjian on
Greg Smith wrote:
> Heikki Linnakangas wrote:
> > Let's rip out the concept of a delay altogether, and make it a boolean.
> > If you really want your query to finish, set it to -1 (using the current
> > max_standby_delay nomenclature). If recovery is important to you, set it
> > to 0.
> >
>
> So the only user options would be "allow long-running queries to block
> WAL application forever" and "always cancel queries on conflict?" That
> would be taking away the behavior I was going to suggest as the default
> to many customers I work with. I expect a non-trivial subset of people
> using this feature will set max_standby_delay to some small number of
> minutes, similarly to how archive_timeout is sized now. Enough time to
> get reasonably sized queries executed, not so long as to allow something
> that might try to run for hours on the standby to increase failover
> catchup time very much.
>
> The way the behavior works is admittedly limited, and certainly some
> people are going to want to set it to either 0 or -1. But taking it
> away altogether is going to cripple one category of potential Hot
> Standby use in the field. Consider this for a second: do you really
> think that Simon would have waded into this coding mess, or that I would
> have spent as much energy as I have highlighting issues with its use, if
> there wasn't demand for it? If it wouldn't hurt the usefulness of
> PostgreSQL 9.0 significantly to cut it, I'd have suggested that myself
> two months ago and saved everyone (especially myself) a lot of trouble.
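
The three kinds of setting quoted above (-1, 0, and a small number of
minutes) can be sketched as a single decision. This is a simplified
model, not the server's code: the names are hypothetical, and `waited`
stands for however long WAL replay has been held up by the conflicting
query, which this thread does not define precisely.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"      # keep WAL replay blocked, let the query run
    CANCEL = "cancel"  # cancel the conflicting standby query


def resolve_conflict(max_standby_delay: float, waited: float) -> Action:
    """Decide what to do with a standby query that conflicts with replay.

    max_standby_delay follows the nomenclature used in this thread:
    -1 means wait forever, 0 means cancel immediately, and a positive
    value is a grace period in the same units as `waited`.
    """
    if max_standby_delay < 0:
        return Action.WAIT
    if waited >= max_standby_delay:
        return Action.CANCEL
    return Action.WAIT
```

Under the boolean proposal only the -1 and 0 cases would remain; the
positive grace period is the case being argued over.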

We are not designing in a green field here. We have released beta1 and
we are trying to get to 9.0 final in a few months. If this feature
could have been designed easily months ago, it would have been done, but
it doesn't seem to have any easy solution, and we have run out of time
to fix it. As painful as it is, we need to cut our losses and move on.

We have already cut features like sync replication and communicating the
slave snapshot to the master; I don't see how removing this ability is
any worse. We don't have time to develop this for every use case, even
if those use cases are significant.

If someone wants to suggest that HS is useless if max_standby_delay
supports only boolean values, I am ready to suggest we remove HS as well
and head to 9.0 because that would suggest that HS itself is going to be
useless.

The code will not be thrown away; we will bring it back for 9.1.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

From: Bruce Momjian on
Robert Haas wrote:
> If you had a genuinely better idea for how this should work, I would
> be the first to endorse it, but it's becoming clear that you don't,
> which makes me also skeptical of your contention that we will be
> better off with no knob at all. I find that position not very
> plausible. Nor do I really see how this is backing us into any kind
> of a corner. If we're really concerned that we're going to suddenly
> come up with a much better method of controlling this behavior (and so
> far nobody seems close to having such a brilliant insight), then let's
> just put a note in the documentation saying that the setting has
> problems X, Y, and Z and that if we develop a better method for
> controlling this behavior, the GUC may be modified or removed in a
> future release. Ripping it out seems like a drastic overreaction,
> particularly considering that we're already in beta.
>
> This feature has been in the tree since December 19th when the initial
> Hot Standby patch was committed, and the last significant code change
> was on February 13th. It is now May 5th. The fact that you didn't
> read the patch sooner is not a reason why we should rip it out now.
> Yes, the current implementation is a little crufty and has some
> limitations. See also work_mem.

I am afraid the current setting is tempting for users to enable, but
will be so unpredictable that it will tarnish the reputation of HS and
Postgres. We don't want to be thinking in 9 months, "Wow, we shouldn't
have shipped that feature. It is causing all kinds of problems." We
have done that before (rarely), and it isn't a good feeling.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
