From: Greg Stark
On Fri, Feb 26, 2010 at 8:33 AM, Greg Smith <greg(a)2ndquadrant.com> wrote:
>
> I'm not sure what you might be expecting from the above combination, but
> what actually happens is that many of the SELECT statements on the table
> *that isn't even being updated* are canceled.  You see this in the logs:

Well, I proposed that the default should be to wait forever when
applying WAL records that conflict with a query, precisely because I
think the expectation is that things will "just work" and queries
won't fail unpredictably. Perhaps in your test a larger
max_standby_delay would have prevented the cancellations, but then as
soon as you run a longer query you would have to raise it again.
There's no safe value that will be right for everyone.
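For concreteness, the knob in question is a single standby-side GUC in
the 9.0 development cycle (the values below are illustrative, not
recommendations):

```
# postgresql.conf on the standby (9.0-devel GUC; illustrative values)
max_standby_delay = 30s   # cancel conflicting queries once WAL replay
                          # has been held up for 30 seconds
# max_standby_delay = -1  # wait forever -- the default argued for above
```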

> If you're running a system that also is using Streaming Replication, there
> is a much better approach possible.

So I think one of the main advantages of a log shipping system over
the trigger-based systems is precisely that it doesn't require the
master to do anything it wasn't doing already. There's nothing the
slave can do which can interfere with the master's normal operation.

This independence is really a huge feature. It means you can allow
users on the slave that you would never let near the master. The
master can continue running production query traffic while users run
all kinds of crazy queries on the slave and drive it into the ground
and the master will continue on blithely unaware that anything's
changed.

In the model you describe any long-lived queries on the slave cause
tables in the master to bloat with dead records.

I think this model is on the roadmap but it's not appropriate for
everyone and I think one of the benefits of having delayed it is that
it forces us to get the independent model right before throwing in
extra complications. It would be too easy to rely on the slave
feedback as an answer for hard questions about usability if we had it
and just ignore the question of what to do when it's not the right
solution for the user.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas
On Fri, Feb 26, 2010 at 10:21 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Richard Huxton wrote:
>> Can we not wait to cancel the transaction until *any* new lock is
>> attempted though? That should protect all the single-statement
>> long-running transactions that are already underway. Aggregates etc.
>
> Hmm, that's an interesting thought. You'll still need to somehow tell
> the victim backend "you have to fail if you try to acquire any more
> locks", but a single per-backend flag in the procarray would suffice.
>
> You could also clear the flag whenever you free the last snapshot in the
> transaction (ie. between each query in read committed mode).

Wow, that seems like it would help a lot. Although I'm not 100% sure
I follow all the details of how this works.
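A minimal sketch of the idea, in Python with hypothetical names (the
real thing would be a per-backend flag in the procarray, set by the
startup process): the victim backend is only marked, keeps running with
the locks and snapshot it already holds, and fails only if it tries to
acquire a new lock; the flag is cleared when the last snapshot is freed
between queries.

```python
class Backend:
    def __init__(self):
        self.conflict_pending = False  # per-backend flag, set by WAL replay
        self.held_locks = set()

    def acquire_lock(self, lockable):
        # A statement already underway keeps its existing locks; only a
        # NEW lock acquisition trips the deferred cancellation.
        if self.conflict_pending:
            raise RuntimeError(
                "canceling statement due to conflict with recovery")
        self.held_locks.add(lockable)

    def finish_query(self):
        # Freeing the last snapshot (between queries in read committed
        # mode) clears the flag: the next query starts with a clean slate.
        self.conflict_pending = False


def replay_conflicting_record(backend):
    # Instead of canceling the victim immediately, just mark it.
    backend.conflict_pending = True
```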

....Robert


From: Greg Stark
On Fri, Feb 26, 2010 at 4:43 PM, Richard Huxton <dev(a)archonet.com> wrote:
> Let's see if I've got the concepts clear here, and hopefully my thinking it
> through will help others reading the archives.
>
> There are two queues:

I don't see two queues. I only see the one queue of operations that
have been executed on the master but not yet replayed on the slave.
Every write operation on the master enqueues an entry and every
operation replayed on the slave dequeues one. Only a subset of
operations create conflicts with concurrent transactions on the
slave, namely vacuums and a few similar operations (HOT pruning and
btree index pruning).
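The single-queue model can be sketched like this (record kinds and the
conflict rule are hypothetical simplifications; in reality conflicting
records carry a latest-removed xid that is compared against the
snapshots of running standby queries):

```python
from collections import deque

wal_queue = deque()  # one queue: master appends, standby replays in order

# Only these record kinds can conflict with standby queries, because
# they remove tuple versions some running snapshot might still need.
CONFLICTING_KINDS = {"vacuum_cleanup", "hot_prune", "btree_delete"}


def master_write(kind, removed_xid=None):
    wal_queue.append((kind, removed_xid))


def standby_replay_one(oldest_standby_xmin):
    kind, removed_xid = wal_queue.popleft()
    if kind in CONFLICTING_KINDS and removed_xid >= oldest_standby_xmin:
        return ("conflict", kind)  # wait, cancel a query, or pause replay
    return ("applied", kind)
```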

There's no question we need to make sure users have good tools to
monitor this queue and are aware of them. You can query each slave
for its currently replayed log position, and hopefully you can find
out how long it's been delayed (i.e., if it's waiting for a conflict
to clear, how long ago the log record it's stuck on was generated).
You can also find out the log position on the master.
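For example, the 9.0 functions pg_current_xlog_location() on the
master and pg_last_xlog_replay_location() on the standby each return
an LSN such as '2A/703E2B08'. Comparing the two is just hex
arithmetic; a rough sketch (approximate across log-id boundaries on
pre-9.3 servers):

```python
def lsn_to_int(lsn):
    # An LSN such as '2A/703E2B08' is two hex fields: log id and offset.
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def replay_lag_bytes(master_lsn, standby_lsn):
    # How far the standby's replay position trails the master's current
    # write position, in bytes.
    return lsn_to_int(master_lsn) - lsn_to_int(standby_lsn)
```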

--
greg


From: Tom Lane
Greg Stark <gsstark(a)mit.edu> writes:
> In the model you describe any long-lived queries on the slave cause
> tables in the master to bloat with dead records.

Yup, same as they would do on the master.

> I think this model is on the roadmap but it's not appropriate for
> everyone and I think one of the benefits of having delayed it is that
> it forces us to get the independent model right before throwing in
> extra complications. It would be too easy to rely on the slave
> feedback as an answer for hard questions about usability if we had it
> and just ignore the question of what to do when it's not the right
> solution for the user.

I'm going to make an unvarnished assertion here. I believe that the
notion of synchronizing the WAL stream against slave queries is
fundamentally wrong and we will never be able to make it work.
The information needed isn't available in the log stream and can't be
made available without very large additions (and consequent performance
penalties). As we start getting actual beta testing we are going to
uncover all sorts of missed cases that are not going to be fixable
without piling additional ugly kluges on top of the ones Simon has
already crammed into the system. Performance and reliability will both
suffer.

I think that what we are going to have to do before we can ship 9.0
is rip all of that stuff out and replace it with the sort of closed-loop
synchronization Greg Smith is pushing. It will probably be several
months before everyone is forced to accept that, which is why 9.0 is
not going to ship this year.

regards, tom lane


From: Tom Lane
Josh Berkus <josh(a)agliodbs.com> writes:
> On 2/26/10 10:53 AM, Tom Lane wrote:
>> I think that what we are going to have to do before we can ship 9.0
>> is rip all of that stuff out and replace it with the sort of closed-loop
>> synchronization Greg Smith is pushing. It will probably be several
>> months before everyone is forced to accept that, which is why 9.0 is
>> not going to ship this year.

> I don't think that publishing visibility info back to the master ... and
> subsequently burdening the master substantially for each additional
> slave ... are what most users want.

I don't see a "substantial additional burden" there. What I would
imagine is needed is that the slave transmits a single number back
--- its current oldest xmin --- and the walsender process publishes
that number as its transaction xmin in its PGPROC entry on the master.
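Schematically (hypothetical names; a sketch of the closed-loop idea,
not actual backend code): the walsender publishes the standby's
reported xmin exactly as if it were a local transaction's, and
vacuum's removal horizon is the minimum over all entries.

```python
procarray = {}  # backend or walsender id -> advertised xmin


def local_backend_sets_xmin(pid, xmin):
    procarray[pid] = xmin


def walsender_receives_feedback(walsender_pid, standby_oldest_xmin):
    # The standby transmits a single number back; the walsender
    # publishes it as if it were a local transaction's xmin.
    procarray[walsender_pid] = standby_oldest_xmin


def get_oldest_xmin():
    # Vacuum may not remove tuples still visible at this horizon, so a
    # long-running standby query now holds back cleanup on the master.
    return min(procarray.values())
```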

I don't doubt that this approach will have its own gotchas that we
find as we get into it. But it looks soluble. I have no faith in
either the correctness or the usability of the approach currently
being pursued.

regards, tom lane
