From: Fujii Masao on 10 Jun 2010 22:26

On Fri, Jun 11, 2010 at 1:48 AM, Josh Berkus <josh(a)agliodbs.com> wrote:
> On 06/09/2010 07:36 PM, Mark Kirkwood wrote:
>>
>> On 10/06/10 14:07, Tatsuo Ishii wrote:
>>>
>>> The one of top 3 questions I got
>>> when we propose them our HA solution is, "how long will it take to
>>> do failover when the master DB crashes?"
>>>
>>
>> Same here +1
>
> In that case, wouldn't they set max_standby_delay to 0?  In which case the
> failover problem goes away, no?

Yes, but I guess they'd also like to run read only queries on the standby.
Setting max_standby_delay to 0 would prevent them from doing that because
the conflict with the replay of the VACUUM or HOT record would often happen.
vacuum_defer_cleanup_age would be helpful for that case, but it seems to be
hard to tune that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Bruce Momjian on 1 Jul 2010 23:14

Fujii Masao wrote:
> On Thu, Jun 10, 2010 at 5:06 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> > Josh Berkus <josh(a)agliodbs.com> writes:
> >> The fact that failover current does *not* terminate existing queries and
> >> transactions was regarded as a feature by the audience, rather than a
> >> bug, when I did demos of HS/SR.  Of course, they might not have been
> >> thinking of the delay for writes.
> >
> >> If there were an easy way to make the trigger file cancel all running
> >> queries, apply remaining logs and come up, then I'd vote for that for
> >> 9.0.  I think it's the more desired behavior by most users.  However,
> >> I'm opposed to any complex solutions which might delay 9.0 release.
> >
> > My feeling about it is that if you want fast failover you should not
> > have your failover target server configured as hot standby at all, let
> > alone hot standby with a long max_standby_delay.  Such a slave could be
> > very far behind on applying WAL when the crunch comes, and no amount of
> > query killing will save you from that.  Put your long-running standby
> > queries on a different slave instead.
> >
> > We should consider whether we can improve the situation in 9.1, but it
> > is not a must-fix for 9.0; especially when the correct behavior isn't
> > immediately obvious.
>
> OK. Let's revisit in 9.1.
>
> I attached the proposal patch for 9.1. The patch treats max_standby_delay
> as zero (i.e., cancels all the conflicting queries immediately), ever since
> the trigger file is created. So we can cause a recovery to end without
> waiting for any lock held by queries, and minimize the failover time.
> OTOH, queries which don't conflict with a recovery survive the failover.

Should this be added to the first 9.1 commitfest?

--
  Bruce Momjian  <bruce(a)momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +
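
For readers following the thread: the behaviour described in Fujii's proposal (treat max_standby_delay as zero once the trigger file exists, so conflicting queries are cancelled at once while non-conflicting queries survive) can be illustrated with a toy model. This is only a sketch and not the patch, which is not reproduced here. The function names, the millisecond unit, and the handling of a negative delay as "wait forever" are assumptions made for illustration.

```python
import os


def effective_standby_delay(configured_delay_ms: int, trigger_file: str) -> int:
    """Hypothetical helper: delay that recovery tolerates before cancelling.

    Once the trigger file exists (failover requested), the configured
    delay is ignored and treated as zero, as the proposal describes.
    """
    if os.path.exists(trigger_file):
        return 0
    return configured_delay_ms


def should_cancel(conflicts_with_recovery: bool, waited_ms: int,
                  configured_delay_ms: int, trigger_file: str) -> bool:
    """Hypothetical decision: cancel a standby query blocking WAL replay?"""
    # Queries that do not conflict with recovery are never cancelled,
    # so they survive the failover.
    if not conflicts_with_recovery:
        return False
    delay = effective_standby_delay(configured_delay_ms, trigger_file)
    # Assumption: a negative delay means "wait forever" for the query.
    if delay < 0:
        return False
    # Cancel once recovery has waited at least the tolerated delay.
    return waited_ms >= delay
```

With a 30-second configured delay, a conflicting query that has blocked replay for 10 ms is left alone before the trigger file appears and is cancelled immediately afterwards. This models only the cancellation decision; it says nothing about how far behind the standby already is on WAL replay.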

From: Tom Lane on 2 Jul 2010 00:13

Bruce Momjian <bruce(a)momjian.us> writes:
> Fujii Masao wrote:
>> On Thu, Jun 10, 2010 at 5:06 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>>> My feeling about it is that if you want fast failover you should not
>>> have your failover target server configured as hot standby at all, let
>>> alone hot standby with a long max_standby_delay.  Such a slave could be
>>> very far behind on applying WAL when the crunch comes, and no amount of
>>> query killing will save you from that.  Put your long-running standby
>>> queries on a different slave instead.
>>>
>>> We should consider whether we can improve the situation in 9.1, but it
>>> is not a must-fix for 9.0; especially when the correct behavior isn't
>>> immediately obvious.

>> OK. Let's revisit in 9.1.
>>
>> I attached the proposal patch for 9.1. The patch treats max_standby_delay
>> as zero (i.e., cancels all the conflicting queries immediately), ever since
>> the trigger file is created. So we can cause a recovery to end without
>> waiting for any lock held by queries, and minimize the failover time.
>> OTOH, queries which don't conflict with a recovery survive the failover.

> Should this be added to the first 9.1 commitfest?

Not sure ... it seems like proof of concept for a pretty dubious concept.
If you want a slave to be ready for fast failover then you should not be
letting it get far behind the master in the first place.  I think there's
some missing piece here, but I'm not quite sure what to propose.

			regards, tom lane