From: Tatsuo Ishii on 9 Jun 2010 04:56

> When the trigger file is created while the recovery keeps
> waiting for the release of the lock by read only queries,
> it might take a very long time for the standby to become
> the master. The recovery cannot go ahead until those read
> only queries have gone away. This would increase the downtime
> at the failover, and degrade the high availability.
>
> To fix the problem, when the trigger file is found, I think
> that we should cancel all the running read only queries
> immediately (or forcibly use -1 as the max_standby_delay
> since that point) and make the recovery go ahead. If some
> people prefer queries over failover even when they create the
> trigger file, we can make the trigger behavior selectable in
> response to the content of the trigger file like pg_standby
> does.
>
> This problem looks like a bug, so I'd like to fix that for
> 9.0. But the amount of code change might not be small.
> Thought?

+1. Downtime of an HA system is really important for HA users.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on 9 Jun 2010 05:13

Fujii Masao <masao.fujii(a)gmail.com> wrote:

> To fix the problem, when the trigger file is found, I think
> that we should cancel all the running read only queries
> immediately (or forcibly use -1 as the max_standby_delay
> since that point) and make the recovery go ahead.

Hmmm, does the following sequence work as you expect, instead of the change?
It requires text-file manipulation in step 1, but seems to be more flexible.

  1. Reset max_standby_delay = 0 in postgresql.conf
  2. pg_ctl reload
  3. Create a trigger file

BTW, I hope we will have "pg_ctl failover --timeout=N" in 9.1
instead of the trigger file based management.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

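The three steps proposed above could be scripted roughly as follows. This is only a sketch under stated assumptions: `set_conf_param` and `failover` are hypothetical helper names, the data directory and trigger file path are whatever the standby is actually configured with, and it relies on the last assignment in postgresql.conf taking precedence. A later reply in this thread questions whether the reload is noticed while recovery is waiting, so treat it as an illustration of the sequence, not a tested procedure.

```python
import re
import subprocess
from pathlib import Path


def set_conf_param(conf_text: str, name: str, value: str) -> str:
    """Return conf_text with active assignments of `name` dropped and
    `name = value` appended as the last line (hypothetical helper)."""
    active = re.compile(r"^\s*" + re.escape(name) + r"\s*=")
    kept = [ln for ln in conf_text.splitlines() if not active.match(ln)]
    kept.append(f"{name} = {value}")
    return "\n".join(kept) + "\n"


def failover(data_dir: str, trigger_file: str) -> None:
    """Run the three proposed steps against a standby (hypothetical helper)."""
    conf = Path(data_dir) / "postgresql.conf"
    # Step 1: reset max_standby_delay = 0 in postgresql.conf
    conf.write_text(set_conf_param(conf.read_text(), "max_standby_delay", "0"))
    # Step 2: ask the server to re-read its configuration
    subprocess.run(["pg_ctl", "reload", "-D", str(data_dir)], check=True)
    # Step 3: create the trigger file the standby is watching for
    Path(trigger_file).touch()
```

Only `set_conf_param` is pure text manipulation; `failover` needs a real standby and `pg_ctl` on the PATH, so it is shown for the ordering of the steps only.
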
From: Fujii Masao on 9 Jun 2010 05:31

On Wed, Jun 9, 2010 at 5:47 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> To fix the problem, when the trigger file is found, I think
> that we should cancel all the running read only queries
> immediately (or forcibly use -1 as the max_standby_delay
> since that point) and make the recovery go ahead.

Oops! I made an error. I meant 0 instead of -1, as the max_standby_delay.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Fujii Masao on 9 Jun 2010 05:49

On Wed, Jun 9, 2010 at 6:13 PM, Takahiro Itagaki
<itagaki.takahiro(a)oss.ntt.co.jp> wrote:
>> To fix the problem, when the trigger file is found, I think
>> that we should cancel all the running read only queries
>> immediately (or forcibly use -1 as the max_standby_delay
>> since that point) and make the recovery go ahead.
>
> Hmmm, does the following sequence work as you expect, instead of the change?
> It requires text-file manipulation in step 1, but seems to be more flexible.
>
>  1. Reset max_standby_delay = 0 in postgresql.conf
>  2. pg_ctl reload
>  3. Create a trigger file

As far as I can tell from the HS code, SIGHUP is not checked while recovery
is waiting for queries :( So pg_ctl reload would have no effect on the
conflicting queries.

Independently of the problem I raised, I think that we should call
HandleStartupProcInterrupts() in that sleep loop.

> BTW, I hope we will have "pg_ctl failover --timeout=N" in 9.1
> instead of the trigger file based management.

Please feel free to try that ;)

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

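To make the point about the sleep loop concrete, here is a toy simulation, not PostgreSQL code. All names (`Startup`, `handle_interrupts`, `wait_for_conflict`) are hypothetical, time is counted in simulated seconds, and it assumes that max_standby_delay = -1 means "wait forever" and 0 means "cancel conflicting queries at once". It shows why a reload that arrives during the wait changes nothing unless the loop itself checks for pending interrupts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Startup:
    """Toy model of the process that replays WAL on the standby."""
    max_standby_delay: int        # seconds; 0 = cancel at once, -1 = wait forever (assumed)
    reload_pending: bool = False  # set when a simulated SIGHUP arrives
    reloaded_delay: int = 0       # value the re-read configuration would supply


def handle_interrupts(proc: Startup) -> None:
    """Stand-in for interrupt handling: apply a pending config reload."""
    if proc.reload_pending:
        proc.max_standby_delay = proc.reloaded_delay
        proc.reload_pending = False


def wait_for_conflict(proc: Startup, query_runtime: int,
                      reload_at: Optional[int] = None,
                      check_interrupts: bool = True) -> int:
    """Return the simulated second at which recovery can go ahead."""
    waited = 0
    while waited < query_runtime:           # conflicting query still running
        if reload_at is not None and waited == reload_at:
            proc.reload_pending = True      # "pg_ctl reload" arrives now
        if check_interrupts:
            handle_interrupts(proc)         # the call proposed for the loop
        if 0 <= proc.max_standby_delay <= waited:
            break                           # delay exceeded: query is cancelled
        waited += 1
    return waited
```

With a 30-second delay, a 100-second conflicting query, and a reload to 0 arriving at second 5, the loop that checks interrupts lets recovery proceed at second 5, while the loop that does not keeps waiting until second 30.
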
From: Tom Lane on 9 Jun 2010 10:41

Fujii Masao <masao.fujii(a)gmail.com> writes:
> When the trigger file is created while the recovery keeps
> waiting for the release of the lock by read only queries,
> it might take a very long time for the standby to become
> the master. The recovery cannot go ahead until those read
> only queries have gone away. This would increase the downtime
> at the failover, and degrade the high availability.

> To fix the problem, when the trigger file is found, I think
> that we should cancel all the running read only queries
> immediately (or forcibly use -1 as the max_standby_delay
> since that point) and make the recovery go ahead. If some
> people prefer queries over failover even when they create the
> trigger file, we can make the trigger behavior selectable in
> response to the content of the trigger file like pg_standby
> does.

> This problem looks like a bug, so I'd like to fix that for
> 9.0. But the amount of code change might not be small.
> Thought?

-1. This looks like 9.1 material to me, and besides I'm not even
convinced that what you propose is a good solution.

regards, tom lane