From: Fujii Masao on 17 Jun 2010 03:20

On Thu, Jun 17, 2010 at 4:02 PM, Rafael Martinez <r.m.guerrero(a)usit.uio.no> wrote:
> I tested this yesterday and I could not get any reaction from the wal
> receiver even after using minimal values compared to the default values.
>
> The default values in linux for tcp_keepalive_time, tcp_keepalive_intvl
> and tcp_keepalive_probes are 7200, 75 and 9. I reduced these values to
> 60, 3, 3 and nothing happened, it continues with status ESTABLISHED
> after 60+3*3 seconds.
>
> I did not restart the network after I changed these values on the fly
> via /proc. I wonder if this is the reason the connection didn't die
> either with the new keepalive values after the connection was broken. I
> will check this later today.

Walreceiver uses libpq to communicate with the master. But keepalive is not
enabled in libpq currently. That is, the libpq code doesn't call something like
setsockopt(SOL_SOCKET, SO_KEEPALIVE). So even if you change the kernel options
for keepalive, it has no effect on walreceiver.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Magnus Hagander on 17 Jun 2010 04:08

On Thu, Jun 17, 2010 at 09:20, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Thu, Jun 17, 2010 at 4:02 PM, Rafael Martinez
> <r.m.guerrero(a)usit.uio.no> wrote:
>> I tested this yesterday and I could not get any reaction from the wal
>> receiver even after using minimal values compared to the default values.
>>
>> The default values in linux for tcp_keepalive_time, tcp_keepalive_intvl
>> and tcp_keepalive_probes are 7200, 75 and 9. I reduced these values to
>> 60, 3, 3 and nothing happened, it continues with status ESTABLISHED
>> after 60+3*3 seconds.
>>
>> I did not restart the network after I changed these values on the fly
>> via /proc. I wonder if this is the reason the connection didn't die
>> either with the new keepalive values after the connection was broken. I
>> will check this later today.
>
> Walreceiver uses libpq to communicate with the master. But keepalive is not
> enabled in libpq currently. That is libpq code doesn't call something like
> setsockopt(SOL_SOCKET, SO_KEEPALIVE). So even if you change the kernel options
> for keepalive, it has no effect on walreceiver.

Yeah, there was a patch submitted for this - I think it's on the CF
page for 9.1... I guess if we really need it walreceiver could enable
it - just get the socket with PQsocket().

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
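
[Editor's sketch] The suggestion above can be illustrated roughly as follows. This is a minimal sketch, not the patch mentioned in the thread: the helper name `enable_tcp_keepalive` and its parameters are hypothetical, and the per-socket options `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are assumed to be available (they are on Linux, hence the `#ifdef` guards). The point it shows is the one Fujii makes: the kernel keepalive timers only apply once `SO_KEEPALIVE` has been set on the socket.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libpq or walreceiver): turn on TCP
 * keepalive for an existing socket and, where the platform supports it,
 * override the system-wide timers for this socket only.
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
int
enable_tcp_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    assert(fd >= 0);

    /* Without SO_KEEPALIVE the kernel never sends keepalive probes. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#ifdef TCP_KEEPIDLE
    /* Idle seconds before the first probe (cf. tcp_keepalive_time). */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPINTVL
    /* Seconds between unanswered probes (cf. tcp_keepalive_intvl). */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPCNT
    /* Unanswered probes before the connection is dropped
     * (cf. tcp_keepalive_probes). */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probe_count, sizeof(probe_count)) < 0)
        return -1;
#endif

    return 0;
}
```

In walreceiver the descriptor would presumably come from `PQsocket(conn)` once the connection is established, e.g. `enable_tcp_keepalive(PQsocket(conn), 60, 3, 3)` with the values Rafael tried; where exactly such a call would belong is an assumption here, not something the thread settles.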

From: Tom Lane on 17 Jun 2010 12:43

Fujii Masao <masao.fujii(a)gmail.com> writes:
> On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> The real problem here is that we're sending records to the slave which
>> might cease to exist on the master if it unexpectedly reboots. I
>> believe that what we need to do is make sure that the master only
>> sends WAL it has already fsync'd (Tom suggested on another thread that
>> this might be necessary, and I think it's now clear that it is 100%
>> necessary).

> The attached patch changes walsender so that it always sends WAL up to
> LogwrtResult.Flush instead of LogwrtResult.Write.

Applied, along with some minor comment improvements of my own.

regards, tom lane