From: Tom Lane on 28 Jan 2010 13:05

Simon Riggs <simon(a)2ndQuadrant.com> writes:
> On Thu, 2010-01-28 at 11:41 -0500, Tom Lane wrote:
>> FWIW, I don't agree with that prioritization in the least. Cascading
>> is something we could leave till 9.1, or even later, and

> Not what you said just a few days ago.

Me? I don't recall having said a word about cascading before.

> I'm a little worried the feature set of streaming rep isn't any better
> than what we have already.

Nonsense. Getting rid of the WAL-segment-based shipping delays is a
quantum improvement --- it means we actually have something approaching
real-time replication, which was really impractical before. Whether you
can feed slaves indirectly is just a minor administration detail. Yeah,
I know in some situations it could be helpful for performance, but
it's not even in the same ballpark of must-have-ness.

(Anyway, the argument that it's important for performance is pure
speculation AFAIK, untainted by any actual measurements. Given the lack
of optimization of WAL replay, it seems entirely possible that the last
thing you want to burden a slave with is sourcing data to more slaves.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
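To put a rough number on the segment-based shipping delay discussed above, here is a minimal sketch. It assumes the default 16 MB segment size and that a segment is shipped only once it is full or once archive_timeout forces a switch; the helper name and the example write rate are hypothetical, and copy and replay time are ignored.

```python
def worst_case_shipping_delay(wal_bytes_per_sec,
                              segment_size=16 * 1024 * 1024,
                              archive_timeout=None):
    """Upper bound, in seconds, on how long a WAL record can wait before
    its segment is complete and eligible for shipping.

    Hypothetical helper for illustration; ignores copy and replay time.
    """
    if wal_bytes_per_sec > 0:
        fill_time = segment_size / wal_bytes_per_sec
    else:
        fill_time = float("inf")  # an idle server never fills the segment
    if archive_timeout:  # forced segment switch, in seconds
        return min(fill_time, archive_timeout)
    return fill_time


# A server writing 64 kB/s of WAL, with no archive_timeout set:
print(worst_case_shipping_delay(64 * 1024))  # 16 MB / 64 kB/s = 256.0 seconds
```

On a lightly loaded server the bound is dominated by archive_timeout, which is why segment-based shipping is hard to push anywhere near real time.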
From: Simon Riggs on 28 Jan 2010 13:29

On Thu, 2010-01-28 at 13:05 -0500, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > On Thu, 2010-01-28 at 11:41 -0500, Tom Lane wrote:
> >> FWIW, I don't agree with that prioritization in the least. Cascading
> >> is something we could leave till 9.1, or even later, and
>
> > Not what you said just a few days ago.
>
> Me? I don't recall having said a word about cascading before.

Top of this thread.

> > I'm a little worried the feature set of streaming rep isn't any better
> > than what we have already.
>
> Nonsense. Getting rid of the WAL-segment-based shipping delays is a
> quantum improvement --- it means we actually have something approaching
> real-time replication, which was really impractical before. Whether you
> can feed slaves indirectly is just a minor administration detail. Yeah,
> I know in some situations it could be helpful for performance, but
> it's not even in the same ballpark of must-have-ness.

FWIW, streaming has been possible and actively used since 8.2.

> (Anyway, the argument that it's important for performance is pure
> speculation AFAIK, untainted by any actual measurements. Given the lack
> of optimization of WAL replay, it seems entirely possible that the last
> thing you want to burden a slave with is sourcing data to more slaves.)

Separate processes, separate CPUs, no problem. If WAL replay used more
CPUs you might be right, but it doesn't yet, so same argument opposite
conclusion.

--
Simon Riggs www.2ndQuadrant.com
From: Greg Smith on 28 Jan 2010 13:37

Tom Lane wrote:
> (Anyway, the argument that it's important for performance is pure
> speculation AFAIK, untainted by any actual measurements. Given the lack
> of optimization of WAL replay, it seems entirely possible that the last
> thing you want to burden a slave with is sourcing data to more slaves.)

On any typical production hardware, the work of WAL replay is going to
leave at least one (and probably more) CPUs idle, and have plenty of
network resources to spare too because it's just shuffling WAL in/out
rather than dealing with so many complicated client conversations. And
the thing you want to redistribute--the WAL file--is practically
guaranteed to be sitting in the OS cache at the point where you'd be
doing it, so no disk use either. You'll disrupt a little bit of
memory/CPU cache, sure, but that's about it as far as leeching resources
from the main replay in order to support the secondary slave. I'll
measure it fully the next time I have one set up, to give some hard
numbers; I've never seen it rise to the point where it was worth
worrying about, so I haven't bothered before.

Anyway, I think what Simon was trying to suggest was that it's possible
right now to ship partial WAL files over as they advance, if you monitor
pg_xlogfile_name_offset and are willing to coordinate copying chunks
over. That basic idea is even built already--the Skytools walmgr deals
with partial WALs, for example. Having all that built into the server
with a nicer UI is awesome, but it's been possible to build something
with the same basic feature set since 8.2. Getting that going with a
chain of downstream slaves is not so easy though, so there's something
that I think would be unique to the 9.0 implementation.
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.com
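The partial-WAL shipping idea in the message above (poll pg_xlogfile_name_offset, copy only the new bytes) can be sketched roughly as follows. This is not how walmgr or any other existing tool is implemented; the function names are hypothetical, the 16 MB segment size is the default and an assumption, and the sketch assumes polling is frequent enough that at most one segment switch happens between polls.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size (assumption)


def plan_chunks(prev, cur, segment_size=SEGMENT_SIZE):
    """Return (file_name, start, end) byte ranges written since `prev`.

    `prev` and `cur` are (file_name, file_offset) pairs, the shape returned
    by SELECT * FROM pg_xlogfile_name_offset(pg_current_xlog_location()).
    Assumes at most one segment switch between the two polls.
    """
    prev_name, prev_off = prev
    cur_name, cur_off = cur
    if prev_name == cur_name:
        return [(cur_name, prev_off, cur_off)] if cur_off > prev_off else []
    chunks = []
    if prev_off < segment_size:
        # Finish the tail of the segment that was current at the last poll.
        chunks.append((prev_name, prev_off, segment_size))
    if cur_off > 0:
        # Start the new segment from its beginning.
        chunks.append((cur_name, 0, cur_off))
    return chunks


def copy_range(src_path, dst_path, start, end):
    """Copy bytes [start, end) of src_path to the same offsets in dst_path."""
    mode = "r+b" if os.path.exists(dst_path) else "wb"
    with open(src_path, "rb") as src, open(dst_path, mode) as dst:
        src.seek(start)
        dst.seek(start)
        dst.write(src.read(end - start))
```

A polling loop would remember the last pair it shipped, call plan_chunks with the freshly queried pair, and run copy_range (or a network equivalent) for each range. Feeding a second standby from the first would need the same coordination repeated at every hop, which is the part described above as not so easy.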
From: Heikki Linnakangas on 28 Jan 2010 13:40

Tom Lane wrote:
> Fujii Masao <masao.fujii(a)gmail.com> writes:
>> How about just making a restore_command copy the WAL files as the
>> normal one (e.g., 0000...) instead of a pg_xlog/RECOVERYXLOG?
>> Though we need to worry about deleting them, we can easily leave
>> the task to the bgwriter.
>
> The reason for doing it that way was to limit disk space usage during
> a long restore. I'm not convinced we can leave the task to the bgwriter
> --- it shouldn't be deleting anything at that point.

That has been changed already. In standby mode, bgwriter does delete old
WAL files when it performs a restartpoint. Otherwise the streamed WAL
files will keep accumulating and eventually fill the disk.

It works as it is, but having a sandbox dedicated for restored/streamed
files in pg_xlog/restored, instead of messing with pg_xlog directly,
would make me feel a bit easier about it. There's less potential for
damage in case of bugs if they're separate.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
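As a rough illustration of the restartpoint cleanup described above, the sketch below selects the WAL segment files in a directory listing that are older than the segment a restartpoint still needs. It is not the server's code: the function name is hypothetical, comparing only the log/segment part of the 24-character name (skipping the 8-character timeline prefix) is an assumption, and a real server may recycle old segments rather than delete them.

```python
import re

# WAL segment files are named with 24 uppercase hex digits:
# 8 for the timeline, 16 for the log/segment position.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(file_names, restartpoint_segment):
    """Return WAL segment names that sort before the restartpoint's segment.

    Non-segment files (history files, RECOVERYXLOG, ...) are never selected.
    """
    cutoff = restartpoint_segment[8:]
    return sorted(
        name for name in file_names
        if WAL_NAME.match(name) and name[8:] < cutoff
    )


names = [
    "000000010000000000000003",
    "000000010000000000000004",
    "000000010000000000000005",
    "RECOVERYXLOG",
    "00000001.history",
]
print(removable_segments(names, "000000010000000000000005"))
# ['000000010000000000000003', '000000010000000000000004']
```

Keeping restored or streamed files in a separate directory, as suggested above, would confine a bug in a selection rule like this one to files that can be fetched again.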
From: "Joshua D. Drake" on 28 Jan 2010 13:41

On Thu, 2010-01-28 at 13:05 -0500, Tom Lane wrote:
> > I'm a little worried the feature set of streaming rep isn't any better
> > than what we have already.
>
> Nonsense. Getting rid of the WAL-segment-based shipping delays is a
> quantum improvement --- it means we actually have something approaching
> real-time replication, which was really impractical before.

SR does not give us anything like replication. Replication implies an
ability to read from the Slave. That is HS only territory.

From what I read on the wiki, SR doesn't give us anything that
PostgreSQL + PITRTools doesn't already give us. And PITR Tools works as
far back as 8.1 (although I would suggest 8.2+).

One thing I am unclear on is whether, with SR, the entire log must be
written before it streams to the slaves. If the entire log does not need
to be written, then that is one up on PITRTools, in that we have to wait
for archive_command to execute.

> (Anyway, the argument that it's important for performance is pure
> speculation AFAIK, untainted by any actual measurements. Given the lack
> of optimization of WAL replay, it seems entirely possible that the last
> thing you want to burden a slave with is sourcing data to more slaves.)

I agree. WAL replay as a whole is a bottleneck. As it stands now (I
don't know about 8.5), replay is a large bottleneck on keeping the
warm-standby up to date.

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use or Mr. or Sir.