From: Selena Deckelmann on
Hi!

On Mon, Jan 25, 2010 at 1:34 PM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> Josh Berkus wrote:
>>
>> We discussed this issue at LCA where I encountered these bogus error
>> messages when I was doing the demo of HS.  I consider Selena's patch to
>> be a bug-fix for beta of 9.0, not a feature.  Currently the database
>> reports a lot of false error messages when running in standby mode, and
>> once we have 1000's more users using standby mode, we're going to see a
>> lot of related confusion.
>>
>
> Does anyone have a complete list of the false error messages and what
> context they show up in so that a proper test case could be constructed?

They aren't technically "false". They result from the function call
attempting to copy files that do not exist. It's not a big deal
functionality-wise, but the copy is retried a few times. The stat call
"fixes" that. I could do a bit more there with the error result, but didn't.
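
[Editor's sketch] The stat-before-copy idea can be illustrated roughly as follows. This is written in Python for brevity rather than pg_standby's C, and it is not the patch itself: the function name `restore_segment` and its return convention are assumptions made for illustration only.

```python
import os
import shutil
import tempfile


def restore_segment(archive_path, dest_path):
    """Copy one archived file if it exists.

    Returns True when the file was copied, False when it is not there.
    (Hypothetical helper; not taken from pg_standby.)
    """
    try:
        # Check for the file first, so a missing file is never handed to
        # the copy step and never produces a copy error or a retry.
        os.stat(archive_path)
    except FileNotFoundError:
        return False
    shutil.copyfile(archive_path, dest_path)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "not-archived-yet")
        print(restore_segment(missing, os.path.join(tmp, "out")))
```

The point of the design is that "file not present" becomes an ordinary return value the caller can log at a low verbosity, while genuine copy failures still raise.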

I can scan through the code tonight and look for other cases where
this might be an issue. The main thing I'm looking for is to
distinguish between harmful and non-harmful errors.

> I extracted some pg_standby changes from Simon last week that have some
> overlap with Selena's patch (better logging, remove bogus link feature,
> throw out fewer false error messages).  I'm not quite ready to submit
> anything here just yet, I'm going to break that into more targeted patches,
> but I will be later this week.  I share the concern here that some of these
> issues are annoying enough to be considered bugs, and I need to fix them
> regardless of whether anybody else does.

The stat issue is one of those issues for users that makes them think:
"this looks like an error, and I've never done this before. Maybe
there is SOMETHING WRONG!"

I included the progress/logging/verbosity changes so that the errors
were still generated but were definitely flagged as 'debugging' and
'probably not an issue'. :)

> I'd be happy to work with Selena
> as a review pair here, to knock out the worst of the problems on this

Sweet. I, too, would love to work with you to get this fancied/cleaned up.

> program, now that the use-case for it should be more popular.  pg_standby
> could use a bit of an upgrade based on the rough edges found by all its
> field tests, most of which is in error handling and logging.  I don't have
> anything like her stat check in what I'm working on, so there's certainly
> useful stuff uniquely in each patch.

Thanks!

>> * Could we just re-use '-l' for logging?
>
> The patch I'm working on adds "-v verbosity" so that logging can be a bit
> more fine-grained than that even.  Having both debug and a progress report
> boolean can then get folded into a single verbosity level, rather than
> maintain two similar paths.  Just make debug equal to the highest verbosity
> and maybe start deprecating that switch altogether.
>
> One reason I'm not quite ready to submit what I've got yet is that I want to
> unify things better here.  I think that I'd prefer to use the same
> terminology as log_min_messages for the various options, and make a macro
> wrapper like ELOG that this code uses instead of all these terrible direct
> fprintf([stderr|stdout]... calls.

Yes, a wrapper with timestamps is desperately needed.
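
[Editor's sketch] A single logging wrapper with a timestamp and one verbosity threshold might look roughly like this. It is in Python rather than C, and everything named here (`LEVELS`, `format_line`, `log`, the level names and their ordering) is an assumption for illustration, only loosely modelled on the log_min_messages terminology mentioned above.

```python
import sys
import time

# Hypothetical level names and ordering; lower numbers are more verbose.
LEVELS = {"DEBUG": 0, "LOG": 1, "WARNING": 2, "ERROR": 3}


def format_line(level, message, now=None):
    """Build one log line prefixed with a local timestamp."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"{stamp} {level}: {message}"


def log(level, message, min_level="LOG", stream=sys.stderr, now=None):
    """Write the message if its level reaches min_level.

    Returns True if a line was written, False if it was filtered out.
    """
    if LEVELS[level] < LEVELS[min_level]:
        return False
    stream.write(format_line(level, message, now) + "\n")
    return True
```

With one threshold, the old debug switch simply maps to `min_level="DEBUG"`, and a progress report is just a message at an intermediate level, so there is only one code path to maintain.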

>> * Is there a way to get a non-module to use the ereport/elog system?
>
> And that work would make this transition easier to make, too, if it became
> feasible.  I fear that's outside of the scope of what anyone wants to touch
> at this point though.

Sure thing. I scanned what was in contrib and didn't see anything I
could crib in there. Was just throwing it out there in case someone had
already done it.

-selena

--
http://chesnok.com/daily - me
http://endpoint.com - work

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Greg Smith <greg(a)2ndquadrant.com> writes:
> [ Greg and Selena discuss filing some rough edges off pg_standby ]

Maybe I'm missing something, but I thought pg_standby would be mostly
dead once SR hits the streets. Is it worth spending lots of time on?

The ideas all sound good, I'm just wondering if it's useful effort
at this point.

regards, tom lane

From: Fujii Masao on
On Tue, Jan 26, 2010 at 9:08 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> Just committed a fix: the server no longer requests 00000001.history
> at start of archive recovery.

Good.

And I think that writeTimeLineHistory() should also skip requesting
00000001.history. Here is the patch to do so. Comments?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Magnus Hagander on
2010/1/26 Tom Lane <tgl(a)sss.pgh.pa.us>:
> Greg Smith <greg(a)2ndquadrant.com> writes:
>> [ Greg and Selena discuss filing some rough edges off pg_standby ]
>
> Maybe I'm missing something, but I thought pg_standby would be mostly
> dead once SR hits the streets.  Is it worth spending lots of time on?
>
> The ideas all sound good, I'm just wondering if it's useful effort
> at this point.

I think there are definite use-cases for pg_standby as well, even when
we have SR. SR requires you to have a reasonably reliable network
connection that lets you do an arbitrary TCP connection. There are a
lot of scenarios that could still use the
"here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
transfer, and that don't necessarily care that there is a longer
delay.

*Most* people will still use SR, I'm sure.


--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

From: Magnus Hagander on
2010/1/26 Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com>:
> Magnus Hagander wrote:
>> I think there are definite use-cases for pg_standby as well, even when
>> we have SR. SR requires you to have a reasonably reliable network
>> connection that lets you do an arbitrary TCP connection. There are a
>> lot of scenarios that could still use the
>> "here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
>> transfer, and that don't necessarily care that there is a longer
>> delay.
>
> With the changes to the retry-logic that were discussed (see
> http://archives.postgresql.org/message-id/4B5758ED.1060703(a)enterprisedb.com,
> I intend to commit that tomorrow), if standby_mode=on, the server will
> keep retrying to restore the next segment using restore_command until
> it's found, or the trigger file is found.
>
> *That* makes pg_standby obsolete, not streaming replication per se.
> Setting standby_mode=on, with a valid restore_command using e.g. 'cp' and
> no connection info for walreceiver is more or less the same as using
> pg_standby.

Ah, ok, missed that. So it basically folds pg_standby into the
backend. In *that* case, I can see how pg_standby would be obsolete.
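
[Editor's sketch] The retry behaviour described above (keep trying to restore the next segment until it is found or the trigger file is found) can be pictured as a simple loop. This is Python, not the server's code; `wait_for_segment`, its callable arguments, the polling interval, and the order of the two checks are all assumptions made for illustration.

```python
import time


def wait_for_segment(try_restore, trigger_exists, interval=5.0, sleep=time.sleep):
    """Loop until the segment is restored or the trigger file appears.

    try_restore:    callable returning True once the segment was restored
                    (stands in for running restore_command)
    trigger_exists: callable returning True once the trigger file is present
    Returns "restored" or "triggered".
    """
    while True:
        if try_restore():
            return "restored"
        if trigger_exists():
            return "triggered"
        # Nothing yet: wait and poll again.
        sleep(interval)
```

In this picture the waiting that pg_standby used to do lives in the caller, so the restore step itself can be a plain one-shot copy.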

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
