From: Bruce Momjian on
Jan Wieck wrote:
> >> Reading the entire WAL just to find all COMMIT records, then go back to
> >> the origin database to get the actual replication log you're looking for
> >> is simpler and more efficient? I don't think so.
> >
> > Agreed, but I think I've not explained myself well enough.
> >
> > I proposed two completely separate ideas; the first one was this:
> >
> > If you must get commit order, get it from WAL on *origin*, using exact
> > same code that current WALSender provides, plus some logic to read
> > through the WAL records and extract commit/aborts. That seems much
> > simpler than the proposal you outlined and as SR shows, its low latency
> > as well since commits write to WAL. No need to generate event ticks
> > either, just use XLogRecPtrs as WALSender already does.
> >
> > I see no problem with integrating that into core, technically or
> > philosophically.
> >
>
> Which means that if I want to allow a consumer of that commit order data
> to go offline for three days or so to replicate the 5 requested, low
> volume tables, the origin needs to hang on to the entire WAL log from
> all 100 other high volume tables?

I suggest writing an external tool that strips out what you need that
can be run at any time, rather than creating a new data format and
overhead for this usecase.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On May 28, 2010, at 7:19 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
> Jan Wieck wrote:
>>>> Reading the entire WAL just to find all COMMIT records, then go back to
>>>> the origin database to get the actual replication log you're looking for
>>>> is simpler and more efficient? I don't think so.
>>>
>>> Agreed, but I think I've not explained myself well enough.
>>>
>>> I proposed two completely separate ideas; the first one was this:
>>>
>>> If you must get commit order, get it from WAL on *origin*, using exact
>>> same code that current WALSender provides, plus some logic to read
>>> through the WAL records and extract commit/aborts. That seems much
>>> simpler than the proposal you outlined and as SR shows, its low latency
>>> as well since commits write to WAL. No need to generate event ticks
>>> either, just use XLogRecPtrs as WALSender already does.
>>>
>>> I see no problem with integrating that into core, technically or
>>> philosophically.
>>>
>>
>> Which means that if I want to allow a consumer of that commit order data
>> to go offline for three days or so to replicate the 5 requested, low
>> volume tables, the origin needs to hang on to the entire WAL log from
>> all 100 other high volume tables?
>
> I suggest writing an external tool that strips out what you need that
> can be run at any time, rather than creating a new data format and
> overhead for this usecase.

That would be FAR more complex, less robust, and less performant -
whereas doing what Jan has proposed is pretty straightforward and
should have minimal impact on performance - or none when not enabled.

....Robert


From: Jan Wieck on
On 5/28/2010 7:19 PM, Bruce Momjian wrote:
> Jan Wieck wrote:
>> >> Reading the entire WAL just to find all COMMIT records, then go back to
>> >> the origin database to get the actual replication log you're looking for
>> >> is simpler and more efficient? I don't think so.
>> >
>> > Agreed, but I think I've not explained myself well enough.
>> >
>> > I proposed two completely separate ideas; the first one was this:
>> >
>> > If you must get commit order, get it from WAL on *origin*, using exact
>> > same code that current WALSender provides, plus some logic to read
>> > through the WAL records and extract commit/aborts. That seems much
>> > simpler than the proposal you outlined and as SR shows, its low latency
>> > as well since commits write to WAL. No need to generate event ticks
>> > either, just use XLogRecPtrs as WALSender already does.
>> >
>> > I see no problem with integrating that into core, technically or
>> > philosophically.
>> >
>>
>> Which means that if I want to allow a consumer of that commit order data
>> to go offline for three days or so to replicate the 5 requested, low
>> volume tables, the origin needs to hang on to the entire WAL log from
>> all 100 other high volume tables?
>
> I suggest writing an external tool that strips out what you need that
> can be run at any time, rather than creating a new data format and
> overhead for this usecase.
>

Stripping it out from what?


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Bruce Momjian on
Jan Wieck wrote:
> >> > I see no problem with integrating that into core, technically or
> >> > philosophically.
> >> >
> >>
> >> Which means that if I want to allow a consumer of that commit order data
> >> to go offline for three days or so to replicate the 5 requested, low
> >> volume tables, the origin needs to hang on to the entire WAL log from
> >> all 100 other high volume tables?
> >
> > I suggest writing an external tool that strips out what you need that
> > can be run at any time, rather than creating a new data format and
> > overhead for this usecase.
> >
>
> Stripping it out from what?

Stripping it from the WAL. Your system seems to require double-writes
on a commit, which is something we have avoided in the past.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +


From: Jan Wieck on
On 6/1/2010 11:09 AM, Bruce Momjian wrote:
> Jan Wieck wrote:
>> >> > I see no problem with integrating that into core, technically or
>> >> > philosophically.
>> >> >
>> >>
>> >> Which means that if I want to allow a consumer of that commit order data
>> >> to go offline for three days or so to replicate the 5 requested, low
>> >> volume tables, the origin needs to hang on to the entire WAL log from
>> >> all 100 other high volume tables?
>> >
>> > I suggest writing an external tool that strips out what you need that
>> > can be run at any time, rather than creating a new data format and
>> > overhead for this usecase.
>> >
>>
>> Stripping it out from what?
>
> Stripping it from the WAL. Your system seems to require double-writes
> on a commit, which is something we have avoided in the past.
>

Your suggestion seems to be based on several false assumptions. This
neither requires additional physical writes on commit, nor is consuming
the entire WAL just to filter out commit records anything even remotely
desirable for systems like Londiste or Slony.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
