From: Jan Wieck on
On 6/4/2010 12:52 PM, Alvaro Herrera wrote:
> Excerpts from Jan Wieck's message of Thu Jun 03 19:52:19 -0400 2010:
>> On 6/3/2010 7:11 PM, Alvaro Herrera wrote:
>
>> > Why not send separate numbers of tuple inserts/updates/deletes, which we
>> > already have from pgstats?
>>
>> We only have them for the entire database. The purpose of this is just a
>> guesstimate of what data volume to expect if I were to select all of the
>> log from a particular transaction.
>
> But we already have per table counters. Couldn't we aggregate them per
> transaction as well, if this feature is enabled? I'm guessing that this
> is going to have some uses besides Slony; vague measurements could turn
> out to be unusable for some of these.

We have them per table and per index, summarized over all transactions.
It is debatable if bloating this feature with detailed statistics is
useful or not, but I'd rather not have that bloat at the beginning,
because otherwise I know exactly what is going to happen. People will
just come back and say "zero impact my a..".


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Marko Kreen on
On 6/4/10, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Fri, Jun 4, 2010 at 10:44 AM, Greg Stark <gsstark(a)mit.edu> wrote:
> > A function which takes a starting xid and a number of transactions to
> > return seems very tied to one particular application. I could easily
> > see other systems such as a multi-master system instead only wanting
> > to compare two transactions to find out which committed first. Or
> > non-replication applications where you have an LSN and want to know
> > whether a given transaction had committed by that time.
> >
> > So one possible interface would be to do something like
> > xids_committed_between(lsn_start, lsn_end) -- and yes, possibly with
> > an optional argument to limit the number of records returned.
>
>
> I'm imagining that the backend data storage for this would be a file
> containing, essentially, a struct for each commit repeated over and
> over again, packed tightly. It's easy to index into such a file using
> a sequence number (give me the 1000'th commit) but searching by LSN
> would require (a) storing the LSNs and (b) binary search. Maybe it's
> worth adding that complexity, but I'm not sure that it is. Keeping
> the size of this file small is important for ensuring that it has
> minimal performance impact (which is also why I'm not sold on trying
> to include the tuple counters that Jan proposed - I think we can solve
> the problem he's worried about there more cleanly in other ways).

AIUI, you index the file by offset.
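
A minimal sketch of that idea, purely for illustration. The record layout (xid, commit LSN, commit timestamp), the helper names, the half-open LSN range and the meaning of the limit argument are all assumptions made here, not anything taken from the proposal; only the name xids_committed_between comes from Greg's message. It shows that the n-th commit is a plain offset computation, while a lookup by LSN needs the LSN stored in each record plus a binary search.

```python
import io
import struct

# Hypothetical record layout: 32-bit xid, 64-bit commit LSN, 64-bit commit
# timestamp in microseconds. "<" means little-endian with no padding.
RECORD = struct.Struct("<IQq")


def append_commit(f, xid, lsn, ts_usec):
    """Append one commit record to the end of the log."""
    f.seek(0, io.SEEK_END)
    f.write(RECORD.pack(xid, lsn, ts_usec))


def commit_count(f):
    """Number of complete records currently in the file."""
    return f.seek(0, io.SEEK_END) // RECORD.size


def nth_commit(f, n):
    """Fetch the n-th commit (0-based) by computing its byte offset."""
    f.seek(n * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))


def first_at_or_after(f, lsn):
    """Binary search for the first record whose LSN is >= lsn.

    Relies on records being appended in commit order, so LSNs ascend.
    """
    lo, hi = 0, commit_count(f)
    while lo < hi:
        mid = (lo + hi) // 2
        if nth_commit(f, mid)[1] < lsn:
            lo = mid + 1
        else:
            hi = mid
    return lo


def xids_committed_between(f, lsn_start, lsn_end, limit=None):
    """Return xids with lsn_start <= commit LSN < lsn_end, in commit order."""
    start = first_at_or_after(f, lsn_start)
    stop = first_at_or_after(f, lsn_end)
    if limit is not None:
        stop = min(stop, start + limit)
    return [nth_commit(f, i)[0] for i in range(start, stop)]
```

With a layout like this, dropping the LSN field would shrink each record but leave only the sequence-number lookup, which is the trade-off Robert describes.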

> >> I think
> >> we should be very careful about assuming that we understand
> >> replication and its needs better than someone who has spent many years
> >> developing one of the major PostgreSQL replication solutions.
> >
> > Well the flip side of that is that we want an interface that's useful
> > for more than just one replication system. This is something basic
> > enough that I think it will be useful for more than just replication
> > if we design it generally enough. It should be useful for
> > backup/restore processes and monitoring as well as various forms of
> > replication including master-slave trigger based systems but also
> > including PITR-based replication, log-parsing systems, multi-master
> > trigger based systems, 2PC-based systems, etc.
>
>
> Making it general enough to serve multiple needs is good, but we've
> got to make sure that the extra complexity is buying us something.
> Jan seems pretty confident that this could be used by Londiste also,
> though it would be nice to have some confirmation from the Londiste
> developer(s) on that. I think it may also have applications for
> distributed transactions and multi-master replication, but I am not
> too sure it helps much for PITR-based replication or log-parsing
> systems. We want to design something that is good, but trying to
> solve too many problems may end up solving none of them well.

The potential for a single shared queue implementation, with
the additional potential for merging the async replication
implementations, sounds attractive. (Merging ~ having a
single one that satisfies a broad range of needs.)

That is, unless the functionality accepted into core is limited
to replication only and/or performs worse than the current
snapshot-based grouping. Then it is uninteresting, of course.

Jan's proposal of storing a small struct in segmented files
sounds like it could work. Can't say anything more because
I can't imagine it as well as Jan. Would need to play with
a working implementation to say more...

--
marko


From: Alvaro Herrera on
Excerpts from Marko Kreen's message of Thu Jun 10 18:10:50 -0400 2010:

> Jan's proposal of storing a small struct in segmented files
> sounds like it could work. Can't say anything more because
> I can't imagine it as well as Jan. Would need to play with
> a working implementation to say more...

We already have such a thing -- see pg_multixact
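
A rough sketch of how a fixed-size record can be located in that kind of layout, loosely modeled on the page-and-segment scheme used by directories such as pg_multixact. The page size, pages per segment, record size, file naming and both function names are assumptions chosen for illustration, not values taken from the PostgreSQL source or from Jan's proposal.

```python
PAGE_SIZE = 8192          # assumed page size in bytes
PAGES_PER_SEGMENT = 32    # assumed number of pages per segment file
RECORD_SIZE = 16          # hypothetical size of one commit record

RECORDS_PER_PAGE = PAGE_SIZE // RECORD_SIZE


def locate(seqno):
    """Map a 0-based commit sequence number to (segment name, page, byte offset)."""
    page = seqno // RECORDS_PER_PAGE
    offset_in_page = (seqno % RECORDS_PER_PAGE) * RECORD_SIZE
    segment = page // PAGES_PER_SEGMENT
    page_in_segment = page % PAGES_PER_SEGMENT
    # Segment files are named by their number in hexadecimal (assumed format).
    return f"{segment:04X}", page_in_segment, offset_in_page


def removable_segments(oldest_needed_seqno):
    """Segment names that lie wholly before the oldest record still needed."""
    oldest_segment = (oldest_needed_seqno // RECORDS_PER_PAGE) // PAGES_PER_SEGMENT
    return [f"{s:04X}" for s in range(oldest_segment)]
```

The appeal of segmenting is that old data can be discarded by unlinking whole files once no consumer needs the records in them, without rewriting anything.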

--
Álvaro Herrera <alvherre(a)commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
