Streaming replication and non-blocking I/O [PgSql]

From: Fujii Masao on 13 Dec 2009 22:56

On Mon, Dec 14, 2009 at 11:38 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Do we need a new "PQgetXLogData" function at all? Seems like you could
> shove the data through the COPY protocol and not have to touch libpq
> at all, rather than duplicating a nontrivial amount of code there.

Yeah, I also think that all data (the WAL data itself, its LSN and
the flag bits) which the "PQgetXLogData" handles could be shoved
through the COPY protocol. But, outside libpq, it's somewhat messy
to extract the LSN and the flag bits from the data buffer which
"PQgetCopyData" returns, by using ntohs(). So I provided the new
libpq function only for replication. That is, I didn't want to expose
the low layer of network which libpq should handle.

I think that the friendly function would be useful to implement
the standby program (e.g., a stand-alone walreceiver tool) outside
the core.
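
To illustrate the kind of extraction described above, here is a minimal sketch in C of pulling flag bits and an LSN out of a raw buffer such as the one "PQgetCopyData" returns. The wire layout, the WalMsg struct and the function names are assumptions made for illustration only; they are not the format used by the patch. Byte order is decoded with explicit shifts, which has the same effect as ntohl() on a big-endian field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wire layout (an assumption, not the patch's format):
 *   byte 0      flag bits
 *   bytes 1-4   high 32 bits of the LSN, network byte order
 *   bytes 5-8   low 32 bits of the LSN, network byte order
 *   bytes 9..   WAL data
 */
#define WAL_MSG_HDR_LEN 9

typedef struct WalMsg
{
    uint8_t              flags;    /* flag bits from byte 0 */
    uint64_t             lsn;      /* LSN assembled from the two halves */
    const unsigned char *data;     /* points into the caller's buffer */
    size_t               datalen;  /* number of WAL data bytes */
} WalMsg;

/* Decode a 32-bit big-endian integer without relying on host byte order. */
static uint32_t
read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) |
           ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) |
           (uint32_t) p[3];
}

/* Returns 0 on success, -1 if the buffer is too short to hold the header. */
static int
parse_wal_msg(const unsigned char *buf, size_t len, WalMsg *out)
{
    assert(buf != NULL && out != NULL);

    if (len < WAL_MSG_HDR_LEN)
        return -1;

    out->flags = buf[0];
    out->lsn = ((uint64_t) read_be32(buf + 1) << 32) | read_be32(buf + 5);
    out->data = buf + WAL_MSG_HDR_LEN;
    out->datalen = len - WAL_MSG_HDR_LEN;
    return 0;
}
```

A caller would pass the buffer and length obtained from the COPY stream to parse_wal_msg() and free the buffer only after it has finished with out->data, since no copy is made.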

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 14 Dec 2009 09:33

Fujii Masao <masao.fujii(a)gmail.com> writes:
> On Mon, Dec 14, 2009 at 11:38 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> Do we need a new "PQgetXLogData" function at all? Seems like you could
>> shove the data through the COPY protocol and not have to touch libpq
>> at all, rather than duplicating a nontrivial amount of code there.

> Yeah, I also think that all data (the WAL data itself, its LSN and
> the flag bits) which the "PQgetXLogData" handles could be shoved
> through the COPY protocol. But, outside libpq, it's somewhat messy
> to extract the LSN and the flag bits from the data buffer which
> "PQgetCopyData" returns, by using ntohs(). So I provided the new
> libpq function only for replication. That is, I didn't want to expose
> the low layer of network which libpq should handle.

I find that a completely unconvincing division of labor. Who is to say
that the LSN is the only part of the data that needs special treatment?

The very, very large practical problem with this is that if you decide
to change the behavior at any time, the only way to be sure that the WAL
receiver is using the right libpq version is to perform a soname major
version bump. The transformations done by libpq will essentially become
part of its ABI, and not a very visible part at that.

I am going to insist that no such logic be placed in libpq. From a
packager's standpoint that's insanity.

regards, tom lane

From: Heikki Linnakangas on 14 Dec 2009 13:47

Tom Lane wrote:
> The very, very large practical problem with this is that if you decide
> to change the behavior at any time, the only way to be sure that the WAL
> receiver is using the right libpq version is to perform a soname major
> version bump. The transformations done by libpq will essentially become
> part of its ABI, and not a very visible part at that.

Not having to change the libpq API would certainly be a big advantage.

It's going to be a bit more complicated in walsender/walreceiver to work
with the libpq COPY API. We're going to need a WAL sending/receiving
protocol on top of it, defined in terms of rows and columns passed
through the COPY protocol.

One problem is that the standby is supposed to send back acknowledgments
to the master, telling it how far it has received/replayed the WAL. Is
there any way to send information back to the server, while a COPY OUT
is in progress? That's not absolutely necessary with asynchronous
replication, but will be with synchronous.

One idea is to stop/start the COPY between every batch of WAL records
sent, giving the client (= walreceiver) a chance to send messages back.
But that will lead to extra round trips.

BTW, something that's been bothering me a bit with this patch is that we
now have to link the backend with libpq. I don't see an immediate
problem with that, but I'm not a packager. Does anyone see a problem
with that?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Tom Lane on 14 Dec 2009 14:01

Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
> It's going to be a bit more complicated in walsender/walreceiver to work
> with the libpq COPY API. We're going to need a WAL sending/receiving
> protocol on top of it, defined in terms of rows and columns passed
> through the COPY protocol.

AFAIR, libpq knows essentially nothing of the data being passed through
COPY --- it just treats that as a byte stream. I think you can define
any data format you want, it doesn't need to look exactly like a COPY
of a table would. In fact it's probably a lot better if it DOESN'T
look like COPY data once it gets past libpq, so that you can check
that it is WAL and not COPY data.
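
As a rough illustration of that point, the sender could prefix each chunk with a marker byte that ordinary COPY text would not be expected to carry in that position, and the receiver could check it before interpreting the rest. The sketch below is only an assumption about how such a check might look; the tag value and both function names are hypothetical and not part of libpq or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker byte placed in front of every WAL chunk. */
#define WAL_MSG_TAG 'w'

/*
 * Copy the tag followed by the payload into dst.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t
wal_msg_wrap(unsigned char *dst, size_t dstlen,
             const unsigned char *payload, size_t paylen)
{
    assert(dst != NULL);

    if (dstlen == 0 || paylen > dstlen - 1)
        return 0;

    dst[0] = WAL_MSG_TAG;
    if (paylen > 0)
        memcpy(dst + 1, payload, paylen);
    return paylen + 1;
}

/* Returns 1 if the buffer starts with the WAL tag, 0 otherwise. */
static int
wal_msg_is_wal(const unsigned char *buf, size_t len)
{
    return len >= 1 && buf[0] == WAL_MSG_TAG;
}
```

Since libpq treats the COPY payload as an opaque byte stream, a check like this would live entirely in the sender and receiver code, which is the division of labor argued for above.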

> One problem is that the standby is supposed to send back acknowledgments
> to the master, telling it how far it has received/replayed the WAL. Is
> there any way to send information back to the server, while a COPY OUT
> is in progress? That's not absolutely necessary with asynchronous
> replication, but will be with synchronous.

Well, a real COPY would of course not stop to look for incoming
messages, but I don't think that's inherent in the protocol. You
would likely need some libpq adjustments so it didn't throw an error
when you tried that, but it would be a small and one-time adjustment.

> BTW, something that's been bothering me a bit with this patch is that we
> now have to link the backend with libpq. I don't see an immediate
> problem with that, but I'm not a packager. Does anyone see a problem
> with that?

Yeah, I have a problem with that. What's the backend doing with libpq?
It's not receiving this data, it's sending it.

regards, tom lane

From: Tom Lane on 14 Dec 2009 14:11

Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
> Tom Lane wrote:
>> Yeah, I have a problem with that. What's the backend doing with libpq?
>> It's not receiving this data, it's sending it.

> walreceiver is a postmaster subprocess too.

Hm. Perhaps it should be a loadable plugin and not hard-linked into the
backend? Compare dblink.

The main concern I have with hard-linking libpq is that it has a lot of
symbol conflicts with the backend --- and at least the ones from
src/port/ aren't easily removed. I foresee problems that will be very
difficult to fix on platforms where we can't filter the set of link
symbols exposed by libpq. Linking a thread-enabled libpq into the
backend could also create problems on some platforms --- it would likely
cause a thread-capable libc to get linked, which is not what we want.

regards, tom lane
