Streaming replication and non-blocking I/O [PgSql]

From: Heikki Linnakangas on 14 Jan 2010 05:04

Fujii Masao wrote:
> On Wed, Jan 13, 2010 at 7:27 PM, Heikki Linnakangas
> <heikki.linnakangas(a)enterprisedb.com> wrote:
>> the frontend always puts the
>> connection to non-blocking mode, while the backend uses blocking mode.
>
> Really? By default (i.e., without explicitly setting it with
> PQsetnonblocking()), the connection is set to blocking mode even
> in the frontend. Am I missing something?

That's right. The underlying socket is always put to non-blocking mode
in libpq. PQsetnonblocking() only affects whether libpq commands wait
and retry if the output buffer is full.

>> At least with SSL, I think it's possible for pq_wait() to return false
>> positives, if the SSL layer decides to renegotiate the connection
>> causing data to flow in the other direction in the underlying TCP
>> connection. A false positive would cause walsender to block
>> indefinitely on the pq_getbyte() call.
>
> Sorry. I could not understand that issue scenario. Could you explain
> it in more detail?

1. Walsender calls pq_wait() which calls select(), waiting for timeout,
or data to become available for reading in the underlying socket.

2. Client issues an SSL renegotiation by sending a message to the server.

3. Server receives the message, and select() returns, indicating that
data has arrived.

4. Walsender calls HandleEndOfRep(), which calls pq_getbyte().
pq_getbyte() ends up in SSL_read(), which receives the renegotiation
message and handles it. No application data has arrived, however, so
SSL_read() blocks waiting for some to arrive. It never does.

I don't understand enough of SSL to know if renegotiation can actually
happen like that, but the man page of SSL_read() suggests so. But a
similar thing can happen if an SSL record is broken into two TCP
packets. select() returns immediately as the first packet arrives, but
SSL_read() will block until the 2nd packet arrives.
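
The split-record case can be illustrated without SSL at all. The sketch
below is only a stand-in under stated assumptions: a plain Unix-domain
socket pair replaces the TCP connection, a made-up 10-byte "record"
replaces an SSL record, and the helper names (fd_is_readable,
bytes_available, demo_partial_record) are hypothetical. It shows that
select() reports the socket readable as soon as the first fragment
arrives, even though the full record is not there yet, which is the
situation in which a blocking record-oriented read would stall.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if select() reports fd readable right now (zero timeout). */
int
fd_is_readable(int fd)
{
    fd_set          rfds;
    struct timeval  tv = {0, 0};

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}

/*
 * Peeks at up to 'want' bytes without blocking and without consuming
 * them. Returns the number of bytes currently available, or -1.
 */
ssize_t
bytes_available(int fd, size_t want)
{
    char buf[64];

    assert(want > 0);
    if (want > sizeof(buf))
        want = sizeof(buf);
    return recv(fd, buf, want, MSG_PEEK | MSG_DONTWAIT);
}

/*
 * Sends only the first 4 bytes of a hypothetical 10-byte record.
 * Returns 1 if select() says "readable" while the record is still
 * incomplete, 0 otherwise, -1 on setup failure.
 */
int
demo_partial_record(void)
{
    const char  record[] = "0123456789";    /* hypothetical record */
    int         sv[2];
    int         readable;
    ssize_t     avail;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[0], record, 4) != 4)
    {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    readable = fd_is_readable(sv[1]);       /* first fragment arrived */
    avail = bytes_available(sv[1], 10);     /* but not the whole record */

    close(sv[0]);
    close(sv[1]);
    return readable && avail == 4;
}
```

A reader that trusts select() and then issues a blocking read for the
whole record would wait here for the remaining 6 bytes.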

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Magnus Hagander on 14 Jan 2010 05:09

2010/1/14 Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com>:
> Fujii Masao wrote:
>> On Wed, Jan 13, 2010 at 7:27 PM, Heikki Linnakangas
>> <heikki.linnakangas(a)enterprisedb.com> wrote:
>>> the frontend always puts the
>>> connection to non-blocking mode, while the backend uses blocking mode.
>>
>> Really? By default (i.e., without explicitly setting it with
>> PQsetnonblocking()), the connection is set to blocking mode even
>> in the frontend. Am I missing something?
>
> That's right. The underlying socket is always put to non-blocking mode
> in libpq. PQsetnonblocking() only affects whether libpq commands wait
> and retry if the output buffer is full.
>
>>> At least with SSL, I think it's possible for pq_wait() to return false
>>> positives, if the SSL layer decides to renegotiate the connection
>>> causing data to flow in the other direction in the underlying TCP
>>> connection. A false positive would cause walsender to block
>>> indefinitely on the pq_getbyte() call.
>>
>> Sorry. I could not understand that issue scenario. Could you explain
>> it in more detail?
>
> 1. Walsender calls pq_wait() which calls select(), waiting for timeout,
> or data to become available for reading in the underlying socket.
>
> 2. Client issues an SSL renegotiation by sending a message to the server.
>
> 3. Server receives the message, and select() returns, indicating that
> data has arrived.
>
> 4. Walsender calls HandleEndOfRep(), which calls pq_getbyte().
> pq_getbyte() ends up in SSL_read(), which receives the renegotiation
> message and handles it. No application data has arrived, however, so
> SSL_read() blocks waiting for some to arrive. It never does.
>
> I don't understand enough of SSL to know if renegotiation can actually
> happen like that, but the man page of SSL_read() suggests so. But a
> similar thing can happen if an SSL record is broken into two TCP
> packets. select() returns immediately as the first packet arrives, but
> SSL_read() will block until the 2nd packet arrives.

I *think* renegotiation happens based on amount of content, not amount
of time. But it could still happen in corner cases, I think. If the
renegotiation happens right after a complete packet has been sent
(which would be the logical place), but not fast enough that the SSL
library gets it in one read() from the socket, you could end up in
that situation. (If the SSL library gets the renegotiation request as
part of the first read(), it would probably do the renegotiation
before returning from that call to SSL_read(), in which case the
socket would be in the correct state before you call select().)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

From: Fujii Masao on 14 Jan 2010 07:46

On Thu, Jan 14, 2010 at 9:14 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> After reading up on SSL_read() and SSL_pending(), it seems that there is
> unfortunately no reliable way of checking if there is incoming data that
> can be read using SSL_read() without blocking, short of putting the
> socket to non-blocking mode. It also seems that we can't rely on poll()
> returning POLLHUP if the remote end has disconnected; it's not doing
> that at least on my laptop.
>
> So, the only solution I can see is to put the socket to non-blocking
> mode. But to keep the change localized, let's switch to non-blocking
> mode only temporarily, just when polling to see if there's data to read
> (or EOF), and switch back immediately afterwards.

Agreed. I also read some pages about that issue, but I could not
find any better approach than temporarily switching the blocking
mode.

> I've added a pq_getbyte_if_available() function to pqcomm.c to do that.
> The API to the upper levels is quite nice, the function returns a byte
> if one is available without blocking. Only minimal changes are required
> elsewhere.

Great! Thanks a lot!
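
As a rough illustration of the approach quoted above (switch the
socket to non-blocking mode only while polling, then switch back), here
is a minimal sketch on a plain file descriptor. It is not the pqcomm.c
code: the name getbyte_if_available and its return convention are
assumptions made for this example, and no SSL layer is involved.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns 1 and stores one byte in *c if a byte
 * could be read without blocking, 0 if no data is available right
 * now, and -1 on EOF or error. The descriptor's original blocking
 * mode is restored before returning.
 */
int
getbyte_if_available(int fd, unsigned char *c)
{
    int     flags;
    int     saved_errno;
    ssize_t r;

    assert(c != NULL);

    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    /* Temporarily switch to non-blocking mode. */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    r = recv(fd, c, 1, 0);
    saved_errno = errno;

    /* Switch back immediately, whatever recv() reported. */
    if (fcntl(fd, F_SETFL, flags) < 0)
        return -1;

    if (r == 1)
        return 1;
    if (r < 0 && (saved_errno == EAGAIN ||
                  saved_errno == EWOULDBLOCK ||
                  saved_errno == EINTR))
        return 0;               /* nothing to read yet */
    return -1;                  /* r == 0 means EOF; otherwise error */
}
```

Saving and restoring the original flags, rather than clearing
O_NONBLOCK unconditionally, keeps the change localized: callers that
expect blocking behaviour elsewhere see the descriptor unchanged.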

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Heikki Linnakangas on 15 Jan 2010 14:10

Fujii Masao wrote:
> On Wed, Jan 13, 2010 at 3:37 AM, Magnus Hagander <magnus(a)hagander.net> wrote:
>>> This change which moves walreceiver process into a dynamically loaded
>>> module caused the following compile error on my MinGW environment.
>> That sounds strange - it should pick those up from the -lpostgres. Any
>> chance you have an old postgres binary around from a non-syncrep build
>> or something?
>
> No, there is no old postgres binary.
>
>> Do you have an environment to try to build it under msvc?
>
> No, unfortunately.
>
>> in my
>> experience, that gives you easier-to-understand error messages in a
>> lot of cases like this - it removes the mingw black magic.
>
> OK. I'll try to build it under msvc.
>
> But since there seems to be a long way to go before doing that,
> I would appreciate if someone could give me some advice.

It looks like dawn_bat is experiencing the same problem. I don't think
we want to sprinkle all those variables with PGDLLIMPORT, and it didn't
fix the problem for you earlier anyway. Is there some other way to fix this?

Do people still use MinGW for any real work? Could we just drop
walreceiver support from MinGW builds?

Or maybe we should consider splitting walreceiver into two parts after
all. Only the bare minimum that needs to access libpq would go into the
shared object, and the rest would be linked with the backend as usual.
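
To make the split idea concrete, here is one way it could look,
sketched under assumptions: the backend declares a couple of function
pointers, and the shared object that links libpq fills them in when it
is loaded. Every name below (rcv_connect_hook, rcv_receive_hook,
module_init and so on) is hypothetical, and the module functions are
stubs that do not call libpq.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook types for the part that needs libpq. */
typedef int (*rcv_connect_fn) (const char *conninfo);
typedef int (*rcv_receive_fn) (char *buf, size_t len);

/* Backend side: pointers the loadable module is expected to set. */
rcv_connect_fn rcv_connect_hook = NULL;
rcv_receive_fn rcv_receive_hook = NULL;

/* ---- would live in the shared object (stubbed here) ---- */

int
module_connect(const char *conninfo)
{
    /* Stub: treat any non-empty connection string as success. */
    return conninfo != NULL && conninfo[0] != '\0';
}

int
module_receive(char *buf, size_t len)
{
    const char msg[] = "WAL";   /* stub payload */

    if (len < sizeof(msg))
        return -1;
    memcpy(buf, msg, sizeof(msg));
    return (int) (sizeof(msg) - 1);
}

/* Called once when the module is loaded; installs the hooks. */
void
module_init(void)
{
    rcv_connect_hook = module_connect;
    rcv_receive_hook = module_receive;
}

/* ---- stays in the backend, linked as usual ---- */

/* Connects and receives one message through the hooks only. */
int
backend_receive_once(const char *conninfo, char *buf, size_t len)
{
    assert(rcv_connect_hook != NULL && rcv_receive_hook != NULL);

    if (!rcv_connect_hook(conninfo))
        return -1;
    return rcv_receive_hook(buf, len);
}
```

With this shape the backend never references libpq symbols directly,
and the module only needs to see the two hook variables rather than a
large set of backend globals.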

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Magnus Hagander on 15 Jan 2010 14:15

2010/1/15 Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com>:
> Fujii Masao wrote:
>> On Wed, Jan 13, 2010 at 3:37 AM, Magnus Hagander <magnus(a)hagander.net> wrote:
>>>> This change which moves walreceiver process into a dynamically loaded
>>>> module caused the following compile error on my MinGW environment.
>>> That sounds strange - it should pick those up from the -lpostgres. Any
>>> chance you have an old postgres binary around from a non-syncrep build
>>> or something?
>>
>> No, there is no old postgres binary.
>>
>>> Do you have an environment to try to build it under msvc?
>>
>> No, unfortunately.
>>
>>> in my
>>> experience, that gives you easier-to-understand error messages in a
>>> lot of cases like this - it removes the mingw black magic.
>>
>> OK. I'll try to build it under msvc.
>>
>> But since there seems to be a long way to go before doing that,
>> I would appreciate if someone could give me some advice.
>
> It looks like dawn_bat is experiencing the same problem. I don't think
> we want to sprinkle all those variables with PGDLLIMPORT, and it didn't
> fix the problem for you earlier anyway. Is there some other way to fix this?
>
> Do people still use MinGW for any real work? Could we just drop
> walreceiver support from MinGW builds?

We don't know if this works on MSVC, because MSVC doesn't actually try
to build the walreceiver. I'm going to look at that tomorrow.

If we get the same issues there, we have a problem in our code. If not,
we need to figure out what's up with mingw.

> Or maybe we should consider splitting walreceiver into two parts after
> all. Only the bare minimum that needs to access libpq would go into the
> shared object, and the rest would be linked with the backend as usual.

That would certainly be one option.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
