From: Fujii Masao on
On Wed, Feb 17, 2010 at 6:00 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Wed, Feb 17, 2010 at 4:07 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander <magnus(a)hagander.net> wrote:
>>> In that case, O_DIRECT would be counterproductive, no? It maps to
>>> FILE_FLAG_NO_BUFFERING, which makes sure it doesn't go into the
>>> cache. So the read in the startup proc is actually guaranteed to
>>> require a physical read - of something we just wrote, so it'll almost
>>> certainly end up waiting for a rotation, no?
>>>
>>> Seems like getting rid of O_DIRECT here is the right thing to do,
>>> regardless of this.
>>
>> Agreed. I'll remove O_DIRECT from walreceiver.
>
> Here is the patch to do that.

Oops! I found a bug in the patch. Here is the updated version.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From: Heikki Linnakangas on
Fujii Masao wrote:
> On Wed, Feb 17, 2010 at 6:00 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> On Wed, Feb 17, 2010 at 4:07 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>>> On Wed, Feb 17, 2010 at 3:03 PM, Magnus Hagander <magnus(a)hagander.net> wrote:
>>>> In that case, O_DIRECT would be counterproductive, no? It maps to
>>>> FILE_FLAG_NO_BUFFERING, which makes sure it doesn't go into the
>>>> cache. So the read in the startup proc is actually guaranteed to
>>>> require a physical read - of something we just wrote, so it'll almost
>>>> certainly end up waiting for a rotation, no?
>>>>
>>>> Seems like getting rid of O_DIRECT here is the right thing to do,
>>>> regardless of this.
>>> Agreed. I'll remove O_DIRECT from walreceiver.
>> Here is the patch to do that.
>
> Oops! I found a bug in the patch. Here is the updated version.

If I'm reading the patch correctly, when wal_sync_method is 'open_sync',
walreceiver nevertheless opens the WAL file without the O_DIRECT flag.
When it later flushes it in XLogWalRcvFlush() by issue_xlog_fsync(),
issue_xlog_fsync() will do nothing because it assumes the write() synced
it already. So the data written isn't being forced to disk at all.

How about just forcing sync_method to 'fsync' in walreceiver?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> If I'm reading the patch correctly, when wal_sync_method is 'open_sync',
> walreceiver nevertheless opens the WAL file without the O_DIRECT flag.
> When it later flushes it in XLogWalRcvFlush() by issue_xlog_fsync(),
> issue_xlog_fsync() will do nothing because it assumes the write() synced
> it already. So the data written isn't being forced to disk at all.

When 'open_sync' is chosen, the WAL file is opened with the O_SYNC or O_FSYNC
flag. So I think that write() flushes the data to disk even if the O_DIRECT
flag is not given. Am I missing something?
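To illustrate the point, here is a minimal sketch in Python (not PostgreSQL code). The function names and the mapping from sync method to open flags are assumptions made for illustration only. It shows why no explicit flush is needed after write() when the file was opened with a sync flag, independent of O_DIRECT:

```python
import os

# Hypothetical helpers for illustration; they only mirror the idea discussed
# in this thread and are not PostgreSQL's actual implementation.

def open_flags_for(sync_method):
    """Return open(2) flags for a given sync method (O_DIRECT never added)."""
    flags = os.O_RDWR | os.O_CREAT
    if sync_method == "open_sync":
        flags |= getattr(os, "O_SYNC", 0)    # write() returns after data is synced
    elif sync_method == "open_datasync":
        flags |= getattr(os, "O_DSYNC", 0)   # same, for file data only
    return flags

def needs_explicit_flush(sync_method):
    """Only the non-open_* methods rely on a separate flush call."""
    return sync_method in ("fsync", "fdatasync")

def write_and_flush(path, data, sync_method):
    fd = os.open(path, open_flags_for(sync_method), 0o600)
    try:
        view = memoryview(data)
        while view:                          # handle short writes
            written = os.write(fd, view)
            view = view[written:]
        if needs_explicit_flush(sync_method):
            os.fsync(fd)                     # sketch: fsync used for both methods
    finally:
        os.close(fd)
```

With "open_sync" the flush step is skipped, which is safe here only because the sync flag was set when the file was opened.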

> How about just forcing sync_method to 'fsync' in walreceiver?

On win32, O_DSYNC seems to have been preferred to 'fsync' so far. So I'm not
sure whether reshuffling the priority is harmless.
http://archives.postgresql.org/pgsql-hackers-win32/2005-03/msg00148.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Heikki Linnakangas on
Fujii Masao wrote:
> On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas
> <heikki.linnakangas(a)enterprisedb.com> wrote:
>> If I'm reading the patch correctly, when wal_sync_method is 'open_sync',
>> walreceiver nevertheless opens the WAL file without the O_DIRECT flag.
>> When it later flushes it in XLogWalRcvFlush() by issue_xlog_fsync(),
>> issue_xlog_fsync() will do nothing because it assumes the write() synced
>> it already. So the data written isn't being forced to disk at all.
>
> When 'open_sync' is chosen, the WAL file is opened with the O_SYNC or O_FSYNC
> flag. So I think that write() flushes the data to disk even if the O_DIRECT
> flag is not given. Am I missing something?

Ah, ok, you're right.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Magnus Hagander on
2010/2/18 Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com>:
> Fujii Masao wrote:
>> On Thu, Feb 18, 2010 at 5:28 AM, Heikki Linnakangas
>> <heikki.linnakangas(a)enterprisedb.com> wrote:
>>> If I'm reading the patch correctly, when wal_sync_method is 'open_sync',
>>> walreceiver nevertheless opens the WAL file without the O_DIRECT flag.
>>> When it later flushes it in XLogWalRcvFlush() by issue_xlog_fsync(),
>>> issue_xlog_fsync() will do nothing because it assumes the write() synced
>>> it already. So the data written isn't being forced to disk at all.
>>
>> When 'open_sync' is chosen, the WAL file is opened with the O_SYNC or O_FSYNC
>> flag. So I think that write() flushes the data to disk even if the O_DIRECT
>> flag is not given. Am I missing something?
>
> Ah, ok, you're right.

Yes, I believe the difference is that with O_DIRECT it bypasses the
cache completely. Without it, we still sync it out, but it also goes
into the cache.

O_DIRECT helps us when we're not going to read the file again, because
we don't waste cache on it. If we are, which is the case here, it
should be really bad for performance, since we actually have to do a
physical read.
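As a rough sketch of that rule (Python, with a hypothetical function name and parameters; not how PostgreSQL actually decides this), O_DIRECT would be added only when the file is not expected to be read back soon:

```python
import os

def wal_open_flags(sync_flag, will_be_reread):
    """Hypothetical helper: choose open(2) flags for a WAL-like file.

    sync_flag      -- e.g. os.O_SYNC; the data is synced on write() either way
    will_be_reread -- True if another process reads the file shortly after
    """
    flags = os.O_RDWR | os.O_CREAT | sync_flag
    if not will_be_reread:
        # Bypass the OS cache only when caching would be wasted.
        # Falls back to 0 on platforms that do not define O_DIRECT.
        flags |= getattr(os, "O_DIRECT", 0)
    return flags
```

Note that this only selects the flags. Actually writing through a descriptor opened with O_DIRECT also requires suitably aligned buffers, which the sketch does not cover.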

Incidentally, that should also apply to general WAL when archive_mode
is on. Do we optimize for that?


--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/