no universally correct setting for fsync [PgSql]

Prev: [HACKERS] no universally correct setting for fsync
Next: [HACKERS] beta to release

From: Josh Berkus on 10 May 2010 16:35

All,

Updated docs based on tracking this discussion. fsync through full page
writes recorded below.

============

<varlistentry id="guc-fsync" xreflabel="fsync">
<indexterm>
<primary><varname>fsync</> configuration parameter</primary>
</indexterm>
<term><varname>fsync</varname> (<type>boolean</type>)</term>
<listitem>
<para>
If this parameter is on, the <productname>PostgreSQL</> server
will try to make sure that updates are physically written to
disk, by issuing <function>fsync()</> system calls or various
equivalent methods (see <xref linkend="guc-wal-sync-method">).
This ensures that the database cluster can recover to a
consistent state after an operating system or hardware crash.
</para>

<para>
While turning off <varname>fsync</varname> is often a performance
benefit, this can result in unrecoverable data corruption in the
event
of an unexpected system shutdown or crash. Thus it is only
advisable
to turn off <varname>fsync</varname> if you can easily recreate
your entire database from external data.
</para>

<para>
Examples of safe times to turn off <varname>fsync</varname> would be
when initially loading a new database from a backup file, on a
database which is
only used for processing statistics on an hourly basis and is then
deleted,
or on a reporting read-only clone of your database which gets
recreated very
night and is not used for failover. High quality hardware alone
is not a
sufficient justification for turning off <varname>fsync</varname>.
</para>

<para>
In many situations, turning off <xref
linkend="guc-synchronous-commit">
for noncritical transactions can provide much of the potential
performance benefit of turning off <varname>fsync</varname>, without
the attendant risks of data corruption.
</para>

<para>
<varname>fsync</varname> can only be set in the
<filename>postgresql.conf</>
file or on the server command line.
If you turn this parameter off, also consider turning off
<xref linkend="guc-full-page-writes">.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-synchronous-commit"
xreflabel="synchronous_commit">
<term><varname>synchronous_commit</varname>
(<type>boolean</type>)</term>
<indexterm>
<primary><varname>synchronous_commit</> configuration
parameter</primary>
</indexterm>
<listitem>
<para>
Specifies whether transaction commit will wait for WAL records
to be written to disk before the command returns a <quote>success</>
indication to the client. The default, and safe, setting is
<literal>on</>. When <literal>off</>, there can be a delay between
when success is reported to the client and when the transaction is
really guaranteed to be safe against a server crash. (The maximum
delay is three times <xref linkend="guc-wal-writer-delay">.) Unlike
<xref linkend="guc-fsync">, setting this parameter to
<literal>off</>
does not create any risk of database inconsistency: a crash might
result in some recent allegedly-committed transactions being
lost, but
the database state will be just the same as if those
transactions had
been aborted cleanly. So, turning
<varname>synchronous_commit</> off
can be a useful alternative when performance is more important than
exact certainty about the durability of a transaction. For more
discussion see <xref linkend="wal-async-commit">.
</para>
<para>
This parameter can be changed at any time; the behavior for any
one transaction is determined by the setting in effect when it
commits. It is therefore possible, and useful, to have some
transactions commit synchronously and others asynchronously.
For example, to make a single multi-statement transaction commit
asynchronously when the default is the opposite, issue <command>SET
LOCAL synchronous_commit TO OFF</> within the transaction.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-wal-sync-method" xreflabel="wal_sync_method">
<term><varname>wal_sync_method</varname> (<type>enum</type>)</term>
<indexterm>
<primary><varname>wal_sync_method</> configuration
parameter</primary>
</indexterm>
<listitem>
<para>
Method used for forcing WAL updates out to disk.
If <varname>fsync</varname> is off then this setting is irrelevant,
since WAL file updates will not be forced out at all.
Possible values are:
</para>
<itemizedlist>
<listitem>
<para>
<literal>open_datasync</> (write WAL files with
<function>open()</> option <symbol>O_DSYNC</>)
</para>
</listitem>
<listitem>
<para>
<literal>fdatasync</> (call <function>fdatasync()</> at each
commit)
</para>
</listitem>
<listitem>
<para>
<literal>fsync_writethrough</> (call <function>fsync()</> at
each commit, forcing write-through of any disk write cache)
</para>
</listitem>
<listitem>
<para>
<literal>fsync</> (call <function>fsync()</> at each commit)
</para>
</listitem>
<listitem>
<para>
<literal>open_sync</> (write WAL files with <function>open()</>
option <symbol>O_SYNC</>)
</para>
</listitem>
</itemizedlist>
<para>
Not all of these choices are available on all platforms.
The default is the first method in the above list that is supported
by the platform.
The <literal>open_</>* options also use <literal>O_DIRECT</> if
available.
The utility <filename>src/tools/fsync</> in the PostgreSQL
source tree
can do performance testing of various fsync methods.
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-full-page-writes" xreflabel="full_page_writes">
<indexterm>
<primary><varname>full_page_writes</> configuration
parameter</primary>
</indexterm>
<term><varname>full_page_writes</varname>
(<type>boolean</type>)</term>
<listitem>
<para>
When this parameter is on, the <productname>PostgreSQL</> server
writes the entire content of each disk page to WAL during the
first modification of that page after a checkpoint.
This is needed because
a page write that is in process during an operating system crash
might
be only partially completed, leading to an on-disk page
that contains a mix of old and new data. The row-level change data
normally stored in WAL will not be enough to completely restore
such a page during post-crash recovery. Storing the full page image
guarantees that the page can be correctly restored, but at the price
of increasing the amount of data that must be written to WAL.
(Because WAL replay always starts from a checkpoint, it is
sufficient
to do this during the first change of each page after a checkpoint.
Therefore, one way to reduce the cost of full-page writes is to
increase the checkpoint interval parameters.)
</para>

<para>
Turning this parameter off speeds normal operation, but
might lead to either unrecoverable data corruption, or silent
data corruption, after a system failure. The risks are similar
to turning off
<varname>fsync</varname>, though smaller, and it should be
turned off
only based on the same circumstances recommended for that parameter.
</para>

<para>
Turning off this parameter does not affect use of
WAL archiving for point-in-time recovery (PITR)
(see <xref linkend="continuous-archiving">).
</para>

<para>
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
The default is <literal>on</>.
</para>
</listitem>
</varlistentry>

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 10 May 2010 16:59

C�dric Villemain<cedric.villemain.debian(a)gmail.com> wrote:

> On a recent pg_restore -j 32, with perc 6i with BBU, RAID10 8 hd,
> results were not so bas with fsync turn on. (XFS with nobarrier su
> and sw)
> -- deactivate fsync
> time pg_restore -U postgres -d foodb -j 32 foo.psql
> real 170m0.527s
> user 43m12.914s
> sys 1m56.499s
> -- activate fsync
> time pg_restore -U postgres -d foodb -j 32 foo.psql
> real 177m0.121s
> user 42m54.581s
> sys 2m0.452s

Wow. In a situation where you save seven minutes (4%), it's hardly
worth turning off.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Josh Berkus on 10 May 2010 17:20

> Wow. In a situation where you save seven minutes (4%), it's hardly
> worth turning off.

I've had it be much higher, especially for really large databases.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Ross J. Reedstrom" on 10 May 2010 17:21

On Mon, May 10, 2010 at 01:35:32PM -0700, Josh Berkus wrote:
> deleted,
> or on a reporting read-only clone of your database which gets
> recreated very
> night and is not used for failover. High quality hardware alone

s/very/every/
or
s/very night/periodically/

Ross
--
Ross Reedstrom, Ph.D. reedstrm(a)rice.edu
Systems Engineer & Admin, Research Scientist phone: 713-348-6166
The Connexions Project http://cnx.org fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on 10 May 2010 19:03

Josh Berkus wrote:
>> Wow. In a situation where you save seven minutes (4%), it's hardly
>> worth turning off.
>>
>
> I've had it be much higher, especially for really large databases.
>

Cedric's system had a non-volatile write cache in it. In that case, a
few percentage points of improvement is normal--the overhead of fsync is
very low. In the case where you don't have one of those, and the write
cache on the drives are turned off for safety too, I've seen turning
fsync off be a 40X speedup--100 inserts/second jumping to 4000TPS.
(This was before synchronous_commit).

The real question is how much of a speed-up fsync provides compared to
the same workload with synchronous_commit disabled. The only case for
fsync=off is one where that number is much faster. That's the case on
some low-level operations (I seem to recall there is no async commit
speedup for CREATE DATABASE for example). But for most of what people
want to speed, just killing sync commit while keeping fsync is on is
good enough. I suspect there are still some bulk-load workloads where
fsync=off helps beyond just going for async commit, but they're tougher
to find and the difference isn't huge relative to total load times.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: [HACKERS] no universally correct setting for fsync
Next: [HACKERS] beta to release