no universally correct setting for fsync [PgSql]


From: Greg Stark on 10 May 2010 13:46

On Mon, May 10, 2010 at 4:55 PM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Robert Haas <robertmhaas(a)gmail.com> wrote:
>
>> "It might be safe" is a bit of a waffle. It would be nice if we
>> could provide some more clear guidance as to whether it is or is
>> not, or how someone could go about testing their hardware to find
>> out.
>
> I think that the issue is that you could have corruption if some,
> but not all, disk sectors from a page were written from OS cache to
> controller cache when a failure occurred. The window would be small
> for a RAM-to-RAM write, but it wouldn't be entirely *safe* unless
> there's some OS/driver environment where you could count on all the
> sectors making it or none of them making it for every single page.
> Does such an environment exist?

The reason for the waffle is that the sentence in question covers a
whole set of environments matching the following description:

> > if you have hardware (such as a battery-backed
> > disk controller) or file-system software that reduces the risk
> > of partial page writes to an acceptably low level

Depending on which hardware you have and how low the risk is, it might be safe.

I think with WAFL or ZFS it's entirely safe. There may be other
filesystems with similar guarantees. With a BBU the risk might be very
low -- but it might not; it would be hard to determine without a
detailed analysis of the entire stack from the buffer cache,
filesystem, LVM, hardware drivers, BBU design, etc.
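
To make the partial-write failure mode concrete, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: the 8192-byte
page matches PostgreSQL's default block size, but the 512-byte sector
size, the helper name torn_write and the fill bytes are assumptions
chosen purely for illustration.

```python
PAGE_SIZE = 8192    # PostgreSQL's default block size
SECTOR_SIZE = 512   # assumed sector size; many drives use 4096 instead


def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Return what is on disk if only the first `sectors_written` sectors
    of new_page were written before the failure (hypothetical helper)."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


old_page = bytes([0xAA]) * PAGE_SIZE   # page contents before the write
new_page = bytes([0xBB]) * PAGE_SIZE   # page contents the write intended

# The failure hits after 5 of the 16 sectors have been written.
on_disk = torn_write(old_page, new_page, sectors_written=5)

# The page is neither the old version nor the new one.
is_torn = on_disk != old_page and on_disk != new_page
print("torn page:", is_torn)
```

A page left in that mixed state is the case full_page_writes exists to
cover. Whether a given stack can ever produce such a page is exactly
the question that needs the analysis described above.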

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Joshua D. Drake" on 10 May 2010 14:42

On Mon, 2010-05-10 at 18:46 +0100, Greg Stark wrote:
> On Mon, May 10, 2010 at 4:55 PM, Kevin Grittner
> <Kevin.Grittner(a)wicourts.gov> wrote:
> > Robert Haas <robertmhaas(a)gmail.com> wrote:
> >
> >> "It might be safe" is a bit of a waffle. It would be nice if we
> >> could provide some more clear guidance as to whether it is or is
> >> not, or how someone could go about testing their hardware to find
> >> out.
> >
> > I think that the issue is that you could have corruption if some,
> > but not all, disk sectors from a page were written from OS cache to
> > controller cache when a failure occurred. The window would be small
> > for a RAM-to-RAM write, but it wouldn't be entirely *safe* unless
> > there's some OS/driver environment where you could count on all the
> > sectors making it or none of them making it for every single page.
> > Does such an environment exist?
>
> The reason for the waffle is that the sentence in question covers a
> whole set of environments matching the following description:
>
> > > if you have hardware (such as a battery-backed
> > > disk controller) or file-system software that reduces the risk
> > > of partial page writes to an acceptably low level
>
> Depending on which hardware you have and how low the risk is, it might be safe.
>
> I think with WAFL or ZFS it's entirely safe. There may be other
> filesystems with similar guarantees. With a BBU the risk might be very
> low -- but it might not; it would be hard to determine without a
> detailed analysis of the entire stack from the buffer cache,
> filesystem, LVM, hardware drivers, BBU design, etc.
>

The answer to this is:

PostgreSQL.org recommends that this setting be left on at all times.
Turning it off may lead to data corruption.

Anything else is circumstantial and based on knowledge and facts we
don't have about environmental factors.

Joshua D. Drake


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering

From: "Kevin Grittner" on 10 May 2010 15:00

"Joshua D. Drake" <jd(a)commandprompt.com> wrote:

> The answer to this is:
>
> PostgreSQL.org recommends that this setting be left on at all
> times. Turning it off may lead to data corruption.
>
> Anything else is circumstantial and based on knowledge and facts
> we don't have about environmental factors.

Perhaps Josh's language for fsync could be modified to work here
(we're now talking about full_page_writes, for anyone who's lost
track):

| it is only advisable to turn off fsync if you can easily recreate
| your entire database from external data.

That covers bulk loads to an empty or just-backed-up database and
entirely redundant databases. Saying it should never be turned off
would tend to make one wonder why we have the setting at all.

-Kevin

From: Tom Lane on 10 May 2010 15:57

"Kevin Grittner" <Kevin.Grittner(a)wicourts.gov> writes:
> Perhaps Josh's language for fsync could be modified to work here
> (we're now talking about full_page_writes, for anyone who's lost
> track):

> | it is only advisable to turn off fsync if you can easily recreate
> | your entire database from external data.

> That covers bulk loads to an empty or just-backed-up database and
> entirely redundant databases. Saying it should never be turned off
> would tend to make one wonder why we have the setting at all.

+1. Perhaps for both of them, we should specify that the intended
use-case is for improving performance during initial database load
and similar cases.

regards, tom lane

From: Cédric Villemain on 10 May 2010 16:22

2010/5/8 Bernd Helmle <mailings(a)oopsware.de>:
>
>
> --On 7. Mai 2010 09:48:53 -0500 Kevin Grittner <Kevin.Grittner(a)wicourts.gov>
> wrote:
>
>> I think it goes beyond "tweaking" -- I think we should have a bald
>> statement like "don't turn this off unless you're OK with losing the
>> entire contents of the database cluster." A brief listing of some
>> cases where that is OK might be illustrative.
>>
>
> +1
>
>> I never meant to suggest any statement in that section is factually
>> wrong; it's just all too rosy, leading people to believe it's no big
>> deal to turn it off.
>
> I think one mistake in this paragraph is the passing mention of
> "performance". I've seen installations in the past with fsync=off only
> because the admin was pressured to get instantly "more speed" out of the
> database (think of "fast_mode=on"). In my opinion, phrases like "performance
> penalty" are misleading, if you need that setting in 99% of all use cases
> for reliable operation.
>
> I've recently even started to wonder if the performance gain with fsync=off
> is still that large on modern hardware. While testing large migration
> procedures to a new version some time ago (on admittedly fast storage), I
> forgot now and then to turn it off, without a significant degradation in
> performance.

On a recent pg_restore -j 32, on a PERC 6i with BBU and RAID 10 over 8
hard disks, the results were not so bad with fsync turned on. (XFS with
nobarrier, su and sw)
-- deactivate fsync
time pg_restore -U postgres -d foodb -j 32 foo.psql
real 170m0.527s
user 43m12.914s
sys 1m56.499s
-- activate fsync
time pg_restore -U postgres -d foodb -j 32 foo.psql
real 177m0.121s
user 42m54.581s
sys 2m0.452s
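
For anyone who wants the relative cost rather than the raw timings, the
two "real" figures above can be compared with a short Python sketch.
The helper name parse_real is made up for illustration, and each figure
comes from a single run, so the result is only a rough indication.

```python
import re


def parse_real(value: str) -> float:
    """Convert a `time` duration such as '170m0.527s' into seconds
    (hypothetical helper, not part of any tool)."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


fsync_off = parse_real("170m0.527s")   # "real" time with fsync deactivated
fsync_on = parse_real("177m0.121s")    # "real" time with fsync activated

extra_seconds = fsync_on - fsync_off
overhead_pct = extra_seconds / fsync_off * 100
print(f"fsync on took {extra_seconds:.1f}s longer ({overhead_pct:.1f}%)")
```

That works out to about seven minutes, or roughly 4%, on a restore that
takes close to three hours on this particular hardware.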

>
>
> --
> Thanks
>
> Bernd
>

--
Cédric Villemain
