From: Roman Kononov on
Under some workloads, about once every 10 seconds, I get the following warnings
with 2.6.32.13 and 2.6.33.4 (x86_64). What causes them?

Thanks.

May 22 23:53:13 hrech kernel: WARNING: at /home/stuff/base/linux-2.6.32.13/fs/xfs/linux-2.6/xfs_lrw.c:714 xfs_write+0x8a2/0x8c0()
May 22 23:53:13 hrech kernel: Modules linked in: ib_mthca sata_nv 3w_9xxx
May 22 23:53:13 hrech kernel: Pid: 30650, comm: postmaster Not tainted 2.6.32.13 #2
May 22 23:53:13 hrech kernel: Call Trace:
May 22 23:53:13 hrech kernel: [<ffffffff8118baf2>] ? xfs_write+0x8a2/0x8c0
May 22 23:53:13 hrech kernel: [<ffffffff8118baf2>] ? xfs_write+0x8a2/0x8c0
May 22 23:53:13 hrech kernel: [<ffffffff8103a775>] ? warn_slowpath_common+0x85/0xb0
May 22 23:53:13 hrech kernel: [<ffffffff8118baf2>] ? xfs_write+0x8a2/0x8c0
May 22 23:53:13 hrech kernel: [<ffffffff811b7293>] ? cpumask_next_and+0x23/0x40
May 22 23:53:13 hrech kernel: [<ffffffff81036826>] ? select_task_rq_fair+0x326/0x6a0
May 22 23:53:13 hrech kernel: [<ffffffff810a7869>] ? do_sync_write+0xd9/0x120
May 22 23:53:13 hrech kernel: [<ffffffff8104ef20>] ? autoremove_wake_function+0x0/0x30
May 22 23:53:13 hrech kernel: [<ffffffff81037b1d>] ? wake_up_new_task+0x9d/0xc0
May 22 23:53:13 hrech kernel: [<ffffffff81039b02>] ? do_fork+0x102/0x330
May 22 23:53:13 hrech kernel: [<ffffffff810a8088>] ? vfs_write+0xc8/0x180
May 22 23:53:13 hrech kernel: [<ffffffff810a88a1>] ? sys_pwrite64+0x91/0xa0
May 22 23:53:13 hrech kernel: [<ffffffff8100bc6b>] ? system_call_fastpath+0x16/0x1b
May 22 23:53:13 hrech kernel: ---[ end trace 615b846a6bbdf833 ]---


May 22 09:06:25 hrech kernel: WARNING: at /home/stuff/base/linux-2.6.33.4/fs/xfs/linux-2.6/xfs_lrw.c:651 xfs_write+0x961/0x970()
May 22 09:06:25 hrech kernel: Modules linked in: dm_mod ib_mthca sata_nv 3w_9xxx
May 22 09:06:25 hrech kernel: Pid: 1937, comm: postmaster Not tainted 2.6.33.4 #2
May 22 09:06:25 hrech kernel: Call Trace:
May 22 09:06:25 hrech kernel: [<ffffffff81036bb3>] ? warn_slowpath_common+0x73/0xb0
May 22 09:06:25 hrech kernel: [<ffffffff8119cce1>] ? xfs_write+0x961/0x970
May 22 09:06:25 hrech kernel: [<ffffffff810b2b3f>] ? do_sync_write+0xbf/0x100
May 22 09:06:25 hrech kernel: [<ffffffff81030a52>] ? wake_up_new_task+0xc2/0xe0
May 22 09:06:25 hrech kernel: [<ffffffff81035f60>] ? do_fork+0xf0/0x380
May 22 09:06:25 hrech kernel: [<ffffffff810b3236>] ? vfs_write+0xb6/0x170
May 22 09:06:25 hrech kernel: [<ffffffff810b36a3>] ? sys_pwrite64+0x83/0xa0
May 22 09:06:25 hrech kernel: [<ffffffff81002ceb>] ? system_call_fastpath+0x16/0x1b
May 22 09:06:25 hrech kernel: ---[ end trace 62b123c1948e55fa ]---
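
To see which commands trip the warning and how often, log lines like the ones above can be tallied. This is only a rough sketch: the function name `summarize_warnings` and the two regular expressions are illustrative assumptions matched against the trace format shown here, not an existing tool.

```python
import re
from collections import Counter

# Patterns assumed from the traces in this thread.
WARN_RE = re.compile(r"WARNING: at (\S+):(\d+)")
PID_RE = re.compile(r"Pid: (\d+), comm: (\S+)")

def summarize_warnings(log_text):
    """Count warnings per (source file:line, command name)."""
    counts = Counter()
    location = None
    for line in log_text.splitlines():
        m = WARN_RE.search(line)
        if m:
            # Keep only the file name so differing build paths group together.
            location = "%s:%s" % (m.group(1).rsplit("/", 1)[-1], m.group(2))
            continue
        m = PID_RE.search(line)
        if m and location is not None:
            counts[(location, m.group(2))] += 1
            location = None
    return counts
```

Feeding it the saved kernel log gives one counter entry per warning site and command, which makes it easy to confirm that a single application is responsible.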
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Dave Chinner on
On Sun, May 23, 2010 at 12:20:23AM -0500, Roman Kononov wrote:
> Under some workloads, about once every 10 seconds, I get the following warnings
> with 2.6.32.13 and 2.6.33.4 (x86_64). What causes them?

You've got some workload that is mixing direct IO writes with some
form of buffered or mmap IO on the same file and they are racing.
Mixing different types of IO on the one inode is also known as A
Really Bad Idea because there is no guarantee of coherency between
them....

Can you find out which application is triggering this?
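
One way to look for the culprit from userspace is to check which processes hold the file open and whether any descriptor was opened with O_DIRECT. The sketch below assumes a kernel that exposes `/proc/<pid>/fdinfo/<fd>` with an octal `flags:` field. The helper names `parse_fdinfo_flags` and `openers_of` are made up for illustration, and the scan only sees processes the caller has permission to inspect.

```python
import os

# O_DIRECT's numeric value is architecture-specific; fall back to the
# x86_64 value if this Python build does not define it (an assumption).
O_DIRECT = getattr(os, "O_DIRECT", 0o40000)

def parse_fdinfo_flags(text):
    """Return the open flags from the text of a /proc/<pid>/fdinfo/<fd> file."""
    for line in text.splitlines():
        if line.startswith("flags:"):
            return int(line.split(":", 1)[1].strip(), 8)  # field is octal
    raise ValueError("no flags: line found")

def openers_of(path):
    """Yield (pid, fd, is_direct) for each visible process that has path open."""
    target = os.path.realpath(path)
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = "/proc/%s/fd" % pid
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) != target:
                    continue
                with open("/proc/%s/fdinfo/%s" % (pid, fd)) as f:
                    flags = parse_fdinfo_flags(f.read())
            except (OSError, ValueError):
                continue
            yield int(pid), int(fd), bool(flags & O_DIRECT)
```

If the same file shows up with both `is_direct` true and false, possibly in different processes, that is the mixed IO pattern described above. It does not catch mmap users, which would need `/proc/<pid>/maps` instead.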

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
From: Roman Kononov on
On 2010-05-23, 20:18:56 +1000, Dave Chinner <david(a)fromorbit.com> wrote:
> You've got some workload that is mixing direct IO writes with some
> form of buffered or mmap IO on the same file and they are racing.
> Mixing different types of IO on the one inode is also known as A
> Really Bad Idea because there is no guarantee of coherency between
> them....
>
> Can you find out which application is triggering this?

This is a heavily modified PostgreSQL, which does mix direct IO with
buffered IO.

You say "they are racing". Do you mean that this can cause file system
corruption? Does it simply warn that direct user data races with
buffered user data and one of them wins? This warning "taints" the
kernel. Should it be safe to do different types of IOs on different
non-overlapping 4-KiB-aligned regions of the same file (I am unsure
if this is what the application really does)?

Thanks,

Roman
From: Dave Chinner on
On Sun, May 23, 2010 at 09:23:44AM -0500, Roman Kononov wrote:
> On 2010-05-23, 20:18:56 +1000, Dave Chinner <david(a)fromorbit.com> wrote:
> > You've got some workload that is mixing direct IO writes with some
> > form of buffered or mmap IO on the same file and they are racing.
> > Mixing different types of IO on the one inode is also known as A
> > Really Bad Idea because there is no guarantee of coherency between
> > them....
> >
> > Can you find out which application is triggering this?
>
> This is a heavily modified PostgreSQL, which does mix direct IO with
> buffered IO.

I hope you keep plenty of backups, then...

> You say "they are racing". Do you mean that this can cause file system
> corruption?

.... because it's not filesystem corruption you need to be worried
about, it's *silent data corruption* that these races can cause.

> Does it simply warn that direct user data races with
> buffered user data and one of them wins?

Yes, that's right. No guarantee of who wins is given, though.

> This warning "taints" the kernel.

Yup, the application is doing something dangerous, and this warning
is there to let us know that the data corruption is the user's
fault, not the filesystem...

> Should it be safe to do different types of IOs on different
> non-overlapping 4-KiB-aligned regions of the same file (I am unsure
> if this is what the application really does)?

Yes, it should be safe, but the kernel code can't know whether this
is true or not - there are no specific interlocks with direct IO to
prevent concurrent buffered IO to the same region while a direct IO
is in progress. XFS makes best-effort attempts to maintain coherency
but does not provide any guarantees, hence the warning when known race
conditions are tripped.
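
Since the kernel cannot verify this, the check has to live in the application. A minimal sketch of such a check follows, assuming the 4 KiB granularity from the question. `page_span` and `share_a_page` are hypothetical helper names, and the check says nothing about ordering or locking between the two writers.

```python
PAGE = 4096  # assumed granularity, taken from the 4 KiB in the question

def page_span(offset, length, page=PAGE):
    """Return (first_page, last_page) indices touched by [offset, offset+length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return offset // page, (offset + length - 1) // page

def share_a_page(direct, buffered, page=PAGE):
    """True if two (offset, length) ranges touch at least one common page."""
    d_first, d_last = page_span(*direct, page=page)
    b_first, b_last = page_span(*buffered, page=page)
    return d_first <= b_last and b_first <= d_last
```

Note that two ranges with no bytes in common can still share a page, for example `(100, 10)` and `(200, 10)`, which is why the comparison is done on page indices rather than byte offsets.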

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
From: Ilia Mirkin on
Sorry to pick up an old-ish thread, but I have a similar situation:

On Sun, May 23, 2010 at 9:19 PM, Dave Chinner <david(a)fromorbit.com> wrote:
> On Sun, May 23, 2010 at 09:23:44AM -0500, Roman Kononov wrote:
>> On 2010-05-23, 20:18:56 +1000, Dave Chinner <david(a)fromorbit.com> wrote:
>> > Can you find out which application is triggering this?

I noticed this happening with mysql and xtrabackup -- the latter opens
up mysql's files while mysql is still running (and modifying its own
files) and backs them up in a (hopefully) safe way. mysql had been
running on the machine without any such warnings for a while before we
ran the backup, so I'm pretty sure that the backup is involved,
although its process is never listed. Specifically the warning is:

[2584257.839386] ------------[ cut here ]------------
[2584257.839395] WARNING: at fs/xfs/linux-2.6/xfs_lrw.c:651 xfs_write+0x3dc/0x784()
[2584257.839398] Hardware name: PowerEdge R710
[2584257.839399] Modules linked in: nfsd cifs iTCO_wdt iTCO_vendor_support
[2584257.839406] Pid: 7761, comm: mysqld Not tainted 2.6.33-gentoo-r2 #1
[2584257.839407] Call Trace:
[2584257.839411] [<ffffffff8120da46>] ? xfs_write+0x3dc/0x784
[2584257.839415] [<ffffffff81038733>] warn_slowpath_common+0x77/0xa4
[2584257.839417] [<ffffffff8103876f>] warn_slowpath_null+0xf/0x11
[2584257.839419] [<ffffffff8120da46>] xfs_write+0x3dc/0x784
[2584257.839424] [<ffffffff810033ce>] ? apic_timer_interrupt+0xe/0x20
[2584257.839427] [<ffffffff8120a51a>] xfs_file_aio_write+0x5a/0x5c
[2584257.839430] [<ffffffff810d7cbe>] do_sync_write+0xc0/0x106
[2584257.839435] [<ffffffff810ff862>] ? __fsnotify_parent+0xc7/0xd3
[2584257.839437] [<ffffffff810d8624>] vfs_write+0xab/0x105
[2584257.839439] [<ffffffff810d86da>] sys_pwrite64+0x5c/0x7d
[2584257.839442] [<ffffffff81002a6b>] system_call_fastpath+0x16/0x1b
[2584257.839444] ---[ end trace 8b0c2a6e5e86745f ]---

> Yes, it should be safe, but the kernel code can't know whether this
> is true or not - there are no specific interlocks with direct IO to
> prevent concurrent buffered IO to the same region while a direct IO
> is in progress. XFS makes best-effort attempts to maintain coherency
> but does not provide any guarantees, hence the warning when known race
> conditions are tripped.

Would it be safe to remove the warning at
fs/xfs/linux-2.6/xfs_lrw.c:651 (which looks like it has moved to
xfs_file.c in 2.6.34)? It seems undesirable to get a long stream of
these (51 in this particular instance) every time we run a backup...
IOW, is the warning purely something along the lines of "Userspace is
doing something wonky, but the underlying FS will still be fine no
matter what" kind of deal, or could there be an actual problem with
the XFS metadata itself?

Thanks for any advice,

Ilia Mirkin
imirkin(a)alum.mit.edu