Prev: staging: hv: Fix race condition on IC channel initialization (modified)
Next: [PATCH] power_end event (Resend)
From: Artem Bityutskiy on 29 May 2010 04:10 On Fri, 2010-05-28 at 13:29 -0700, Andrew Morton wrote: > void mark_sb_dirty(struct super_block *sb) > { > sb->s_dirty = 1; > > if (!supers_timer_armed) { > spin_lock(&supers_timer_lock); > if (!supers_timer_armed) { > bdi_arm_supers_timer(); > supers_timer_armed = 1; > } > } else if (supers_timer_armed == -1) > spin_lock(&supers_timer_lock); > if (supers_timer_armed == -1) > supers_timer_armed = 1; > spin_unlock(&supers_timer_lock); > } > } > > I didn't try very hard there, but you get the idea: examine the state > before taking that expensive global spinlock, so we only end up taking > the lock once per five seconds, rather than once per possible > superblock dirtying. That's like a six-orders-of-magnitude reduction > in locking frequency, which is worth putting some effort into. Andrew, thanks for review! I just did not consider spinlock to be expensive because I thought that marking superblock as dirty is a relatively rare operation. And my small experiments kind of confirmed that. But Nick suggested a good locking scheme which uses only smp_mb() in this thread, which I am going to stick with. -- Best Regards, Artem Bityutskiy (Артём Битюцкий) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Artem Bityutskiy on 31 May 2010 04:30 On Fri, 2010-05-28 at 01:44 +1000, Nick Piggin wrote: > > if (supers_dirty) > > bdi_arm_supers_timer(); > > set_current_state(TASK_INTERRUPTIBLE); > > schedule(); > But we cannot do the above, because again the timer might go off > before we set current state. We'd lose the wakeup and never wake > up again. > > Putting it inside set_current_state() should be OK. I suppose. Hmm, but it looks like we cannot do that either. If we do set_current_state(TASK_INTERRUPTIBLE); if (supers_dirty) bdi_arm_supers_timer(); schedule(); and the kernel is preemptive, is it possible that we get preempted before we run 'bdi_arm_supers_timer()', but after we do 'set_current_state(TASK_INTERRUPTIBLE)'. And we will never wake up if the timer armed in mark_sb_dirty() went off. So it looks like this is the way to go: /* * Disable preemption for a while to make sure we are not * preempted before the timer is armed. */ preempt_disable(); set_current_state(TASK_INTERRUPTIBLE); if (supers_dirty) bdi_arm_supers_timer(); preempt_enable(); schedule(); -- Best Regards, Artem Bityutskiy (Артём Битюцкий) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Nick Piggin on 31 May 2010 04:40 On Mon, May 31, 2010 at 11:25:52AM +0300, Artem Bityutskiy wrote: > On Fri, 2010-05-28 at 01:44 +1000, Nick Piggin wrote: > > > if (supers_dirty) > > > bdi_arm_supers_timer(); > > > set_current_state(TASK_INTERRUPTIBLE); > > > schedule(); > > > But we cannot do the above, because again the timer might go off > > before we set current state. We'd lose the wakeup and never wake > > up again. > > > > Putting it inside set_current_state() should be OK. I suppose. > > Hmm, but it looks like we cannot do that either. If we do > > set_current_state(TASK_INTERRUPTIBLE); > if (supers_dirty) > bdi_arm_supers_timer(); > schedule(); > > and the kernel is preemptive, is it possible that we get preempted > before we run 'bdi_arm_supers_timer()', but after we do > 'set_current_state(TASK_INTERRUPTIBLE)'. And we will never wake up if > the timer armed in mark_sb_dirty() went off. > > So it looks like this is the way to go: > > /* > * Disable preemption for a while to make sure we are not > * preempted before the timer is armed. > */ > preempt_disable(); > set_current_state(TASK_INTERRUPTIBLE); > if (supers_dirty) > bdi_arm_supers_timer(); > preempt_enable(); > schedule(); This should not be required because preempt is transparent to these task sleep/schedule APIs. The preempt event will not clear TASK_INTERRUPTIBLE, and so the timer wakeup will set it to TASK_RUNNING (whether or not it has called schedule() yet and whether or not it is currently preempted). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Artem Bityutskiy on 31 May 2010 10:10 On Thu, 2010-05-27 at 07:50 +0100, Al Viro wrote: > Note, e.g., that in your series you've touched udf; it can set s_dirt > until the cows come home, but without ->write_super() it'll be ignored > by everything in VFS and fs/udf itself never looks at the damn thing. > > A look around it shows fs/sysv, where we never clean the damn flag anymore > for r/w mounts. Yes, really (got broken a year ago, nobody noticed). > > Or, e.g., BFS - there we have ->write_super() mark the buffer_head that > contains on-disk sb dirty, and the only place that sets ->s_dirt is doing > that immediately after having marked the same bh dirty itself. Interesting > place, at that - bfs_fill_super() at r/w mount time... Note that ->sync_fs() > there does *not* wait for anything, which is not the right thing to do. > > IOW, this thing is a good topic for code review; I suspect that quite a few > users might be gone as the result. Al, you requested me to review s_dirt usage, well, I'm trying now. One thing I do not understand is s_dirt serialization, which seems to be just absent in some FSes. I checked affs and ext2. E.g., affs does: affs_alloc_block() { mark_buffer_dirty(bh); sb->s_dirt = 1; } vs affs_write_super() { affs_commit_super(); /* YYY: what if sb is marked as dirty right here? */ sb->s_dirt = 0; } vs /* This wakes up periodically */ sync_super() { if (sb->s_root && sb->s_dirt) sb->s_op->write_super(sb); } ext2 seems to be doing something similar. It seems to me that FSes should serialize s_dirt changes somehow, but they don't? Why this is not a problem? -- Best Regards, Artem Bityutskiy (Артём Битюцкий) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Al Viro on 4 Jun 2010 00:30 On Mon, May 31, 2010 at 05:07:00PM +0300, Artem Bityutskiy wrote: > you requested me to review s_dirt usage, well, I'm trying now. One thing > I do not understand is s_dirt serialization, which seems to be just > absent in some FSes. I checked affs and ext2. E.g., affs does: > > affs_alloc_block() > { > mark_buffer_dirty(bh); > sb->s_dirt = 1; > } > > vs > > affs_write_super() > { > affs_commit_super(); > /* YYY: what if sb is marked as dirty right here? */ > sb->s_dirt = 0; > } > > vs > > /* This wakes up periodically */ > sync_super() > { > if (sb->s_root && sb->s_dirt) > sb->s_op->write_super(sb); > } > > ext2 seems to be doing something similar. It seems to me that FSes > should serialize s_dirt changes somehow, but they don't? Why this is not > a problem? I suspect that most of those used to rely on lock_super() way back. In case of ext2_sync_super() we probably want just to move ->s_dirt = 0 into the very beginning; no serialization is really needed beyond (_maybe_) some barriers. No idea about affs, needs to be checked. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: staging: hv: Fix race condition on IC channel initialization (modified) Next: [PATCH] power_end event (Resend) |