From: Christoph Hellwig on 29 Jun 2010 09:10

This should actually be on its way to Linus for .35, shouldn't it?

On Thu, Jun 24, 2010 at 01:02:14PM +1000, npiggin(a)suse.de wrote:
> list_for_each_entry_safe is not suitable to protect against concurrent
> modification of the list. 6754af6 introduced a race in sb walking.
>
> list_for_each_entry can use the trick of pinning the current entry in
> the list before we drop and retake the lock because it subsequently
> follows cur->next. However list_for_each_entry_safe saves n=cur->next
> for following before entering the loop body, so when the lock is
> dropped, n may be deleted.
>
> Signed-off-by: Nick Piggin <npiggin(a)suse.de>
> ---
>  fs/dcache.c          |    2 ++
>  fs/super.c           |    6 ++++++
>  include/linux/list.h |   15 +++++++++++++++
>  3 files changed, 23 insertions(+)
>
> Index: linux-2.6/fs/dcache.c
> ===================================================================
> --- linux-2.6.orig/fs/dcache.c
> +++ linux-2.6/fs/dcache.c
> @@ -590,6 +590,8 @@ static void prune_dcache(int count)
>  				up_read(&sb->s_umount);
>  		}
>  		spin_lock(&sb_lock);
> +		/* lock was dropped, must reset next */
> +		list_safe_reset_next(sb, n, s_list);
>  		count -= pruned;
>  		__put_super(sb);
>  		/* more work left to do? */
> Index: linux-2.6/fs/super.c
> ===================================================================
> --- linux-2.6.orig/fs/super.c
> +++ linux-2.6/fs/super.c
> @@ -374,6 +374,8 @@ void sync_supers(void)
>  			up_read(&sb->s_umount);
>
>  			spin_lock(&sb_lock);
> +			/* lock was dropped, must reset next */
> +			list_safe_reset_next(sb, n, s_list);
>  			__put_super(sb);
>  		}
>  	}
> @@ -405,6 +407,8 @@ void iterate_supers(void (*f)(struct sup
>  		up_read(&sb->s_umount);
>
>  		spin_lock(&sb_lock);
> +		/* lock was dropped, must reset next */
> +		list_safe_reset_next(sb, n, s_list);
>  		__put_super(sb);
>  	}
>  	spin_unlock(&sb_lock);
> @@ -585,6 +589,8 @@ static void do_emergency_remount(struct
>  		}
>  		up_write(&sb->s_umount);
>  		spin_lock(&sb_lock);
> +		/* lock was dropped, must reset next */
> +		list_safe_reset_next(sb, n, s_list);
>  		__put_super(sb);
>  	}
>  	spin_unlock(&sb_lock);
> Index: linux-2.6/include/linux/list.h
> ===================================================================
> --- linux-2.6.orig/include/linux/list.h
> +++ linux-2.6/include/linux/list.h
> @@ -544,6 +544,21 @@ static inline void list_splice_tail_init
>  	     &pos->member != (head);					\
>  	     pos = n, n = list_entry(n->member.prev, typeof(*n), member))
>
> +/**
> + * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
> + * @pos:	the loop cursor used in the list_for_each_entry_safe loop
> + * @n:		temporary storage used in list_for_each_entry_safe
> + * @member:	the name of the list_struct within the struct.
> + *
> + * list_safe_reset_next is not safe to use in general if the list may be
> + * modified concurrently (eg. the lock is dropped in the loop body). An
> + * exception to this is if the cursor element (pos) is pinned in the list,
> + * and list_safe_reset_next is called after re-taking the lock and before
> + * completing the current iteration of the loop body.
> + */
> +#define list_safe_reset_next(pos, n, member)				\
> +	n = list_entry(pos->member.next, typeof(*pos), member)
> +
>  /*
>   * Double linked lists with a single pointer list head.
>   * Mostly useful for hash tables where the two pointer list head is
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
---end quoted text---
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Nick Piggin on 29 Jun 2010 11:00

On Tue, Jun 29, 2010 at 09:02:14AM -0400, Christoph Hellwig wrote:
> This should actually be on its way to Linus for .35, shouldn't it?

Yeah, I was waiting for Al to reappear, but I think this is
probably the nicest way to solve the problem. Linus?

--
fs: fix superblock iteration race

list_for_each_entry_safe is not suitable to protect against concurrent
modification of the list. 6754af6 introduced a race in sb walking.

list_for_each_entry can use the trick of pinning the current entry in
the list before we drop and retake the lock because it subsequently
follows cur->next. However list_for_each_entry_safe saves n=cur->next
for following before entering the loop body, so when the lock is
dropped, n may be deleted.

Signed-off-by: Nick Piggin <npiggin(a)suse.de>
---
 fs/dcache.c          |    2 ++
 fs/super.c           |    6 ++++++
 include/linux/list.h |   15 +++++++++++++++
 3 files changed, 23 insertions(+)

Index: linux-2.6/fs/dcache.c
===================================================================
--- linux-2.6.orig/fs/dcache.c
+++ linux-2.6/fs/dcache.c
@@ -590,6 +590,8 @@ static void prune_dcache(int count)
 				up_read(&sb->s_umount);
 		}
 		spin_lock(&sb_lock);
+		/* lock was dropped, must reset next */
+		list_safe_reset_next(sb, n, s_list);
 		count -= pruned;
 		__put_super(sb);
 		/* more work left to do? */
Index: linux-2.6/fs/super.c
===================================================================
--- linux-2.6.orig/fs/super.c
+++ linux-2.6/fs/super.c
@@ -374,6 +374,8 @@ void sync_supers(void)
 			up_read(&sb->s_umount);

 			spin_lock(&sb_lock);
+			/* lock was dropped, must reset next */
+			list_safe_reset_next(sb, n, s_list);
 			__put_super(sb);
 		}
 	}
@@ -405,6 +407,8 @@ void iterate_supers(void (*f)(struct sup
 		up_read(&sb->s_umount);

 		spin_lock(&sb_lock);
+		/* lock was dropped, must reset next */
+		list_safe_reset_next(sb, n, s_list);
 		__put_super(sb);
 	}
 	spin_unlock(&sb_lock);
@@ -585,6 +589,8 @@ static void do_emergency_remount(struct
 		}
 		up_write(&sb->s_umount);
 		spin_lock(&sb_lock);
+		/* lock was dropped, must reset next */
+		list_safe_reset_next(sb, n, s_list);
 		__put_super(sb);
 	}
 	spin_unlock(&sb_lock);
Index: linux-2.6/include/linux/list.h
===================================================================
--- linux-2.6.orig/include/linux/list.h
+++ linux-2.6/include/linux/list.h
@@ -544,6 +544,21 @@ static inline void list_splice_tail_init
 	     &pos->member != (head);					\
 	     pos = n, n = list_entry(n->member.prev, typeof(*n), member))

+/**
+ * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
+ * @pos:	the loop cursor used in the list_for_each_entry_safe loop
+ * @n:		temporary storage used in list_for_each_entry_safe
+ * @member:	the name of the list_struct within the struct.
+ *
+ * list_safe_reset_next is not safe to use in general if the list may be
+ * modified concurrently (eg. the lock is dropped in the loop body). An
+ * exception to this is if the cursor element (pos) is pinned in the list,
+ * and list_safe_reset_next is called after re-taking the lock and before
+ * completing the current iteration of the loop body.
+ */
+#define list_safe_reset_next(pos, n, member)				\
+	n = list_entry(pos->member.next, typeof(*pos), member)
+
 /*
  * Double linked lists with a single pointer list head.
  * Mostly useful for hash tables where the two pointer list head is

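For readers following the changelog above, here is a minimal userspace sketch of the race and of what list_safe_reset_next changes. It is not kernel code: the list helpers are simplified stand-ins for <linux/list.h>, and struct fake_sb, list_unlink() and walk_supers() are hypothetical names made up for this illustration. The "concurrent" removal is simulated inline at the point where the real code drops and retakes sb_lock, and the removed entry keeps its old pointers (the kernel's list_del() would poison them, and freed memory could hold anything, which is worse).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry_safe(pos, n, head, member)			\
	for (pos = list_entry((head)->next, __typeof__(*pos), member),	\
	     n = list_entry(pos->member.next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

/* Same body as the macro added by the patch (typeof spelled portably). */
#define list_safe_reset_next(pos, n, member) \
	n = list_entry(pos->member.next, __typeof__(*pos), member)

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Unlink only; the entry's own next/prev are left untouched here. */
static void list_unlink(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
}

/* Hypothetical stand-in for struct super_block. */
struct fake_sb {
	int removed;
	struct list_head s_list;
};

/*
 * Walk three entries. During the first iteration the second entry is
 * unlinked, as a concurrent umount could do while sb_lock is dropped.
 * Returns how many already-removed entries the walk went on to visit.
 */
int walk_supers(int reset_next)
{
	struct list_head head = { &head, &head };
	struct fake_sb sbs[3] = { { 0 } };
	struct fake_sb *sb, *n;
	int i, stale_visits = 0;

	for (i = 0; i < 3; i++)
		list_add_tail(&sbs[i].s_list, &head);

	list_for_each_entry_safe(sb, n, &head, s_list) {
		if (sb->removed) {
			stale_visits++;	/* followed a stale saved "next" */
			continue;
		}
		if (sb == &sbs[0]) {
			/* spin_unlock(&sb_lock) would happen here */
			sbs[1].removed = 1;
			list_unlink(&sbs[1].s_list);
			/* spin_lock(&sb_lock) would happen here */
			assert(!sb->removed);	/* cursor itself stayed pinned */
			if (reset_next)
				list_safe_reset_next(sb, n, s_list);
		}
	}
	return stale_visits;
}
```

With reset_next set to 0 the walk steps onto the entry that was removed while the lock was notionally dropped; with reset_next set to 1 the saved next pointer is recomputed from the pinned cursor and the removed entry is skipped.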
From: Linus Torvalds on 29 Jun 2010 13:40

On Tue, Jun 29, 2010 at 7:56 AM, Nick Piggin <npiggin(a)suse.de> wrote:
> On Tue, Jun 29, 2010 at 09:02:14AM -0400, Christoph Hellwig wrote:
>> This should actually be on its way to Linus for .35, shouldn't it?
>
> Yeah, I was waiting for Al to reappear, but I think this is
> probably the nicest way to solve the problem. Linus?

I'll apply it. We have a couple of oopses listed for the superblock
iterator, and I haven't heard from Al. And the patch looks obviously
fine, whether it's actually the cause of some of the bugs or not.

            Linus

From: Nick Piggin on 29 Jun 2010 13:50

On Tue, Jun 29, 2010 at 10:35:47AM -0700, Linus Torvalds wrote:
> On Tue, Jun 29, 2010 at 7:56 AM, Nick Piggin <npiggin(a)suse.de> wrote:
> > On Tue, Jun 29, 2010 at 09:02:14AM -0400, Christoph Hellwig wrote:
> >> This should actually be on its way to Linus for .35, shouldn't it?
> >
> > Yeah, I was waiting for Al to reappear, but I think this is
> > probably the nicest way to solve the problem. Linus?
>
> I'll apply it. We have a couple of oopses listed for the superblock
> iterator, and I haven't heard from Al. And the patch looks obviously
> fine, whether it's actually the cause of some of the bugs or not.

OK. I only have managed to get it into an infinite loop but I think
it would be surely possible to oops it because the next pointer can
be uninitialised memory at that point.

From: Linus Torvalds on 29 Jun 2010 14:00

On Tue, Jun 29, 2010 at 10:41 AM, Nick Piggin <npiggin(a)suse.de> wrote:
> On Tue, Jun 29, 2010 at 10:35:47AM -0700, Linus Torvalds wrote:
>>
>> I'll apply it. We have a couple of oopses listed for the superblock
>> iterator, and I haven't heard from Al. And the patch looks obviously
>> fine, whether it's actually the cause of some of the bugs or not.
>
> OK. I only have managed to get it into an infinite loop but I think
> it would be surely possible to oops it because the next pointer can
> be uninitialised memory at that point.

Look for "2.6.35-rc3 oops trying to suspend" on lkml, for example. No
guarantee that it's the same thing, but it's "iterate_supers()" getting
an oops when it does "down_read(&sb->s_umount)". Which really looks
suspiciously like "sb" just being totally bogus, most likely because of
this same issue.

So I dunno, but I asked Al to look at it, and haven't heard back.
Regardless, I think your patch is the right thing to do (modulo any
syntactic issues - and I think your final version was the best of the
lot).

            Linus