From: john stultz on
On Mon, 2010-07-12 at 18:10 -0700, Fernando Lopez-Lezcano wrote:
> On Mon, 2010-07-12 at 16:53 -0700, john stultz wrote:
> >
> > Hrm. Ok.. I think the line 2100 above gives us a hint: (aparent == anon)
> > So if that were the case, we would have already locked aparent and that
> > would explain the blowup.
> >
> > How does it do with the following change?
>
> Ok, you are on to something. The machine did not crash hard!
> But the serial console printed this:

Sigh. It's never easy, is it? :)


> --------
> BUG: unable to handle kernel NULL pointer dereference at 0000008c
> IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e
....
> Pid: 2855, comm: nautilus Not tainted
> 2.6.33.6-147.rt23.3.fc12.ccrma.i686.rt #3 P5K/EPU/P5K/EPU
> EIP: 0060:[<c045e50a>] EFLAGS: 00210246 CPU: 0
> EIP is at rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> EAX: 00000078 EBX: ef45393c ECX: 00000000 EDX: 00000078
> ESI: ef716edc EDI: 00000000 EBP: f1977c8c ESP: f1977c88
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 preempt:00000000
> Process nautilus (pid: 2855, ti=f1976000 task=f2347130 task.ti=f1976000)
> Stack:
> ef45393c f1977c94 c0781206 f1977cc8 c04d842e 00000000 ef703e54 faadd5bc
> <0> f1977cdc 126bc87a ef703ddc 00000000 ef45393c ef716edc ef6f5494
> faafdc6c
> <0> f1977df8 faad9041 c3604b5c faafdc6c ef452bfc 00007e7f f5eb41e8
> 00000007
> Call Trace:
> [<c0781206>] ? rt_spin_unlock+0x8/0xa
> [<c04d842e>] ? d_materialise_unique+0x210/0x2aa

Can you gdb list *0xc04d842e ?

Thanks again for all the testing here! It's really appreciated!
-john

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Fernando Lopez-Lezcano on
On Mon, 2010-07-12 at 18:40 -0700, john stultz wrote:
> On Mon, 2010-07-12 at 18:10 -0700, Fernando Lopez-Lezcano wrote:
> > On Mon, 2010-07-12 at 16:53 -0700, john stultz wrote:
> > >
> > > Hrm. Ok.. I think the line 2100 above gives us a hint: (aparent == anon)
> > > So if that were the case, we would have already locked aparent and that
> > > would explain the blowup.
> > >
> > > How does it do with the following change?
> >
> > Ok, you are on to something. The machine did not crash hard!
> > But the serial console printed this:
>
> Sigh. It's never easy, is it? :)

Hardly ever .... :-)
I have _read_ about stories of stuff being solved on the first try, ha.

> > --------
> > BUG: unable to handle kernel NULL pointer dereference at 0000008c
> > IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> ...
> > Pid: 2855, comm: nautilus Not tainted
> > 2.6.33.6-147.rt23.3.fc12.ccrma.i686.rt #3 P5K/EPU/P5K/EPU
> > EIP: 0060:[<c045e50a>] EFLAGS: 00210246 CPU: 0
> > EIP is at rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > EAX: 00000078 EBX: ef45393c ECX: 00000000 EDX: 00000078
> > ESI: ef716edc EDI: 00000000 EBP: f1977c8c ESP: f1977c88
> > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 preempt:00000000
> > Process nautilus (pid: 2855, ti=f1976000 task=f2347130 task.ti=f1976000)
> > Stack:
> > ef45393c f1977c94 c0781206 f1977cc8 c04d842e 00000000 ef703e54 faadd5bc
> > <0> f1977cdc 126bc87a ef703ddc 00000000 ef45393c ef716edc ef6f5494
> > faafdc6c
> > <0> f1977df8 faad9041 c3604b5c faafdc6c ef452bfc 00007e7f f5eb41e8
> > 00000007
> > Call Trace:
> > [<c0781206>] ? rt_spin_unlock+0x8/0xa
> > [<c04d842e>] ? d_materialise_unique+0x210/0x2aa
>
> Can you gdb list *0xc04d842e ?

(gdb) list *0xc04d842e
0xc04d842e is in d_materialise_unique (fs/dcache.c:2073).
2068 out_unalias:
2069 d_move_locked(alias, dentry);
2070 ret = alias;
2071 out_err:
2072 spin_unlock(&inode->i_lock);
2073 if (m2)
2074 mutex_unlock(m2);
2075 if (m1)
2076 mutex_unlock(m1);
2077 return ret;
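
A note on reading the trace above: each entry has the form
[<address>] symbol+0xoffset/0xsize, so the start of the function is the
address minus the offset. The Python sketch below decodes such entries; the
helper name decode_trace_entry and the regular expression are illustrative
assumptions, not part of any kernel tool. Entries marked with "?" were found
on the stack and are not guaranteed to be real callers.

```python
import re

# Matches oops entries such as
#   "[<c04d842e>] ? d_materialise_unique+0x210/0x2aa"
# The "? " marker is optional; addresses and offsets are hexadecimal.
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def decode_trace_entry(line):
    """Return symbol name, address, and function bounds, or None if no match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    start = addr - int(m.group("off"), 16)   # first byte of the function
    end = start + int(m.group("size"), 16)   # one past the last byte
    return {"symbol": m.group("sym"), "address": addr,
            "start": start, "end": end}


if __name__ == "__main__":
    # Lines copied from the oops in this thread.
    for line in (
        "IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e",
        "[<c0781206>] ? rt_spin_unlock+0x8/0xa",
        "[<c04d842e>] ? d_materialise_unique+0x210/0x2aa",
    ):
        e = decode_trace_entry(line)
        print("%s: start=0x%08x end=0x%08x"
              % (e["symbol"], e["start"], e["end"]))
```

Since 0xc04d842e is a return address, gdb reporting line 2073 is consistent
with the fault happening inside the spin_unlock() call on line 2072 rather
than in the m2 test itself. That reading is an inference from the listing,
not something confirmed in this thread.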

> Thanks again for all the testing here! It's really appreciated!

No problem, not the first time (but it had been a while...)
-- Fernando


From: john stultz on
On Wed, 2010-07-14 at 14:32 -0700, Fernando Lopez-Lezcano wrote:
> On Mon, 2010-07-12 at 20:06 -0700, Fernando Lopez-Lezcano wrote:
> > > > --------
> > > > BUG: unable to handle kernel NULL pointer dereference at 0000008c
> > > > IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > > ...
> > > > Pid: 2855, comm: nautilus Not tainted
> > > > 2.6.33.6-147.rt23.3.fc12.ccrma.i686.rt #3 P5K/EPU/P5K/EPU
> > > > EIP: 0060:[<c045e50a>] EFLAGS: 00210246 CPU: 0
> > > > EIP is at rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > > > EAX: 00000078 EBX: ef45393c ECX: 00000000 EDX: 00000078
> > > > ESI: ef716edc EDI: 00000000 EBP: f1977c8c ESP: f1977c88
> > > > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 preempt:00000000
> > > > Process nautilus (pid: 2855, ti=f1976000 task=f2347130 task.ti=f1976000)
> > > > Stack:
> > > > ef45393c f1977c94 c0781206 f1977cc8 c04d842e 00000000 ef703e54 faadd5bc
> > > > <0> f1977cdc 126bc87a ef703ddc 00000000 ef45393c ef716edc ef6f5494
> > > > faafdc6c
> > > > <0> f1977df8 faad9041 c3604b5c faafdc6c ef452bfc 00007e7f f5eb41e8
> > > > 00000007
> > > > Call Trace:
> > > > [<c0781206>] ? rt_spin_unlock+0x8/0xa
> > > > [<c04d842e>] ? d_materialise_unique+0x210/0x2aa
> > >
> > > Can you gdb list *0xc04d842e ?
> >
> > (gdb) list *0xc04d842e
> > 0xc04d842e is in d_materialise_unique (fs/dcache.c:2073).
> > 2068 out_unalias:
> > 2069 d_move_locked(alias, dentry);
> > 2070 ret = alias;
> > 2071 out_err:
> > 2072 spin_unlock(&inode->i_lock);
> > 2073 if (m2)
> > 2074 mutex_unlock(m2);
> > 2075 if (m1)
> > 2076 mutex_unlock(m1);
> > 2077 return ret;
> >
> > > Thanks again for all the testing here! It's really appreciated!
>
> I just tried building 2.6.33.6 rt26 (saw it this morning on the site)
> and it does not exhibit this problem. Woohoo!

Yea. The vfs-scalability patches were dropped. Sort of a bummer, but I'm
glad that fixes it for you.

We may run into the issue again when the patches can get beaten back
into shape, but I'll try to hunt down the issue from your log output
prior to that.

Thanks again for the testing and great feedback!
-john


From: Fernando Lopez-Lezcano on
On Mon, 2010-07-12 at 20:06 -0700, Fernando Lopez-Lezcano wrote:
> > > --------
> > > BUG: unable to handle kernel NULL pointer dereference at 0000008c
> > > IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > ...
> > > Pid: 2855, comm: nautilus Not tainted
> > > 2.6.33.6-147.rt23.3.fc12.ccrma.i686.rt #3 P5K/EPU/P5K/EPU
> > > EIP: 0060:[<c045e50a>] EFLAGS: 00210246 CPU: 0
> > > EIP is at rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > > EAX: 00000078 EBX: ef45393c ECX: 00000000 EDX: 00000078
> > > ESI: ef716edc EDI: 00000000 EBP: f1977c8c ESP: f1977c88
> > > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 preempt:00000000
> > > Process nautilus (pid: 2855, ti=f1976000 task=f2347130 task.ti=f1976000)
> > > Stack:
> > > ef45393c f1977c94 c0781206 f1977cc8 c04d842e 00000000 ef703e54 faadd5bc
> > > <0> f1977cdc 126bc87a ef703ddc 00000000 ef45393c ef716edc ef6f5494
> > > faafdc6c
> > > <0> f1977df8 faad9041 c3604b5c faafdc6c ef452bfc 00007e7f f5eb41e8
> > > 00000007
> > > Call Trace:
> > > [<c0781206>] ? rt_spin_unlock+0x8/0xa
> > > [<c04d842e>] ? d_materialise_unique+0x210/0x2aa
> >
> > Can you gdb list *0xc04d842e ?
>
> (gdb) list *0xc04d842e
> 0xc04d842e is in d_materialise_unique (fs/dcache.c:2073).
> 2068 out_unalias:
> 2069 d_move_locked(alias, dentry);
> 2070 ret = alias;
> 2071 out_err:
> 2072 spin_unlock(&inode->i_lock);
> 2073 if (m2)
> 2074 mutex_unlock(m2);
> 2075 if (m1)
> 2076 mutex_unlock(m1);
> 2077 return ret;
>
> > Thanks again for all the testing here! It's really appreciated!

I just tried building 2.6.33.6 rt26 (saw it this morning on the site)
and it does not exhibit this problem. Woohoo!

Thanks again!
-- Fernando


From: Fernando Lopez-Lezcano on
On Wed, 2010-07-14 at 14:36 -0700, john stultz wrote:
> On Wed, 2010-07-14 at 14:32 -0700, Fernando Lopez-Lezcano wrote:
> > On Mon, 2010-07-12 at 20:06 -0700, Fernando Lopez-Lezcano wrote:
> > > > > --------
> > > > > BUG: unable to handle kernel NULL pointer dereference at 0000008c
> > > > > IP: [<c045e50a>] rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > > > ...
> > > > > Pid: 2855, comm: nautilus Not tainted
> > > > > 2.6.33.6-147.rt23.3.fc12.ccrma.i686.rt #3 P5K/EPU/P5K/EPU
> > > > > EIP: 0060:[<c045e50a>] EFLAGS: 00210246 CPU: 0
> > > > > EIP is at rt_spin_lock_fastunlock.clone.2+0x6/0x3e
> > > > > EAX: 00000078 EBX: ef45393c ECX: 00000000 EDX: 00000078
> > > > > ESI: ef716edc EDI: 00000000 EBP: f1977c8c ESP: f1977c88
> > > > > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 preempt:00000000
> > > > > Process nautilus (pid: 2855, ti=f1976000 task=f2347130 task.ti=f1976000)
> > > > > Stack:
> > > > > ef45393c f1977c94 c0781206 f1977cc8 c04d842e 00000000 ef703e54 faadd5bc
> > > > > <0> f1977cdc 126bc87a ef703ddc 00000000 ef45393c ef716edc ef6f5494
> > > > > faafdc6c
> > > > > <0> f1977df8 faad9041 c3604b5c faafdc6c ef452bfc 00007e7f f5eb41e8
> > > > > 00000007
> > > > > Call Trace:
> > > > > [<c0781206>] ? rt_spin_unlock+0x8/0xa
> > > > > [<c04d842e>] ? d_materialise_unique+0x210/0x2aa
> > > >
> > > > Can you gdb list *0xc04d842e ?
> > >
> > > (gdb) list *0xc04d842e
> > > 0xc04d842e is in d_materialise_unique (fs/dcache.c:2073).
> > > 2068 out_unalias:
> > > 2069 d_move_locked(alias, dentry);
> > > 2070 ret = alias;
> > > 2071 out_err:
> > > 2072 spin_unlock(&inode->i_lock);
> > > 2073 if (m2)
> > > 2074 mutex_unlock(m2);
> > > 2075 if (m1)
> > > 2076 mutex_unlock(m1);
> > > 2077 return ret;
> > >
> > > > Thanks again for all the testing here! It's really appreciated!
> >
> > I just tried building 2.6.33.6 rt26 (saw it this morning on the site)
> > and it does not exhibit this problem. Woohoo!
>
> Yea. The vfs-scalability patches were dropped. Sort of a bummer, but I'm
> glad that fixes it for you.
>
> We may run into the issue again when the patches can get beaten back
> into shape, but I'll try to hunt down the issue from your log output
> prior to that.

Ok, let me know if you need me to test again. I have the serial cable
now! :-)

> Thanks again for the testing and great feedback!

Sure.
-- Fernando

