Prev: net: VMware virtual Ethernet NIC driver: vmxnet3
Next: net: VMware virtual Ethernet NIC driver: vmxnet3
From: Eric Dumazet on 30 Oct 2009 11:10 Eric Dumazet a �crit : > Francis Moreau a �crit : >> Hello Eric, >> >> It seems I still have a related bug, please have a look to the following oops. >> >> This happened on a 2.6.32-rc5 where your patch is included. >> >> [107304.558821] nfsd: last server has exited, flushing export cache >> [107304.558848] ------------[ cut here ]------------ >> [107304.558858] WARNING: at net/ipv4/af_inet.c:153 >> inet_sock_destruct+0x161/0x17c() >> [107304.558862] Hardware name: P5K-VM >> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd >> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode] >> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25 >> [107304.558892] Call Trace: >> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c >> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9 >> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16 >> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c >> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7 >> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21 >> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd >> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10 >> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c >> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71 >> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b >> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0 >> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f >> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc] >> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc] >> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc] >> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f >> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc] >> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc] >> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc] >> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc] >> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd] >> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd] >> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a >> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20 >> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30 >> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a >> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20 >> [107304.559191] ---[ end trace c107131f4762168c ]--- >> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state >> recovery directory >> [107304.932765] NFSD: starting 90-second grace period >> > > This oops occurring again and again with SUNRPC finally gave me the right pointer. > > David, we added two years ago memory accounting to UDP, and this changed > requirements about calling skb_free_datagram() in the right context. > > I wish we had an ASSERT_SOCK_LOCKED() debugging facility :( > > Francis, would you please test following patch ? Here is the updated patch (against linux-2.6) , in case other people are interested to test it. [PATCH take2] net: fix sk_forward_alloc corruption On UDP sockets, we must call skb_free_datagram() with socket locked, or risk sk_forward_alloc corruption. This requirement is not respected in SUNRPC. Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC Reported-by: Francis Moreau <francis.moro(a)gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com> --- include/linux/skbuff.h | 2 ++ net/core/datagram.c | 10 +++++++++- net/ipv4/udp.c | 4 +--- net/ipv6/udp.c | 4 +--- net/sunrpc/svcsock.c | 10 +++++----- 5 files changed, 18 insertions(+), 12 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index df7b23a..266878f 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1757,6 +1757,8 @@ extern int skb_copy_datagram_const_iovec(const struct sk_buff *from, int to_offset, int size); extern void skb_free_datagram(struct sock *sk, struct sk_buff *skb); +extern void skb_free_datagram_locked(struct sock *sk, + struct sk_buff *skb); extern int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags); extern __wsum skb_checksum(const struct sk_buff *skb, int offset, diff --git a/net/core/datagram.c b/net/core/datagram.c index 1c6cf3a..4ade301 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -224,6 +224,15 @@ void skb_free_datagram(struct sock *sk, struct sk_buff *skb) consume_skb(skb); sk_mem_reclaim_partial(sk); } +EXPORT_SYMBOL(skb_free_datagram); + +void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb) +{ + lock_sock(sk); + skb_free_datagram(sk, skb); + release_sock(sk); +} +EXPORT_SYMBOL(skb_free_datagram_locked); /** * skb_kill_datagram - Free a datagram skbuff forcibly @@ -752,5 +761,4 @@ unsigned int datagram_poll(struct file *file, struct socket *sock, EXPORT_SYMBOL(datagram_poll); EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec); EXPORT_SYMBOL(skb_copy_datagram_iovec); -EXPORT_SYMBOL(skb_free_datagram); EXPORT_SYMBOL(skb_recv_datagram); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index d0d436d..0fa9f70 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -999,9 +999,7 @@ try_again: err = ulen; out_free: - lock_sock(sk); - skb_free_datagram(sk, skb); - release_sock(sk); + skb_free_datagram_locked(sk, skb); out: return err; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 3a60f12..cf538ed 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -288,9 +288,7 @@ try_again: err = ulen; out_free: - lock_sock(sk); - skb_free_datagram(sk, skb); - release_sock(sk); + skb_free_datagram_locked(sk, skb); out: return err; diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index ccc5e83..1c246a4 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -111,7 +111,7 @@ static void svc_release_skb(struct svc_rqst *rqstp) rqstp->rq_xprt_ctxt = NULL; dprintk("svc: service %p, releasing skb %p\n", rqstp, skb); - skb_free_datagram(svsk->sk_sk, skb); + skb_free_datagram_locked(svsk->sk_sk, skb); } } @@ -578,7 +578,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp) "svc: received unknown control message %d/%d; " "dropping RPC reply datagram\n", cmh->cmsg_level, cmh->cmsg_type); - skb_free_datagram(svsk->sk_sk, skb); + skb_free_datagram_locked(svsk->sk_sk, skb); return 0; } @@ -588,18 +588,18 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp) if (csum_partial_copy_to_xdr(&rqstp->rq_arg, skb)) { local_bh_enable(); /* checksum error */ - skb_free_datagram(svsk->sk_sk, skb); + skb_free_datagram_locked(svsk->sk_sk, skb); return 0; } local_bh_enable(); - skb_free_datagram(svsk->sk_sk, skb); + skb_free_datagram_locked(svsk->sk_sk, skb); } else { /* we can use it in-place */ rqstp->rq_arg.head[0].iov_base = skb->data + sizeof(struct udphdr); rqstp->rq_arg.head[0].iov_len = len; if (skb_checksum_complete(skb)) { - skb_free_datagram(svsk->sk_sk, skb); + skb_free_datagram_locked(svsk->sk_sk, skb); return 0; } rqstp->rq_xprt_ctxt = skb; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: David Miller on 30 Oct 2009 15:30 From: Eric Dumazet <eric.dumazet(a)gmail.com> Date: Fri, 30 Oct 2009 16:03:53 +0100 > [PATCH take2] net: fix sk_forward_alloc corruption > > On UDP sockets, we must call skb_free_datagram() with socket locked, > or risk sk_forward_alloc corruption. This requirement is not respected > in SUNRPC. > > Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC > > Reported-by: Francis Moreau <francis.moro(a)gmail.com> > Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com> I've tentatively applied this to my net-2.6 tree, I won't push it out until we have positive testing results. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on 31 Oct 2009 05:30 David Miller a �crit : > From: Eric Dumazet <eric.dumazet(a)gmail.com> > Date: Fri, 30 Oct 2009 16:03:53 +0100 > >> [PATCH take2] net: fix sk_forward_alloc corruption >> >> On UDP sockets, we must call skb_free_datagram() with socket locked, >> or risk sk_forward_alloc corruption. This requirement is not respected >> in SUNRPC. >> >> Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC >> >> Reported-by: Francis Moreau <francis.moro(a)gmail.com> >> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com> > > I've tentatively applied this to my net-2.6 tree, I won't > push it out until we have positive testing results. I tested nfs/nfsd with fillowing additional debugging patch. Without the fix (net: fix sk_forward_alloc corruption), I got warnings, while with fix applied, I got no warnings. Its probably hard to hit original bug since its a very small race window. Trace without fix : [ 226.686788] RPC: Registered udp transport module. [ 226.686791] RPC: Registered tcp transport module. [ 226.686792] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 226.851677] Installing knfsd (copyright (C) 1996 okir(a)monad.swb.de). [ 226.869381] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory [ 226.869443] NFSD: unable to find recovery directory /var/lib/nfs/v4recovery [ 226.869488] NFSD: starting 90-second grace period [ 226.870564] svc: failed to register lockdv1 RPC service (errno 97). [ 244.401237] ------------[ cut here ]------------ [ 244.401288] WARNING: at include/net/sock.h:886 sock_rfree+0x54/0x70() [ 244.401334] Hardware name: ProLiant BL460c G1 [ 244.401379] Modules linked in: nfsd lockd auth_rpcgss sunrpc exportfs bonding ipv6 [ 244.401699] Pid: 5295, comm: nfsd Not tainted 2.6.32-rc3-00920-gbdd6be3-dirty #342 [ 244.401745] Call Trace: [ 244.401784] [<c055355a>] ? printk+0x1d/0x23 [ 244.401827] [<c023c3f2>] warn_slowpath_common+0x72/0xa0 [ 244.401869] [<c04b0d54>] ? sock_rfree+0x54/0x70 [ 244.401909] [<c04b0d54>] ? sock_rfree+0x54/0x70 [ 244.401950] [<c023c43a>] warn_slowpath_null+0x1a/0x20 [ 244.401991] [<c04b0d54>] sock_rfree+0x54/0x70 [ 244.402033] [<c04b4cf5>] skb_release_head_state+0x45/0xe0 [ 244.402084] [<c04b49f0>] __kfree_skb+0x10/0x90 [ 244.402125] [<c04b4a8c>] consume_skb+0x1c/0x40 [ 244.402166] [<c04b7c62>] skb_free_datagram+0x12/0x70 [ 244.402217] [<fc9fa177>] svc_release_skb+0x37/0x60 [sunrpc] [ 244.402261] [<c026f452>] ? module_put+0x62/0xf0 [ 244.402308] [<fca05c98>] svc_send+0x28/0xc0 [sunrpc] [ 244.402356] [<fc9f8acb>] svc_process+0x22b/0x770 [sunrpc] [ 244.402416] [<fd04895c>] nfsd+0xac/0x140 [nfsd] [ 244.402484] [<fd0488b0>] ? nfsd+0x0/0x140 [nfsd] [ 244.402543] [<c0257ab4>] kthread+0x74/0x80 [ 244.402602] [<c0257a40>] ? kthread+0x0/0x80 [ 244.402656] [<c020390f>] kernel_thread_helper+0x7/0x10 [ 244.402711] ---[ end trace e202cdb1b491aa15 ]--- Debugging patch : Intent is to make sure sock is locked by user or by softirq handler, unless we are currently dismantling socket (sk_refcnt==0) To speedup this test, we could use several bits of sk_lock.owned instead of one for the user case. (one for lock owned by user, one for lowk owned by softirq, one in dismantle phase) diff --git a/include/net/sock.h b/include/net/sock.h index 55de3bd..08ecb0e 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -849,10 +849,15 @@ static inline int sk_rmem_schedule(struct sock *sk, int size) __sk_mem_schedule(sk, size, SK_MEM_RECV); } +#define sock_owned_by_user(sk) ((sk)->sk_lock.owned) +#define sock_owned(sk) ((sk)->sk_lock.owned || \ + spin_is_locked(&(sk)->sk_lock.slock)) + static inline void sk_mem_reclaim(struct sock *sk) { if (!sk_has_account(sk)) return; + WARN_ON(!(sock_owned(sk) || atomic_read(&sk->sk_refcnt) == 0)); if (sk->sk_forward_alloc >= SK_MEM_QUANTUM) __sk_mem_reclaim(sk); } @@ -861,6 +866,7 @@ static inline void sk_mem_reclaim_partial(struct sock *sk) { if (!sk_has_account(sk)) return; + WARN_ON(!(sock_owned(sk) || atomic_read(&sk->sk_refcnt) == 0)); if (sk->sk_forward_alloc > SK_MEM_QUANTUM) __sk_mem_reclaim(sk); } @@ -869,6 +875,7 @@ static inline void sk_mem_charge(struct sock *sk, int size) { if (!sk_has_account(sk)) return; + WARN_ON(!sock_owned(sk)); sk->sk_forward_alloc -= size; } @@ -876,6 +883,7 @@ static inline void sk_mem_uncharge(struct sock *sk, int size) { if (!sk_has_account(sk)) return; + WARN_ON(!sock_owned(sk)); sk->sk_forward_alloc += size; } @@ -900,7 +908,6 @@ static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb) * Since ~2.3.5 it is also exclusive sleep lock serializing * accesses from user process context. */ -#define sock_owned_by_user(sk) ((sk)->sk_lock.owned) /* * Macro so as to not evaluate some arguments when -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Francis Moreau on 6 Nov 2009 03:50 On Fri, Oct 30, 2009 at 4:03 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote: [...] > > [PATCH take2] net: fix sk_forward_alloc corruption > With this patch applied 4 days ago, I haven't got any oops so far. -- Francis -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on 6 Nov 2009 03:50 Francis Moreau a �crit : > On Fri, Oct 30, 2009 at 4:03 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote: > > [...] > >> [PATCH take2] net: fix sk_forward_alloc corruption >> > > With this patch applied 4 days ago, I haven't got any oops so far. Thanks a lot for testing and report Francis, I believe David will push it for stable (if not already done) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: net: VMware virtual Ethernet NIC driver: vmxnet3 Next: net: VMware virtual Ethernet NIC driver: vmxnet3 |