From: Felipe W Damasio on 10 Jul 2010 23:20

Hi Mr. Miller,

2010/7/10 David Miller <davem(a)davemloft.net>:
> It could be corruption from elsewhere. Those last four hex
> digits (0x5d415d41) are "]A]A" in ascii, but that could just
> be coincidence.

What do you mean by "from elsewhere"? Do you mean elsewhere in the network code?

Since the function that had the problem was tcp_recvmsg and we're talking about a squid process, we're dealing with either a typical web server object response or an incorrect/faulty HTTP request from the user.

As I told Mr. Dumazet, the squid logs show:

2010/07/08 14:51:10| clientTryParseRequest: FD 6088 (187.16.240.122:2035) Invalid Request

This was logged only a second before the bug entry in syslog, so I suppose this invalid request caused the problem (more of a guess, really).

If you think there's a way I can help reproduce/trigger and fix this bug, please let me know, since the production machine is down until I can assure my bosses that this particular crash won't happen again.

Thanks,

Felipe Damasio
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on 11 Jul 2010 01:20

On 07/10/2010 09:17 AM, Eric Dumazet wrote:
>
> Strange thing with your crash report is CR2 value, with unexpected value
> of 000000000b388000 while RAX value is dce8dce85d415d41
>
> Faulting instruction is :
>
> 48 83 b8 b0 00 00 00 00 cmpq $0x0,0xb0(%rax)
>
> So I would have expected CR2 being RAX+0xb0, but its not.
>

Nothing strange about it. You only get page faults and valid cr2 for canonical addresses (17 high order bits all equal). In this case rax+0xb0 is not a canonical address, so you got a general protection fault instead, with cr2 unchanged.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
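
The canonical-address rule described in this reply can be checked against the register values from the crash report. The following is only a sketch: it assumes 48-bit virtual addresses (which is what "17 high order bits" implies), and the helper name is_canonical is made up for illustration, not a kernel function.

```python
MASK64 = (1 << 64) - 1


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all equal.

    Assumption: va_bits = 48, so 17 high-order bits are compared.
    """
    addr &= MASK64
    high = addr >> (va_bits - 1)              # the 17 high-order bits
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 0x1ffff when va_bits is 48
    return high == 0 or high == all_ones


RAX = 0xdce8dce85d415d41   # from the crash report
CR2 = 0x000000000b388000   # from the crash report

# Address referenced by: cmpq $0x0,0xb0(%rax)
target = (RAX + 0xb0) & MASK64

print(hex(target), "canonical:", is_canonical(target))
print(hex(CR2), "canonical:", is_canonical(CR2))
```

Under that assumption the address RAX+0xb0 fails the check while the stale CR2 value passes it, which matches the explanation that the access raised a general protection fault and left CR2 untouched.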
From: Felipe W Damasio on 11 Jul 2010 03:20

2010/7/11 Felipe W Damasio <felipewd(a)gmail.com>:
> The production machine has 8GB of RAM:

I'm sorry, this is not right. The production machine has 16GB of RAM.

I don't know if that matters regarding those proc parameters, though.

Cheers,

Felipe Damasio
From: Eric Dumazet on 11 Jul 2010 04:10

On Sunday, 11 July 2010 at 08:19 +0300, Avi Kivity wrote:
> On 07/10/2010 09:17 AM, Eric Dumazet wrote:
> >
> > Strange thing with your crash report is CR2 value, with unexpected value
> > of 000000000b388000 while RAX value is dce8dce85d415d41
> >
> > Faulting instruction is :
> >
> > 48 83 b8 b0 00 00 00 00 cmpq $0x0,0xb0(%rax)
> >
> > So I would have expected CR2 being RAX+0xb0, but its not.
> >
>
> Nothing strange about it. You only get page faults and valid cr2 for
> canonical addresses (17 high order bits all equal). In this case
> rax+0xb0 is not a canonical address, so you got a general protection
> fault instead, with cr2 unchanged.
>

OK, thanks Avi for this information, as I was not aware of this.

So something overwrote the sk->sk_prot pointer (or the skb->sk pointer) with some data.

TCP sockets are allocated from a dedicated kmem_cache (because of the SLAB_DESTROY_RCU attribute). Their sk->sk_prot should never change in normal operation, since the underlying memory cannot be reused by another object type in the kernel. It should be NULL or &tcp_prot.

Felipe, please describe your configuration in as much detail as possible. It might be a driver bug triggered by a special kind of network frame.

lsmod
lspci -v
ethtool -k eth0
ethtool -k eth1 (if applicable)
From: Eric Dumazet on 11 Jul 2010 04:40

On Saturday, 10 July 2010 at 12:30 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet(a)gmail.com>
> Date: Sat, 10 Jul 2010 08:17:29 +0200
>
> > Strange thing with your crash report is CR2 value, with unexpected value
> > of 000000000b388000 while RAX value is dce8dce85d415d41
> >
> > Faulting instruction is :
> >
> > 48 83 b8 b0 00 00 00 00 cmpq $0x0,0xb0(%rax)
> >
> > So I would have expected CR2 being RAX+0xb0, but its not.
>
> It could be corruption from elsewhere. Those last four hex
> digits (0x5d415d41) are "]A]A" in ascii, but that could just
> be coincidence.
>

x86 being little endian, the string in memory is "A]A]", followed by another "XYXY" pattern (non-ASCII chars: 0xE8, 0xDC, 0xE8, 0xDC, "èÜèÜ" in ISO8859).
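
The byte-order argument above can be reproduced with a short sketch that lays out the RAX value the way it would sit in memory on a little-endian machine. Decoding with Latin-1 (ISO 8859-1) is an assumption made here to match the "ISO8859" remark; the variable names are illustrative only.

```python
RAX = 0xdce8dce85d415d41  # from the crash report

# On little-endian x86 the least significant byte is stored first.
raw = RAX.to_bytes(8, "little")

# Assumption: interpret the bytes as ISO 8859-1 (Latin-1), where every
# byte value maps to exactly one character.
text = raw.decode("latin-1")

print(raw.hex(" "))
print(text)
```

The first four bytes are the ASCII pair "A]" repeated, and the last four are the repeated pair 0xE8 0xDC, so the corrupted pointer looks like two 2-byte patterns each written twice.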