From: Justin Piszcz on 30 Mar 2010 14:10

On Tue, 30 Mar 2010, Alan Stern wrote:

> On Tue, 30 Mar 2010, Justin Piszcz wrote:
>
> Also, I'd like to see the contents of your /proc/interrupts. It looks
> like the OHCI controller shares an IRQ line with some other device.

Hi, you are correct:

$ cat /proc/interrupts
           CPU0       CPU1
  0:        127         32   IO-APIC-edge      timer
  1:          0          2   IO-APIC-edge      i8042
  7:          1          0   IO-APIC-edge
  9:          0          0   IO-APIC-fasteoi   acpi
 20:          0          3   IO-APIC-fasteoi   ehci_hcd:usb1
 22:          0          0   IO-APIC-fasteoi   sata_nv
 23:        216     134543   IO-APIC-fasteoi   sata_nv, ohci_hcd:usb2
 27:          0         68   PCI-MSI-edge      hda_intel
 28:       4722    1583395   PCI-MSI-edge      eth0
NMI:          0          0   Non-maskable interrupts
LOC:    5414110    5415173   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:     766744     123073   Rescheduling interrupts
CAL:        113         25   Function call interrupts
TLB:       1014       1029   TLB shootdowns
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:         19         19   Machine check polls
ERR:          1
MIS:          0
$

>
> Well, I'm making progress. Below is a new debugging patch to try in
> place of the first one. This time the dmesg log alone will be
> sufficient, no need for a usbmon trace. And the output should be a lot
> smaller, since the new patch doesn't print something every time an
> interrupt occurs, but rather only when you unplug the mouse.
>
> In fact, you might try unplugging the mouse while it still works and
> then plugging it back in. The difference between the debugging
> messages while everything is working and the same thing after the mouse
> fails should be informative.

Ok, I can try this as well.

>
> (By the way, these tests are meant to find out why your Xorg and khubd
> processes hang when the mouse fails, not for finding the original cause
> behind the mouse failure. That can be addressed later.)

This appears to occur only AFTER the mouse locks up, I do ctrl-alt-f1 and
then X freezes up after that.

> Some of those reports indicate that a BIOS update could fix the
> problem. Have you checked your BIOS version?

The BIOS is outdated. I will create a Windows boot CD and flash the BIOS
to the latest version. The hardware in question is an Optiplex 740, and it
is running an older firmware version. The latest firmware is from late 2009
(2.2.4): O740-224.EXE, but you cannot flash it from Linux, so I will test
this tomorrow: flash the latest BIOS, apply your latest patch, and see if
the problem recurs.

I did check the change lists for the Dell BIOS updates; none mention a USB
problem like the one in the kernel bug post (earlier Dell system).

Justin.

>
> Alan Stern
>
>
>
> Index: usb-2.6/drivers/usb/host/ohci-hcd.c
> ===================================================================
> --- usb-2.6.orig/drivers/usb/host/ohci-hcd.c
> +++ usb-2.6/drivers/usb/host/ohci-hcd.c
> @@ -292,6 +292,8 @@ static int ohci_urb_dequeue(struct usb_h
>  		if (urb_priv) {
>  			if (urb_priv->ed->state == ED_OPER)
>  				start_ed_unlink (ohci, urb_priv->ed);
> +			ohci_info(ohci, "start unlink urb %p, ed %p tick %u\n",
> +				urb, urb_priv->ed, urb_priv->ed->tick);
>  		}
>  	} else {
>  		/*
> @@ -324,6 +326,9 @@ ohci_endpoint_disable (struct usb_hcd *h
>
>  	if (!ed)
>  		return;
> +	ohci_info(ohci, "disable ed %p (#%02x) state %d%s\n",
> +		ed, ep->desc.bEndpointAddress, ed->state,
> +		list_empty(&ed->td_list) ? "" : " (has tds)");
>
>  rescan:
>  	spin_lock_irqsave (&ohci->lock, flags);
> Index: usb-2.6/drivers/usb/host/ohci-q.c
> ===================================================================
> --- usb-2.6.orig/drivers/usb/host/ohci-q.c
> +++ usb-2.6/drivers/usb/host/ohci-q.c
> @@ -912,6 +912,9 @@ rescan_all:
>  		 * frame counter wraps and EDs with partially retired TDs
>  		 */
>  		if (likely (HC_IS_RUNNING(ohci_to_hcd(ohci)->state))) {
> +			ohci_info(ohci, "finish_unlinks: tick %u, ed %p %u, %d\n",
> +				tick, ed, ed->tick,
> +				tick_before(tick, ed->tick));
>  			if (tick_before (tick, ed->tick)) {
>  skip_ed:
>  				last = &ed->ed_next;
> @@ -928,6 +931,8 @@ skip_ed:
>  					TD_MASK;
>
>  			/* INTR_WDH may need to clean up first */
> +			ohci_info(ohci, "dma %llx head %x\n",
> +				(unsigned long long) td->td_dma, head);
>  			if (td->td_dma != head) {
>  				if (ed == ohci->ed_to_check)
>  					ohci->ed_to_check = NULL;
> @@ -990,6 +995,8 @@ rescan_this:
>  			/* HC may have partly processed this TD */
>  			td_done (ohci, urb, td);
>  			urb_priv->td_cnt++;
> +			ohci_info(ohci, "td_cnt %d length %d\n",
> +				urb_priv->td_cnt, urb_priv->length);
>
>  			/* if URB is done, clean up */
>  			if (urb_priv->td_cnt == urb_priv->length) {
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Justin Piszcz on 31 Mar 2010 07:10

On Wed, 31 Mar 2010, Tiago Vignatti wrote:

> Justin Piszcz wrote:
>>
>>
>> On Thu, 25 Mar 2010, Justin Piszcz wrote:
>>
>>>
>>>
>>> On Thu, 25 Mar 2010, Justin Piszcz wrote:
>>>
>>> The same problem has been reported by another person, he says his entire
>>> system freezes, which, it appears to do unless you can SSH into the box:
>>> http://www.openoffice.org/issues/show_bug.cgi?id=76797
>>>
>>> Look at his lspci listing.
>>> james(a)dv6105us:~$ lspci
>>> 00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
>>> 00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
>>> 00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
>>>
>>> Here is mine:
>>> $ lspci
>>> 00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
>>> 00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
>>> 00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
>>> 00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)
>>>
>>> Looks like the bug may be in the USB subsystem for this chipset.
>>>
>>> Justin.
>>>
>>>
>>
>> Hi,
>>
>> And there it goes again *LOCK*
>> root 2190 0.5 1.5 37832 31424 tty7 Ds+ 09:00 0:12 /usr/bin/X
>> :0 vt7 -nolisten tcp -auth /var/lib/xdm/authdir/authfiles/A:0-N5V00o
>
> running X server with -nosilk helps something?

Hi,

After the BIOS update and request from Alan, if it *STILL* persists, I can
try this, thanks.

# grep -i silken Xorg.0.log*
Xorg.0.log:(==) NV(0): Silken mouse enabled
Xorg.0.log.old:(==) NV(0): Silken mouse enabled

Justin.

From: Justin Piszcz on 31 Mar 2010 14:30

On Wed, 31 Mar 2010, Justin Piszcz wrote:

Hi,

With the latest BIOS, I used the system today for 1-2 hours and could not
get it to repeat the crash (I had full debugging enabled and Alan's patch
applied). I will continue to test, because when you want it to crash it
usually does not, but the BIOS update may have fixed it.

Latest BIOS for Dell Optiplex 740: 2.2.4 (upgraded today); it was upgraded
from 2.0.12.

Justin.