From: Joerg Roedel on 10 Aug 2010 14:10

On Tue, Aug 10, 2010 at 06:57:45PM +0200, Sander Eikelenboom wrote:
> The requested info is attached.
> So that would mean a bios problem ? (those are not on my wishlist :-p)

Yeah, looks like a BIOS problem. But the driver should handle that
without crashing the system, so there is a bug in the driver too.

The problem is:

AMD-Vi: DEV_ALIAS_RANGE devid: 0a:01.0 flags: 00 devid_to: 0a:00.0
AMD-Vi: DEV_RANGE_END devid: 0a:1f.7

This means that PCI devices from 0a:01.0 to 0a:1f.7 may use their own
device-id or 0a:00.0. But a device with id 0a:00.0 is not present in the
system. From the lspci output it looks like your USB3 controller should
alias to 09:00.0.

I will prepare a patch for you to fix the crash, but I can't guarantee
that your USB3 controller will work afterwards. If you see IO-Page-Faults
please report them to me.

Joerg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
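The alias range described in the message above can be illustrated with a small sketch. It is not the driver's code: it only assumes the conventional 16-bit PCI device ID layout (bus in bits 15..8, device in bits 7..3, function in bits 2..0), and the helper names `parse_bdf`, `format_bdf` and `possible_request_ids` are hypothetical, introduced here for illustration.

```python
def parse_bdf(text):
    """Convert a 'bb:dd.f' string (hex fields) into a 16-bit PCI device ID."""
    bus_dev, fn = text.split(".")
    bus, dev = bus_dev.split(":")
    bus, dev, fn = int(bus, 16), int(dev, 16), int(fn, 16)
    if not (0 <= bus <= 0xFF and 0 <= dev <= 0x1F and 0 <= fn <= 0x7):
        raise ValueError("not a valid bus:device.function: %r" % text)
    return (bus << 8) | (dev << 3) | fn


def format_bdf(devid):
    """Convert a 16-bit PCI device ID back into 'bb:dd.f' notation."""
    return "%02x:%02x.%x" % (devid >> 8, (devid >> 3) & 0x1F, devid & 0x7)


# Values taken from the two AMD-Vi log lines quoted in the message.
ALIAS_START = parse_bdf("0a:01.0")  # DEV_ALIAS_RANGE devid
ALIAS_END = parse_bdf("0a:1f.7")    # DEV_RANGE_END devid
ALIAS_TO = parse_bdf("0a:00.0")     # devid_to


def possible_request_ids(devid):
    """Device IDs a device may present: its own, plus the alias if in range."""
    if ALIAS_START <= devid <= ALIAS_END:
        return {devid, ALIAS_TO}
    return {devid}


ids = possible_request_ids(parse_bdf("0a:01.2"))
print(sorted(format_bdf(i) for i in ids))
```

For 0a:01.2 this yields both 0a:00.0 and 0a:01.2, which is why the missing 0a:00.0 device matters: the driver has to cope with an alias target that has no corresponding PCI device.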
From: Joerg Roedel on 10 Aug 2010 16:30

On Tue, Aug 10, 2010 at 08:05:14PM +0200, Sander Eikelenboom wrote:
> Could you also provide a perhaps more specific message what is wrong
> with the bios, that i could forward to MSI, in the hope it will reach
> the bios engineers someday ? :-)

Let's first prove that my theory is right before contacting MSI directly.
Can you try the attached patch? It should fix the boot crash. Once the
system has booted successfully, please try some USB device (make sure it
uses the separate USB controller; I guess the separate device is
responsible for USB 3, so try to plug a device into one of your USB 3
ports). When you have finished that, please tell me whether it worked or
not and send the full dmesg output of the system.

Joerg
From: Joerg Roedel on 10 Aug 2010 16:50

Hi Sander,

On Tue, Aug 10, 2010 at 10:36:35PM +0200, Sander Eikelenboom wrote:
> Errr which seperate usb controller ? .. it has actually:
> - 1 pci-e usb 2.0 controller
> - 2 pci-e usb 3.0 controller (one of which includes a sata controller as well)

The devices should be attached to this controller:

0a:01.0 USB Controller [0c03]: NEC Corporation USB [1033:0035] (rev 43) (prog-if 10 [OHCI])
0a:01.1 USB Controller [0c03]: NEC Corporation USB [1033:0035] (rev 43) (prog-if 10 [OHCI])
0a:01.2 USB Controller [0c03]: NEC Corporation USB 2.0 [1033:00e0] (rev 04) (prog-if 20 [EHCI])

The PCI devices associated with that controller alias to 0a:00.0, which
does not exist in your system (hence the crash). And the fact that these
devices have an alias makes me believe that the BIOS detects them as
legacy PCI devices. PCI-e devices typically do not have aliases.

Can you send lspci -t output so we can see which upstream bridge these
devices are connected to?

Joerg
From: Joerg Roedel on 10 Aug 2010 17:30

On Tue, Aug 10, 2010 at 10:57:26PM +0200, Sander Eikelenboom wrote:
> Hmmm the fun part seems to be .. that the usb devices on that usb2
> controller seemed to work fine on Xen.

Hmm, that's weird. In this case these devices probably do not alias at
all. But let's wait for the results when you test my patch.

> +-0a.0-[0000:09-0a]----00.0-[0000:0a]--+-01.0
> |                                      +-01.1
> |                                      \-01.2

Yeah, device 09:00.0 is a PCIe-to-PCI bridge and the additional USB
controllers are behind that bridge as legacy PCI devices. That's why the
BIOS sets up the alias entry. It should set up 09:00.0 instead of 0a:00.0
to make things work correctly.

Joerg
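The topology reading in the message above (the USB functions on bus 0a sit behind bridge 09:00.0) can be sketched as a lookup over a bridge table. The table below is transcribed by hand from the quoted lspci -t fragment; that the first bridge lives on root bus 00 is an assumption, since the fragment does not show the root. The names `BRIDGES`, `upstream_bridge` and `path_to_root` are hypothetical.

```python
# Bridge address -> (secondary bus, subordinate bus), from the quoted tree.
# Assumption: the "0a.0" bridge is on root bus 00 (the root is not quoted).
BRIDGES = {
    "00:0a.0": (0x09, 0x0A),  # shown as -[0000:09-0a]-
    "09:00.0": (0x0A, 0x0A),  # shown as -[0000:0a]-
}


def bus_of(bdf):
    """Return the bus number of a 'bb:dd.f' address."""
    return int(bdf.split(":")[0], 16)


def upstream_bridge(bdf, bridges=BRIDGES):
    """Return the bridge whose secondary bus is the device's bus, or None."""
    bus = bus_of(bdf)
    for bridge, (secondary, _subordinate) in bridges.items():
        if secondary == bus:
            return bridge
    return None


def path_to_root(bdf, bridges=BRIDGES):
    """List the bridges between a device and the root bus, nearest first."""
    path = []
    current = upstream_bridge(bdf, bridges)
    while current is not None:
        path.append(current)
        current = upstream_bridge(current, bridges)
    return path


print(path_to_root("0a:01.2"))
```

The nearest bridge for each of 0a:01.0, 0a:01.1 and 0a:01.2 comes out as 09:00.0, which is the device Joerg expected the alias entry to name at this point in the thread.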
From: Joerg Roedel on 10 Aug 2010 18:10

Ok,

On Tue, Aug 10, 2010 at 11:36:59PM +0200, Sander Eikelenboom wrote:
> It boots now, dmesg attached.

AMD-Vi: Event logged [IO_PAGE_FAULT device=0a:00.0 domain=0x0000 address=0x0000000000001080 flags=0x0070]

So it indeed uses 0a:00.0 as the device id. That's weird, but it shows
that the BIOS is actually OK. I need to fix that in the driver.

Thanks,

Joerg
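When checking a dmesg for further events like the one quoted above, the fields can be pulled out with a regular expression. This is a minimal sketch that assumes the line format is exactly as quoted in the message; the name `parse_io_page_fault` is hypothetical, and the flags value is left as an opaque number because the thread does not explain its bits.

```python
import re

# Pattern modeled on the single event line quoted in the message.
EVENT_RE = re.compile(
    r"IO_PAGE_FAULT device=(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)"
)


def parse_io_page_fault(line):
    """Return the fields of an IO_PAGE_FAULT event, or None if no match."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "domain": int(match.group("domain"), 16),
        "address": int(match.group("address"), 16),
        "flags": int(match.group("flags"), 16),
    }


LINE = ("AMD-Vi: Event logged [IO_PAGE_FAULT device=0a:00.0 domain=0x0000 "
        "address=0x0000000000001080 flags=0x0070]")
event = parse_io_page_fault(LINE)
print(event["device"], hex(event["address"]))
```

The reported device is 0a:00.0, the alias target, rather than one of the 0a:01.x functions, which matches Joerg's conclusion that the hardware really does issue requests with the aliased ID.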