From: Zbigniew Luszpinski on 10 Jul 2010 19:40

Hello,

Long story short: the io_apic2.patch provides two kernel parameters:

nofasteoiapic - replaces the fasteoi handler with the level handler for all fasteoi interrupts.
nofasteoiapic=<list of irq numbers> - replaces the fasteoi handler with the level handler for the given irqs only. This parameter does not work yet: I made a mistake in its code that I cannot find.

Why it is needed:
With the nofasteoiapic parameter active, this patch improves ohci stability by 80% for medium-speed usb devices on the Nvidia nForce MCP78S chipset (usb ohci controllers 10de:077b and 10de:077d). Without the patch, any usb 1.1 device works for a few minutes and then hangs at a random time with a timeout (the usb device stops responding). Even with the patch, fast devices such as usb audio keep hanging. Only Linux has the hanging ohci; Windows XP does not. So this is a software incompatibility.

What can be done that I cannot do myself:
- find a better solution that makes usb ohci 100% stable on all usb devices without changing fasteoi to level.
- add autodetection so that the patch applies only to the interrupt handlers of 10de:077b and 10de:077d. In the interrupt setup code Linux does not know which device has which interrupt, so it is hard or impossible to apply the patch only to the devices that need it.
- find the bug in the nofasteoiapic=<list of irq numbers> code.
- do not use interrupts for ohci at all; poll the i/o registers instead.

These tasks are for someone brave and skilled here. I do not feel capable enough to handle them; I barely managed the attached patch. If you have any suggestions or pieces of code I could test (experimental fixes which may help, or debug/diagnostic aids), please send them to me. I would especially like to test code that uses polling instead of interrupts for ohci only.

I reported this bug to Nvidia; they reproduced it and confirmed its existence. The level interrupt handler improves ohci stability. Unfortunately they also do not know so far how to fix this.
This mailing list is the last hope. If nothing can be done, we should blacklist these MCP78S ohci controllers as broken, so that people do not report all usb devices as broken when it is actually the ohci controller that breaks everything.

----
Full history:

All Nvidia MCP78S family chipsets (nForce7xx, 8200, 9x00) probably have a silicon bug which causes the integrated usb ohci controllers 10de:077b and 10de:077d to hang on Linux only. Windows XP SP3 is not affected, even on a clean install from a bare Windows CD without external drivers. I'm very curious how it comes about that only Linux crashes. The oldest kernel tested is 2.6.18 from RHEL5, the newest is 2.6.34.1.

The moment of the ohci hang depends on usb load: the bigger and more constant the transfer, the sooner the hang happens. Let's divide usb 1.1 devices by load:

idle - usb devices are connected but do nothing: rock solid, no crash.
slow - usb keyboard/mouse: never hangs in normal use. A usb mouse can hang ohci if you wave it around like crazy.
medium speed - usb adsl modem on a 1Mbit ISP subscription: without the patch ohci hangs after a few minutes of use; checking several rss channels for news is enough to hang it. With the patch it does not hang, although opening 63 tabs in firefox at once still hangs ohci with the patch enabled. Without the patch, connecting a usb pendrive/hdd hangs ohci on plug-in or soon after; with the patch enabled there is no hang.
fast - usb fm radio using alsa usb audio for transmission (16bit 96kHz stereo stream): always hangs in less than 2 minutes, whether the patch is enabled or not. The same goes for the 4Mbit IrDA usb dongle.

Only the noapic kernel boot parameter makes it stable 90% of the time.

I checked the acpi tables and they are clean, so there is no Linux trap. The bug exists not only on my mainboard but on boards from different manufacturers. All mainboards with this bug have only one thing in common: the Nvidia MCP78S chipset. So this must be a silicon bug in the chipset. After playing with kernel boot parameters I found that the noapic or acpi=noirq parameters work around the bug in 95% of cases.
acpi=noirq just disables the APIC interrupt controller, so it does the same as noapic.

To fix this bug on Linux we have to make Linux Windows XP compatible. I made the first step with the included patch. Linux uses the fasteoi interrupt handler by default; Windows XP uses the level handler. So Linux, when forced by the patch to use the level interrupt handler, has ohci stable 80% of the time. In noapic mode it is 90% stable.

The noapic solution is bad: it limits the CPU to 1 core only, and ohci is still not 100% stable :(
The nofasteoiapic parameter provided by the patch is better: 80% stability and all cpu cores active, but usb audio hangs and the stability of other devices is weak.

My previous mainboard, based on the Nvidia MCP51 chipset, worked excellent. After replacing it with a mainboard based on the Nvidia MCP78S chipset, the usb ohci bug appeared.

List of hardware used:
previous mainboard: Asus A8N-VM CSM (MCP51 chipset, works excellent)
current mainboard: Asrock K10N78FullHD-hSLI rev. 3.0 with current bios (broken ohci usb only on Linux, everything else excellent)

usb devices used:
pendrive: Kingston 8 GB
usb hdd: Seagate 80GB SATA1 in ICY BOX usb case
usb irda dongle: Stir4200 module/chipset/Linux driver
usb adsl modems: Speedtouch 330 and ZXDSL852 unicorn2 chipset/Linux driver
usb radio: Silabs fm radio usb: radio_usb_si470x linux driver
usb printer: hp deskjet 5940
usb keyboard: genius
usb mouse: Logitech pilot mouse and logitech trackman trackball and pixart mouse

my bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=13405
(now I do not think this is an acpi problem)

list of attached files:
io_apic2.patch - copy it to /usr/src/linux-2.6.34.1/arch/x86/kernel/apic and run patch -p0 < io_apic2.patch; after compiling the kernel, boot the new kernel with the nofasteoiapic parameter added.
ohcifail.tar.gz - dumps of dmesg, interrupts, and important /proc and /sys files.

have a nice day,
Zbigniew Luszpinski
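
For anyone looking into the nofasteoiapic=<list of irq numbers> form mentioned above, here is a minimal user-space sketch of how such a list could be parsed and queried. It is not the code from io_apic2.patch, which is not reproduced in this message. The comma separator, the MAX_IRQS limit, and the names parse_irq_list, irq_wants_level and level_irq_map are all assumptions made for illustration only.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch only; not a kernel constant. */
#define MAX_IRQS 256

/* One bit per irq: set means "use the level handler for this irq". */
static unsigned char level_irq_map[MAX_IRQS / CHAR_BIT];

/*
 * Parse a list such as "20,21,23" (comma-separated decimal numbers,
 * an assumed syntax). Returns 0 on success, -1 on malformed input.
 * The global map is only updated when the whole list is valid.
 */
static int parse_irq_list(const char *arg)
{
    unsigned char map[sizeof(level_irq_map)] = {0};
    const char *p = arg;

    if (p == NULL || *p == '\0')
        return -1;

    for (;;) {
        char *end;
        unsigned long irq;

        /* Reject empty tokens, signs and whitespace before strtoul sees them. */
        if (*p < '0' || *p > '9')
            return -1;

        errno = 0;
        irq = strtoul(p, &end, 10);
        if (errno != 0 || irq >= MAX_IRQS)
            return -1;

        map[irq / CHAR_BIT] |= (unsigned char)(1u << (irq % CHAR_BIT));

        if (*end == '\0')
            break;              /* end of list */
        if (*end != ',')
            return -1;          /* unexpected separator */
        p = end + 1;            /* a trailing comma fails on the next pass */
    }

    memcpy(level_irq_map, map, sizeof(map));
    return 0;
}

/* True if the given irq was named in the last successfully parsed list. */
static bool irq_wants_level(unsigned int irq)
{
    if (irq >= MAX_IRQS)
        return false;
    return (level_irq_map[irq / CHAR_BIT] >> (irq % CHAR_BIT)) & 1u;
}
```

The list is parsed into a temporary bitmap and copied over only on success, so a malformed list leaves the previous setting untouched. A kernel version would use the kernel's own helpers instead of strtoul, and would have to run before the interrupt handlers are set up for the lookup to have any effect.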