From: Yinghai Lu on 3 Aug 2010 04:10

On 08/03/2010 12:19 AM, Yinghai Lu wrote:
> On 08/02/2010 08:13 PM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>> On 08/02/2010 06:32 PM, Yinghai Lu wrote:
>>>> On 08/02/2010 04:17 PM, Dave Airlie wrote:
>>>>>>
>>>>>> the kernel is using mptable, and the system have mcp55, so how come
>>>>>> with irq 35?
>>>>>> assume we should only have ioapic irq 0 - 23 ...
>>>>>>
>>>>>> Can you send out boot log with "debug apic=debug pci=routeirq" with
>>>>>> 2.6.32 and 2.6.35?
>>>>>
>>>>> Okay el6log is from a RHEL6 2.6.32 kernel, but it should give a good
>>>>> baseline, the 2.6.35 oops even earlier with all those options and is
>>>>> in the second attachment.
>>>>
>>>
>>
>> This patch is wrong and there is no reason to even suspect it will
>> affect this problem. At best this patch will trade one set of bugs
>> for another because at least on some platforms we always did something
>> like this. Having an irq 35 is odd and certainly a result of recent
>> changes, but in this case it doesn't look like it has anything to do
>> with the problem.
>>
>> Nacked-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
>>
>>> please use this one instead..., forget to run quilt refresh before sending it.
>>>
>>> [PATCH -v2] x86: fix pin_2_irq mapping
>>>
>>> We should not twist gsi to irq mapping if acpi is not used.
>>>
>>> -v2 remove not used irq_to_gsi()
>>>
>>> Signed-off-by: Yinghai Lu <yinghai(a)kernel.org>
>>>
>>> ---
>>>  arch/x86/include/asm/io_apic.h |   10 ++++++++++
>>>  arch/x86/kernel/acpi/boot.c    |    4 ++--
>>>  arch/x86/kernel/apic/io_apic.c |    5 +----
>>>  3 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> Index: linux-2.6/arch/x86/include/asm/io_apic.h
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/include/asm/io_apic.h
>>> +++ linux-2.6/arch/x86/include/asm/io_apic.h
>>> @@ -185,6 +185,16 @@ int mp_find_ioapic_pin(int ioapic, u32 g
>>>  void __init mp_register_ioapic(int id, u32 address, u32 gsi_base);
>>>  extern void __init pre_init_apic_IRQ0(void);
>>>
>>> +#ifdef CONFIG_ACPI
>>> +unsigned int gsi_to_irq(unsigned int gsi);
>>> +u32 irq_to_gsi(int irq);
>>> +#else
>>> +static inline unsigned int gsi_to_irq(unsigned int gsi)
>>> +{
>>> +	return gsi;
>>> +}
>>> +#endif
>>> +
>>>  #else /* !CONFIG_X86_IO_APIC */
>>>
>>>  #define io_apic_assign_pci_irqs 0
>>> Index: linux-2.6/arch/x86/kernel/acpi/boot.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
>>> +++ linux-2.6/arch/x86/kernel/acpi/boot.c
>>> @@ -100,7 +100,7 @@ static u32 isa_irq_to_gsi[NR_IRQS_LEGACY
>>>  	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
>>>  };
>>>
>>> -static unsigned int gsi_to_irq(unsigned int gsi)
>>> +unsigned int gsi_to_irq(unsigned int gsi)
>>>  {
>>>  	unsigned int irq = gsi + NR_IRQS_LEGACY;
>>>  	unsigned int i;
>>> @@ -123,7 +123,7 @@ static unsigned int gsi_to_irq(unsigned
>>>  	return irq;
>>>  }
>>>
>>> -static u32 irq_to_gsi(int irq)
>>> +u32 irq_to_gsi(int irq)
>>>  {
>>>  	unsigned int gsi;
>>>
>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>  	} else {
>>>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>
>>> -		if (gsi >= NR_IRQS_LEGACY)
>>> -			irq = gsi;
>>> -		else
>>> -			irq = gsi_top + gsi;
>>> +		irq = gsi_to_irq(gsi);
>>>  	}
>>>
>>>  #ifdef CONFIG_X86_32
>
> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>

just tried those blind shifting gsi cause kernel with acpi crash in virtual box.

[    5.536000] querying PCI -> IRQ mapping bus:0, slot:11, pin:0.
[    5.540000] ehci_hcd 0000:00:0b.0: can't find IRQ for PCI INT A; probably buggy MP table
[

and on kvm it got:

[    4.352280] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k6-NAPI
[    4.356012] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    4.360120] querying PCI -> IRQ mapping bus:0, slot:3, pin:0.
[    4.364006] PCI BIOS passed nonexistent PCI bus 0!
[    4.368007] e1000 0000:00:03.0: can't find IRQ for PCI INT A; probably buggy MP table
[    4.372049] e1000 0000:00:03.0: setting latency timer to 64

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Eric W. Biederman on 3 Aug 2010 04:30

Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 12:19 AM, Yinghai Lu wrote:
>> On 08/02/2010 08:13 PM, Eric W. Biederman wrote:
>>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>> ===================================================================
>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>  	} else {
>>>>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>
>>>> -		if (gsi >= NR_IRQS_LEGACY)
>>>> -			irq = gsi;
>>>> -		else
>>>> -			irq = gsi_top + gsi;
>>>> +		irq = gsi_to_irq(gsi);
>>>>  	}
>>>>
>>>>  #ifdef CONFIG_X86_32
>>
>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>
>
> just tried those blind shifting gsi cause kernel with acpi crash in virtual box.

Which configuration did you try and have problems with?

> [    5.536000] querying PCI -> IRQ mapping bus:0, slot:11, pin:0.
> [    5.540000] ehci_hcd 0000:00:0b.0: can't find IRQ for PCI INT A; probably buggy MP table
> [

I don't have a clue what the mptable looks like in VirtualBox. My
guess is that it is buggy and untested, like so many mptables these days.

> and on kvm it got:
> [    4.352280] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k6-NAPI
> [    4.356012] e1000: Copyright (c) 1999-2006 Intel Corporation.
> [    4.360120] querying PCI -> IRQ mapping bus:0, slot:3, pin:0.
> [    4.364006] PCI BIOS passed nonexistent PCI bus 0!
> [    4.368007] e1000 0000:00:03.0: can't find IRQ for PCI INT A; probably buggy MP table
> [    4.372049] e1000 0000:00:03.0: setting latency timer to 64

This example failed because mpparse said bus 0 was ISA, which is a
pretty bizarre thing to do, especially when bus 0 is pretty clearly PCI.
That does sound like a buggy MP table.

Eric

From: Eric W. Biederman on 3 Aug 2010 05:00

Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>>  	} else {
>>>>>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>
>>>>> -		if (gsi >= NR_IRQS_LEGACY)
>>>>> -			irq = gsi;
>>>>> -		else
>>>>> -			irq = gsi_top + gsi;
>>>>> +		irq = gsi_to_irq(gsi);
>>>>>  	}
>>>>>
>>>>>  #ifdef CONFIG_X86_32
>>>
>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>
>> Because it is only convention that when mptables are used that the
>> first apic pins 0-15 are the ISA irqs. This thread witnessed and a
>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>> and exotic case would be for the ISA irqs to be outside the first 16
>> ioapic pins but the es7000 did exactly that.
>
> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>
> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.

Exactly. Pins < 16 are not necessarily ISA irqs, and can possibly be
shared level-triggered PCI irqs. Unfortunately there are strange
boards like the es7000 where pins > 16 are ISA irqs.

The other thing we gain by having pin_2_irq always remap pins < 16 is
that we can get away with the numerous hard-coded assumptions in
arch/x86 and elsewhere that irq < 16 is an ISA irq.

Eric
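
The remapping defended in the message above can be sketched as a small standalone C fragment. This is only an illustration of the else branch of pin_2_irq() as quoted in the thread, not the kernel code itself: the name remap_gsi() is hypothetical, and gsi_top = 24 is an assumption (a single 24-pin IOAPIC with gsi_base 0), not a value stated by either poster.

```c
#include <stdio.h>

#define NR_IRQS_LEGACY 16	/* irq numbers 0-15 are kept for ISA */

/*
 * Assumption for illustration only: one 24-pin IOAPIC with gsi_base 0,
 * so gsi_top (one past the highest GSI) is 24.
 */
static const unsigned int gsi_top = 24;

/* Hypothetical helper mirroring the else branch of pin_2_irq(). */
static unsigned int remap_gsi(unsigned int gsi)
{
	if (gsi >= NR_IRQS_LEGACY)
		return gsi;		/* outside the ISA range: irq == gsi */
	return gsi_top + gsi;		/* low pin: move it above every GSI */
}

/* Print the resulting irq number for a few sample pins. */
static void show_mapping(void)
{
	static const unsigned int pins[] = { 5, 11, 16, 23 };
	size_t i;

	for (i = 0; i < sizeof(pins) / sizeof(pins[0]); i++)
		printf("pin %u -> irq %u\n", pins[i], remap_gsi(pins[i]));
}
```

Under these assumed values a non-ISA interrupt on pin 11 ends up as irq 35, which would be consistent with the irq 35 reported earlier in the thread, although the pin actually involved on that board is not stated here. No remapped number falls below 16, which is what lets code that treats irq < 16 as ISA keep working.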

From: Yinghai Lu on 3 Aug 2010 05:10

On 08/03/2010 01:56 AM, Eric W. Biederman wrote:
> Yinghai Lu <yinghai(a)kernel.org> writes:
>
>> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>>
>>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>> ===================================================================
>>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>>>  	} else {
>>>>>>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>>
>>>>>> -		if (gsi >= NR_IRQS_LEGACY)
>>>>>> -			irq = gsi;
>>>>>> -		else
>>>>>> -			irq = gsi_top + gsi;
>>>>>> +		irq = gsi_to_irq(gsi);
>>>>>>  	}
>>>>>>
>>>>>>  #ifdef CONFIG_X86_32
>>>>
>>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>>
>>> Because it is only convention that when mptables are used that the
>>> first apic pins 0-15 are the ISA irqs. This thread witnessed and a
>>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>>> and exotic case would be for the ISA irqs to be outside the first 16
>>> ioapic pins but the es7000 did exactly that.
>>
>> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>>
>> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.
>
> Exactly. Pins < 16 are not necessarily ISA irqs, and can be possibly
> shared level triggered PCI irqs. Unfortunately there are strange
> boards like the es7000 where pins > 16 are ISA irqs.
>
> The other thing that is gained by having pin_2_irq always remap pins <
> 16 is we can get away with the numerous hard codes in the arch/x86 and elsewhere
> that assume irq < 16 is an ISA irq.

How about this one?

---
 arch/x86/kernel/apic/io_apic.c |   31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
 	return MPBIOS_trigger(idx);
 }
 
+static int shared_with_legacy(int apic, int pin)
+{
+	int i;
+
+	for (i = 0; i < mp_irq_entries; i++) {
+		int bus = mp_irqs[i].srcbus;
+
+		if (!test_bit(bus, mp_bus_not_pci))
+			continue;
+
+		if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
+			continue;
+
+		if (mp_irqs[i].dstirq != pin)
+			continue;
+
+		return mp_irqs[i].srcbusirq;
+	}
+
+	return -1;
+}
+
 static int pin_2_irq(int idx, int apic, int pin)
 {
 	int irq;
@@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
 	} else {
 		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
 
-		if (gsi >= NR_IRQS_LEGACY)
+		if (gsi >= NR_IRQS_LEGACY) {
 			irq = gsi;
-		else
-			irq = gsi_top + gsi;
+		} else {
+			irq = shared_with_legacy(apic, pin);
+			if (irq < 0)
+				irq = gsi_top + gsi;
+		}
 	}
 
 #ifdef CONFIG_X86_32
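
To make the lookup proposed in the patch above easier to follow, here is a rough standalone sketch of the same idea run against a made-up MP table. Everything outside the patch is an assumption for illustration: struct intsrc is a cut-down stand-in for the kernel's interrupt source entry, the sample table and the bus_not_pci array are invented, gsi_top is assumed to be 24, and the functions take the APIC id directly instead of indexing mp_ioapics[].

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_IRQS_LEGACY 16

/* Cut-down stand-in for an MP table interrupt source entry. */
struct intsrc {
	int srcbus;	/* source bus id */
	int srcbusirq;	/* irq number on the source bus */
	int dstapic;	/* destination IOAPIC id */
	int dstirq;	/* destination IOAPIC pin */
};

/* Invented sample: bus 0 is ISA (not PCI), bus 1 is PCI. */
static const bool bus_not_pci[] = { true, false };

static const struct intsrc irqs[] = {
	{ 0,  1, 2,  1 },	/* ISA irq 1  -> apic 2, pin 1 */
	{ 0,  0, 2,  2 },	/* ISA irq 0  -> apic 2, pin 2 */
	{ 1, 12, 2, 11 },	/* PCI source -> apic 2, pin 11 */
};

/* Return the ISA irq routed to this pin, or -1 if there is none. */
static int shared_with_legacy(int apicid, int pin)
{
	size_t i;

	for (i = 0; i < sizeof(irqs) / sizeof(irqs[0]); i++) {
		if (!bus_not_pci[irqs[i].srcbus])
			continue;	/* skip entries from PCI buses */
		if (irqs[i].dstapic != apicid)
			continue;
		if (irqs[i].dstirq != pin)
			continue;
		return irqs[i].srcbusirq;
	}
	return -1;
}

/* Hypothetical wrapper following the else branch of the patched pin_2_irq(). */
static int pin_to_irq(int apicid, int pin, int gsi_base, int gsi_top)
{
	int gsi = gsi_base + pin;
	int irq;

	if (gsi >= NR_IRQS_LEGACY)
		return gsi;
	irq = shared_with_legacy(apicid, pin);
	if (irq < 0)
		irq = gsi_top + gsi;	/* no ISA user: remap as before */
	return irq;
}
```

With this sample table, a PCI interrupt on pin 2 would take over ISA irq 0 because an ISA entry also targets that pin, while pin 11 has no ISA user and is still remapped to gsi_top + 11.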

From: Eric W. Biederman on 3 Aug 2010 05:20

Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 01:56 AM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>>>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>>>
>>>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>>> ===================================================================
>>>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>>>>  	} else {
>>>>>>>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>>>
>>>>>>> -		if (gsi >= NR_IRQS_LEGACY)
>>>>>>> -			irq = gsi;
>>>>>>> -		else
>>>>>>> -			irq = gsi_top + gsi;
>>>>>>> +		irq = gsi_to_irq(gsi);
>>>>>>>  	}
>>>>>>>
>>>>>>>  #ifdef CONFIG_X86_32
>>>>>
>>>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>>>
>>>> Because it is only convention that when mptables are used that the
>>>> first apic pins 0-15 are the ISA irqs. This thread witnessed and a
>>>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>>>> and exotic case would be for the ISA irqs to be outside the first 16
>>>> ioapic pins but the es7000 did exactly that.
>>>
>>> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>>>
>>> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.
>>
>> Exactly. Pins < 16 are not necessarily ISA irqs, and can be possibly
>> shared level triggered PCI irqs. Unfortunately there are strange
>> boards like the es7000 where pins > 16 are ISA irqs.
>>
>> The other thing that is gained by having pin_2_irq always remap pins <
>> 16 is we can get away with the numerous hard codes in the arch/x86 and elsewhere
>> that assume irq < 16 is an ISA irq.
>
> how about this one ?

You can't share an edge-triggered ISA irq; it isn't really physically
possible. So I don't see how this extra complexity will change
anything.

Eric

> ---
>  arch/x86/kernel/apic/io_apic.c |   31 ++++++++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
> @@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
>  	return MPBIOS_trigger(idx);
>  }
>
> +static int shared_with_legacy(int apic, int pin)
> +{
> +	int i;
> +
> +	for (i = 0; i < mp_irq_entries; i++) {
> +		int bus = mp_irqs[i].srcbus;
> +
> +		if (!test_bit(bus, mp_bus_not_pci))
> +			continue;
> +
> +		if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
> +			continue;
> +
> +		if (mp_irqs[i].dstirq != pin)
> +			continue;
> +
> +		return mp_irqs[i].srcbusirq;
> +	}
> +
> +	return -1;
> +}
> +
>  static int pin_2_irq(int idx, int apic, int pin)
>  {
>  	int irq;
> @@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
>  	} else {
>  		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>
> -		if (gsi >= NR_IRQS_LEGACY)
> +		if (gsi >= NR_IRQS_LEGACY) {
>  			irq = gsi;
> -		else
> -			irq = gsi_top + gsi;
> +		} else {
> +			irq = shared_with_legacy(apic, pin);
> +			if (irq < 0)
> +				irq = gsi_top + gsi;
> +		}
>  	}
>
>  #ifdef CONFIG_X86_32