From: Jaswinder Singh Rajput on 11 May 2010 11:10

Hello,

With latest git kernel, I am getting following DRM error and not
getting XWindows :

[ 45.269075] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 45.269111] ------------[ cut here ]------------
[ 45.269139] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[ 45.269150] Hardware name: Aspire one
[ 45.269158] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[ 45.269198] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #6
[ 45.269208] Call Trace:
[ 45.269231] [<c1030ecb>] warn_slowpath_common+0x65/0x7c
[ 45.269249] [<c108ce5d>] ? debug_kmap_atomic+0xa9/0x11e
[ 45.269267] [<c1030eef>] warn_slowpath_null+0xd/0x10
[ 45.269284] [<c108ce5d>] debug_kmap_atomic+0xa9/0x11e
[ 45.269304] [<c10207c9>] kmap_atomic_prot+0x4d/0xb2
[ 45.269321] [<c102083c>] kmap_atomic+0xe/0x10
[ 45.269341] [<c11f7d64>] i915_error_object_create+0xea/0x14f
[ 45.269359] [<c11f8132>] i915_handle_error+0x369/0x868
[ 45.269380] [<c11f86d0>] i915_hangcheck_elapsed+0x9f/0xdf
[ 45.269399] [<c103ab6e>] run_timer_softirq+0x1c9/0x269
[ 45.269417] [<c11f8631>] ? i915_hangcheck_elapsed+0x0/0xdf
[ 45.269435] [<c1035b7b>] __do_softirq+0xc6/0x186
[ 45.269451] [<c1035c61>] do_softirq+0x26/0x2b
[ 45.269466] [<c1035dd2>] irq_exit+0x29/0x66
[ 45.269484] [<c101681f>] smp_apic_timer_interrupt+0x6e/0x7c
[ 45.269504] [<c141f826>] apic_timer_interrupt+0x2a/0x30
[ 45.269524] [<c104007b>] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[ 45.269542] [<c11bed9d>] ? acpi_idle_enter_simple+0x13b/0x168
[ 45.269563] [<c12dd2b9>] cpuidle_idle_call+0x6b/0xda
[ 45.269580] [<c1001a3c>] cpu_idle+0x44/0x74
[ 45.269598] [<c141a041>] start_secondary+0x1b2/0x1b7
[ 45.269612] ---[ end trace ce01d7ca0ae214f4 ]---
[ 45.269631] ------------[ cut here ]------------
[ 45.269647] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[ 45.269657] Hardware name: Aspire one
[ 45.269665] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[ 45.269700] Pid: 0, comm: swapper Tainted: G W 2.6.34-rc7-netbook #6
[ 45.269710] Call Trace:
[ 45.269726] [<c1030ecb>] warn_slowpath_common+0x65/0x7c
[ 45.269743] [<c108ce5d>] ? debug_kmap_atomic+0xa9/0x11e
[ 45.269760] [<c1030eef>] warn_slowpath_null+0xd/0x10
[ 45.269777] [<c108ce5d>] debug_kmap_atomic+0xa9/0x11e
[ 45.269795] [<c10207c9>] kmap_atomic_prot+0x4d/0xb2
[ 45.269812] [<c102083c>] kmap_atomic+0xe/0x10
[ 45.269829] [<c11f7d64>] i915_error_object_create+0xea/0x14f
[ 45.269848] [<c11f8132>] i915_handle_error+0x369/0x868
[ 45.269868] [<c11f86d0>] i915_hangcheck_elapsed+0x9f/0xdf
[ 45.269885] [<c103ab6e>] run_timer_softirq+0x1c9/0x269
[ 45.269903] [<c11f8631>] ? i915_hangcheck_elapsed+0x0/0xdf
[ 45.269920] [<c1035b7b>] __do_softirq+0xc6/0x186
[ 45.269937] [<c1035c61>] do_softirq+0x26/0x2b
[ 45.269952] [<c1035dd2>] irq_exit+0x29/0x66
[ 45.269968] [<c101681f>] smp_apic_timer_interrupt+0x6e/0x7c
[ 45.269985] [<c141f826>] apic_timer_interrupt+0x2a/0x30
[ 45.270004] [<c104007b>] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[ 45.270051] [<c11bed9d>] ? acpi_idle_enter_simple+0x13b/0x168
[ 45.270071] [<c12dd2b9>] cpuidle_idle_call+0x6b/0xda
[ 45.270087] [<c1001a3c>] cpu_idle+0x44/0x74
[ 45.270104] [<c141a041>] start_secondary+0x1b2/0x1b7
[ 45.270117] ---[ end trace ce01d7ca0ae214f5 ]---
[ 45.270135] ------------[ cut here ]------------

dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7.txt
.config : http://userweb.kernel.org/~jaswinder/acer_netbook/config_2634-rc7.txt

How can I fix these errors?

Thanks,
--
Jaswinder Singh.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Chris Wilson on 11 May 2010 12:20

On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote:
> Hello,
>
> With latest git kernel, I am getting following DRM error and not
> getting XWindows :

[snip]

Hmm, there are still patches for capturing error state that haven't gone
upstream, shame on me.

That error is a secondary issue to the GPU hang that is being reported. If
it is a regression caused by a kernel update it would be very useful if
you could bisect to the erroneous commit.

--
Chris Wilson, Intel Open Source Technology Centre

From: Jaswinder Singh Rajput on 11 May 2010 13:40

Hello Chris,

On Tue, May 11, 2010 at 9:40 PM, Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote:
>> Hello,
>>
>> With latest git kernel, I am getting following DRM error and not
>> getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.
>

Earlier I was using Moblin. I switched to Fedora and started getting this
error. I have also tested different kernel versions but I get the same
error, so I do not think this is a regression.

moblin dmesg : http://userweb.kernel.org/~jaswinder/moblin/dmesg-moblin_2633rc5.txt

Thanks,
--
Jaswinder Singh.

From: Andrew Morton on 11 May 2010 14:00

On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson <chris(a)chris-wilson.co.uk> wrote:

> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote:
> > Hello,
> >
> > With latest git kernel, I am getting following DRM error and not
> > getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.

It helps if one reads the code and the trace...

i915_error_object_create() is using KM_USER0 from softirq context.
That's a bug, and a pretty serious one. If some innocent civilian is
writing highmem data to disk and this timer interrupt fires and trashes
his KM_USER0 slot, the disk contents will be corrupted.

Something like this...

--- a/drivers/gpu/drm/i915/i915_irq.c~a
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
 
 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_

Please let's get a tested fix for this into 2.6.34.

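[Editor's illustration] The slot-reuse hazard Andrew describes can be
sketched with a small user-space model. This is only a toy under stated
assumptions, not kernel code: the slot names KM_USER0 and KM_IRQ0 come
from the patch, but slot_map, model_kmap(), model_kunmap(), the two
handlers and writer_mapping_survives() are hypothetical names invented
for this sketch. It shows a "writer" in process context mapping a page
into KM_USER0, being interrupted, and then checking whether its mapping
is still the page it asked for.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space model only; none of this is kernel code. The slot names are
 * borrowed from the patch; every model_* name is hypothetical.
 */
enum km_slot { KM_USER0, KM_IRQ0, KM_SLOT_COUNT };

/* One mapping window per slot on a single CPU. */
static const char *slot_map[KM_SLOT_COUNT];

static const char page_a[] = "file data headed for disk";
static const char page_b[] = "GPU error state";

/* Map a page into a slot, silently replacing whatever was mapped there. */
void model_kmap(const char *page, enum km_slot slot)
{
	assert(slot < KM_SLOT_COUNT);
	slot_map[slot] = page;
}

/* Tear down the mapping held in a slot. */
void model_kunmap(enum km_slot slot)
{
	assert(slot < KM_SLOT_COUNT);
	slot_map[slot] = NULL;
}

typedef void (*model_irq_handler)(void);

/* Pre-patch behaviour: the timer path borrows the process-context slot. */
void buggy_handler(void)
{
	model_kmap(page_b, KM_USER0);
	model_kunmap(KM_USER0);
}

/* Post-patch behaviour: the timer path uses its own slot. */
void fixed_handler(void)
{
	model_kmap(page_b, KM_IRQ0);
	model_kunmap(KM_IRQ0);
}

/*
 * Process context maps page_a into KM_USER0, is "interrupted" in the
 * middle of its copy, then checks its mapping.
 * Returns 1 if the slot still holds page_a, 0 if it was trashed.
 */
int writer_mapping_survives(model_irq_handler irq)
{
	int intact;

	slot_map[KM_USER0] = NULL;
	slot_map[KM_IRQ0] = NULL;

	model_kmap(page_a, KM_USER0);
	irq();	/* the timer fires here, mid-copy */
	intact = (slot_map[KM_USER0] == page_a);
	model_kunmap(KM_USER0);
	return intact;
}
```

In this model, writer_mapping_survives(buggy_handler) returns 0 and
writer_mapping_survives(fixed_handler) returns 1. The
local_irq_save()/local_irq_restore() pair from the patch is not
modelled; presumably it is there so that a hard interrupt cannot reuse
KM_IRQ0 while the copy is in progress.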
From: Jaswinder Singh Rajput on 11 May 2010 14:20

Hello Andrew,

On Tue, May 11, 2010 at 8:18 PM, Andrew Morton <akpm(a)linux-foundation.org> wrote:
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson <chris(a)chris-wilson.co.uk> wrote:
>
>> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput <jaswinderlinux(a)gmail.com> wrote:
>> > Hello,
>> >
>> > With latest git kernel, I am getting following DRM error and not
>> > getting XWindows :
>>
>> [snip]
>>
>> Hmm, there are still patches for capturing error state that haven't gone
>> upstream, shame on me.
>>
>> That error is a secondary issue to the GPU hang that is being reported. If
>> it is a regression caused by a kernel update it would be very useful if
>> you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one. If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>  	for (page = 0; page < page_count; page++) {
>  		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +		unsigned long flags;
> +
>  		if (d == NULL)
>  			goto unwind;
> -		s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +		local_irq_save(flags);
> +		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>  		memcpy(d, s, PAGE_SIZE);
> -		kunmap_atomic(s, KM_USER0);
> +		kunmap_atomic(s, KM_IRQ0);
> +		local_irq_restore(flags);
>  		dst->pages[page] = d;
>  	}
>  	dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.
>

I tested your patch with the latest Linus git tree and it works: it fixes
the softirq error.

Now I am only getting DRM errors :

[ 42.276059] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 42.276398] render error detected, EIR: 0x00000000
[ 42.276460] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 18 at 17)

Thanks,
--
Jaswinder Singh.