From: Andrey Rahmatullin on
Hello.
When I run lxc-stop for my Debian container, my system OOPSes. This is not
100% reproducible, though. I could catch only a part of the OOPS report
using netconsole:


[ 9243.740626] Oops: 0000 [#1] PREEMPT
[ 9243.740648] last sysfs file: /sys/devices/platform/w83627hf.656/fan1_input
[ 9243.740655] Modules linked in: veth netconsole configfs cdc_acm nf_conntrack_ftp iptable_mangle radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core vboxnetadp vboxnetflt vboxdrv w83627hf hwmon_vid aes_i586 aes_generic af_packet ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc sit tunnel4 bridge stp llc ipv6 via_rhine mii ipt_LOG xt_limit ipt_REJECT xt_tcpudp xt_state xt_mac iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack iptable_filter ip_tables x_tables nls_cp1251 nls_cp866 vfat fat ext3 jbd mbcache arc4 ecb rt61pci crc_itu_t rt2x00pci snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd rt2x00lib sr_mod cdrom processor fan thermal mac80211 thermal_sys usb_storage usb_libusual serial_core soundcore hwmon hid_a4tech button pata_via cfg80211 uhci_hcd eeprom_93cx6 fuse via_agp agpgart evdev usbhid hid ehci_hcd usbcore nls_base dm_mod [last unloaded: nf_conntrack_ftp]
[ 9243.741009]
[ 9243.741009] Pid: 22287, comm: init Not tainted 2.6.34-wrar-2 #1 KT600-8237/KT600-8237
[ 9243.741009] EIP: 0060:[<c011e8a2>] EFLAGS: 00210086 CPU: 0
[ 9243.741009] EIP is at set_next_entity+0xd/0x117
[ 9243.741009] EAX: eeb050c0 EBX: 00000000 ECX: 00000000 EDX: 00000000
[ 9243.741009] ESI: eeb050c0 EDI: c04734ac EBP: ed625df4 ESP: ed625de4
[ 9243.741009] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 9243.741009] Process init (pid: 22287, ti=ed625000 task=f36818c0 task.ti=ed625000)
[ 9243.741009] Stack:

This is 2.6.34-rc2, though IIRC I had these crashes on .33 too.
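
For anyone reading the partial report above: the "EIP is at" line uses the kernel's symbol+offset/size notation, so it can be decoded mechanically. The sketch below is only an illustration; the helper name parse_oops_location and the regular expression are assumptions of this note and not part of any kernel tool.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form the kernel prints in oops reports.
LOCATION_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_oops_location(line):
    """Return (symbol, offset, size) from an oops line, or None if absent.

    Hypothetical helper written for this thread, not a kernel utility.
    """
    match = LOCATION_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


line = "[ 9243.741009] EIP is at set_next_entity+0xd/0x117"
symbol, offset, size = parse_oops_location(line)
print(f"{symbol}: fault at byte {offset} of {size}")
# prints "set_next_entity: fault at byte 13 of 279"
```

So the faulting instruction is 13 bytes into set_next_entity, which is 279 bytes long in this build. That places the fault in the scheduler, but without the stack dump it says nothing about the caller.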

--
WBR, wRAR (ALT Linux Team)
From: Daniel Lezcano on
Andrey Rahmatullin wrote:
> Hello.
> When I run lxc-stop for my Debian container, my system OOPSes. This is not
> 100% reproducible, though. I could catch only a part of the OOPS report
> using netconsole:
>
>
> [ 9243.740626] Oops: 0000 [#1] PREEMPT
> [ 9243.740648] last sysfs file: /sys/devices/platform/w83627hf.656/fan1_input
> [ 9243.740655] Modules linked in: veth netconsole configfs cdc_acm nf_conntrack_ftp iptable_mangle radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core vboxnetadp vboxnetflt vboxdrv w83627hf hwmon_vid aes_i586 aes_generic af_packet ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc sit tunnel4 bridge stp llc ipv6 via_rhine mii ipt_LOG xt_limit ipt_REJECT xt_tcpudp xt_state xt_mac iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack iptable_filter ip_tables x_tables nls_cp1251 nls_cp866 vfat fat ext3 jbd mbcache arc4 ecb rt61pci crc_itu_t rt2x00pci snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd rt2x00lib sr_mod cdrom processor fan thermal mac80211 thermal_sys usb_storage usb_libusual serial_core soundcore hwmon hid_a4tech button pata_via cfg80211 uhci_hcd eeprom_93cx6 fuse via_agp agpgart evdev usbhid hid ehci_hcd usbcore nls_base dm_mod [last unloaded: nf_conntrack_ftp]
> [ 9243.741009]
> [ 9243.741009] Pid: 22287, comm: init Not tainted 2.6.34-wrar-2 #1 KT600-8237/KT600-8237
> [ 9243.741009] EIP: 0060:[<c011e8a2>] EFLAGS: 00210086 CPU: 0
> [ 9243.741009] EIP is at set_next_entity+0xd/0x117
> [ 9243.741009] EAX: eeb050c0 EBX: 00000000 ECX: 00000000 EDX: 00000000
> [ 9243.741009] ESI: eeb050c0 EDI: c04734ac EBP: ed625df4 ESP: ed625de4
> [ 9243.741009] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> [ 9243.741009] Process init (pid: 22287, ti=ed625000 task=f36818c0 task.ti=ed625000)
> [ 9243.741009] Stack:
>
> This is 2.6.34-rc2, though IIRC I had these crashes on .33 too.
>

At first glance I would say it is related to the FAIR_SCHEDULER.
After multiple Oopses on my host, I decided to disable this option
in the kernel. It is hard to reproduce, and with so few clues it is
difficult to report the problem to lkml@.

Does sysrq + t show anything? Or is the host definitely stuck?

Thanks
-- Daniel

From: Andrey Rahmatullin on
On Mon, Apr 12, 2010 at 11:34:16AM +0200, Daniel Lezcano wrote:
> At first glance I would say it is related to the FAIR_SCHEDULER.
CONFIG_FAIR_GROUP_SCHED=y

> After multiple Oopses on my host, I decided to disable this option
> in the kernel.
I will try it.
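
A quick way to confirm whether the running kernel has the option set is to look it up in the kernel config. This is a minimal sketch: it assumes the config is exposed either as /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or as /boot/config-<release>, and the function names are made up for illustration.

```python
import gzip
import os
import platform


def option_value(config_text, name):
    """Return the value of a CONFIG_ option, or None if it is not set."""
    prefix = name + "="
    for line in config_text.splitlines():
        # Unset options appear as "# CONFIG_FOO is not set" and do not match.
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def read_kernel_config():
    """Read the running kernel's config from the usual places, if present."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as handle:
            return handle.read()
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as handle:
            return handle.read()
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print("CONFIG_FAIR_GROUP_SCHED =", option_value(text, "CONFIG_FAIR_GROUP_SCHED"))
```

After rebuilding with the option disabled, the lookup should return None instead of "y".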

> Does sysrq + t show anything? Or is the host definitely stuck?
Well, I haven't tried that, because I haven't yet tried to reproduce it
without X11, but I will.
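
If the keyboard is unusable under X11 but the machine still accepts a remote login, the same task dump can be requested by writing the command letter to /proc/sysrq-trigger as root. The sketch below assumes the kernel was built with CONFIG_MAGIC_SYSRQ; the function name and the trigger_path parameter are illustrative only.

```python
def trigger_sysrq(command, trigger_path="/proc/sysrq-trigger"):
    """Write a single SysRq command letter to the trigger file.

    Writing "t" asks the kernel to dump the state of all tasks to the
    kernel log. trigger_path is a parameter only so the function can be
    tried against an ordinary file.
    """
    if len(command) != 1:
        raise ValueError("SysRq commands are a single character")
    with open(trigger_path, "w") as handle:
        handle.write(command)


# Example, to be run as root on the affected host:
#     trigger_sysrq("t")
# The task dump then appears in the kernel log (dmesg, or netconsole).
```

This only helps while the host can still run a process; if the box is completely stuck, netconsole or a serial console remains the only source of output.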

--
WBR, wRAR (ALT Linux Team)