From: Phillip Pi on 11 Jan 2010 15:33

Hello,

For the last few weeks, my old Linux/Debian box (2.6.30-2) has intermittently shown high CPU usage from Xorg, and it sometimes crashes. The box felt slow even over SSH2. I checked the processes and saw:

$ w
 11:53:37 up 6 days,  4:19,  3 users,  load average: 6.26, 6.04, 6.19
USER     TTY      FROM               LOGIN@   IDLE   JCPU   PCPU  WHAT
ant      tty1                        Wed03    5days  9.79s  0.00s /bin/bash /usr/bin/start
ant      pts/3    [deleted IP addy]  10:37    0.00s  0.10s  0.00s w
ant      pts/4    foobar:S.0         05Jan10  10:30  16.00s 16.00s BitchX Ant...

$ top
top - 11:55:08 up 6 days,  4:20,  3 users,  load average: 6.13, 5.91, 6.12
Tasks: 132 total,   3 running, 129 sleeping,   0 stopped,   0 zombie
Cpu0  :  6.9%us,  2.4%sy,  1.3%ni, 88.1%id,  0.6%wa,  0.1%hi,  0.6%si,  0.0%st
Mem:   2594748k total,  2168336k used,   426412k free,    64348k buffers
Swap:  2361512k total,     6452k used,  2355060k free,  1847020k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
15529 root      20   0  101m  73m 2992 R 99.7  2.9 225:08.76  Xorg
20840 ant       20   0  2468 1180  892 R  0.2  0.0   0:00.01  top
    1 root      20   0  2036  348  324 S  0.0  0.0   0:02.62  init
....

I tried to kill the startx and Xorg processes, and my box froze (still pingable, remote SSH2 connection frozen and not connectable, and IRC connections lost). I have tried recompiling the latest stable NVIDIA driver (from nvidia.com) for the GeForce FX 5200 (AGP), redoing my /etc/X11/xorg.conf with the help of NVIDIA's script, disabling Compiz, etc.

I checked the logs. In /var/log/X11, I saw a bunch of:
(EE) NVIDIA(0): Error recovery failed.
(EE) NVIDIA(0): *** Aborting ***
(II) NVIDIA(0): Initialized AGP GART.

This sounds bad? What does that mean? The end of dmesg showed these lines:
....
[72619.360521] NVRM: loading NVIDIA UNIX x86 Kernel Module 173.14.22 Sun Nov 8 20:26:31 PST 2009
....
[72833.815914] NVRM: loading NVIDIA UNIX x86 Kernel Module 173.14.22 Sun Nov 8 20:26:31 PST 2009
[72833.947202] agpgart-amd64 0000:00:00.0: AGP 3.5 bridge
[72833.947218] agpgart-amd64 0000:00:00.0: putting AGP V3 device into 8x mode
[72833.947284] nvidia 0000:01:00.0: putting AGP V3 device into 8x mode
....
[99432.775115] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[99469.794940] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[99469.836150] NVRM: Xid (0001:00): 7, Ch 00000002 M 00000a64 D 00000000 intr 00010000
[224756.205022] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[224756.251066] NVRM: Xid (0001:00): 7, Ch 00000002 M 0000069c D 471229dd intr 00010000
[225085.201829] NVRM: Xid (0001:00): 6, PE0002 0000 40000000 0010a7bc c0000000 3f800000
[225085.246217] NVRM: Xid (0001:00): 7, Ch 00000002 M 00001d7c D ffff0000 intr 00010000
....
[526347.572029] NVRM: Xid (0001:00): 8, Channel 00000000

I posted more complete logs and other output, including sensors -f, at http://pastie.org/774029 ... My old Debian machine's specifications can be found at http://alpha.zimage.com/~ant/antfarm/about/computers.txt (Secondary/Backup Computer section).

Any ideas? I do keep my Debian updated daily with the apt-get update and upgrade commands. I do not recall any recent X changes.

Thank you in advance. :)
-- 
Phillip Pi
Senior Software Quality Assurance Analyst
Partner Engineering/Internet Service Provider/Symantec Online Services, Consumer Business Unit
Symantec Corporation
www.symantec.com
-----------------------------------------------------
Email: phillip_pi(a)symantec.comSYMC (remove SYMC to reply by e-mail)
-----------------------------------------------------
Please do NOT e-mail me for technical support. DISCLAIMER: The views expressed in this posting are mine, and do not necessarily reflect the views of my employer. Thank you.
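To see how often each kind of NVRM Xid message shows up, and when, the dmesg excerpt above can be tallied with a short script. This is only a sketch: the function name `summarize_xids` and the regular expression are assumptions made for illustration, the sample text is copied from the post with its wrapped lines rejoined, and the script makes no claim about what each Xid code means.

```python
import re

# Matches kernel log lines such as:
#   [99432.775115] NVRM: Xid (0001:00): 6, PE0002 06bc ...
# Group 1 is the seconds-since-boot timestamp, group 2 the Xid code.
XID_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \([0-9a-fA-F:]+\): (\d+),")

# Excerpt copied from the dmesg output quoted in the post.
SAMPLE = """\
[99432.775115] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[99469.794940] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[99469.836150] NVRM: Xid (0001:00): 7, Ch 00000002 M 00000a64 D 00000000 intr 00010000
[224756.205022] NVRM: Xid (0001:00): 6, PE0002 06bc 3f800000 0008fd14 00000000 3f800000
[224756.251066] NVRM: Xid (0001:00): 7, Ch 00000002 M 0000069c D 471229dd intr 00010000
[225085.201829] NVRM: Xid (0001:00): 6, PE0002 0000 40000000 0010a7bc c0000000 3f800000
[225085.246217] NVRM: Xid (0001:00): 7, Ch 00000002 M 00001d7c D ffff0000 intr 00010000
[526347.572029] NVRM: Xid (0001:00): 8, Channel 00000000
"""


def summarize_xids(dmesg_text):
    """Return {xid_code: (count, first_timestamp, last_timestamp)}."""
    summary = {}
    for match in XID_LINE.finditer(dmesg_text):
        timestamp = float(match.group(1))
        code = int(match.group(2))
        count, first, last = summary.get(code, (0, timestamp, timestamp))
        summary[code] = (count + 1, min(first, timestamp), max(last, timestamp))
    return summary


if __name__ == "__main__":
    for code, (count, first, last) in sorted(summarize_xids(SAMPLE).items()):
        print(f"Xid {code}: {count} event(s), first at {first:.0f} s, "
              f"last at {last:.0f} s after boot")
```

On a live system the same function could be fed the full text of dmesg instead of the sample string. Grouping by code makes it easier to see whether the Xid 6/7 pairs and the later Xid 8 messages cluster around the times X locked up.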
From: Phillip Pi on 11 Jan 2010 16:44

http://pastebin.ca/1747442 (whole dmesg) if needed.
From: thunder8 on 12 Jan 2010 04:52

From: Phillip Pi <phillip_pi(a)symantec.comSYMC>
Date: Mon, 11 Jan 2010 13:44:25 -0800

> http://pastebin.ca/1747442 (whole dmesg) if needed.
>
> On 1/11/2010 12:33 PM PT, Phillip Pi typed:
>
>> The last few weeks, I noticed my old Linux/Debian box (2.6.30-2) keeps
>> getting random and rare high CPU due to Xorg and sometimes crashes. My
>> box, even via SSH2, felt slow. I checked the processes and saw:
>>
>> I checked logs. In /var/log/X11, I saw a bunch of:
>> (EE) NVIDIA(0): Error recovery failed.
>> (EE) NVIDIA(0): *** Aborting ***
>> (II) NVIDIA(0): Initialized AGP GART.
>>
>> This sounds bad? What does that mean? End of dmesg showed these lines:
>> ...

The problem is, nobody knows what's going on inside the binary nvidia module except Nvidia. That's why it's called 'closed source'. So the best option is to either use the open-source driver, or take your problems to Nvidia.

I realize this may sound harsh, but that is one of the problems of closed source, after all.

Kind regards,
Jurriaan
-- 
beautiful gifts, exclusive presents: handmade wooden bowls
http://www.houtenschalen.nl
From: Phillip Pi on 12 Jan 2010 22:41

On 1/12/2010 1:52 AM PT, thunder8 typed:

> The problem is, nobody knows what's going on inside the binary nvidia
> module except Nvidia. That's why it's called 'closed source'. So the
> best option is to either use the opensource driver, or take your
> problems to Nvidia.
>
> I realize this may sound harsh, but that is one of the problems of
> closed source, after all.

Even the NVIDIA folks don't seem to know so far, judging by the lack of replies: http://www.nvnews.net/vbulletin/showthread.php?p=2162526 ... :(
-- 
Phillip Pi
From: Ant on 18 Jan 2010 03:03

It happened again about 30 minutes ago while I was using the machine. I tried disabling AMD's Cool'n'Quiet in CMOS and powernow in Debian/Linux. Neither fixed it. I noticed a pattern that I had not mentioned before: if I am using the computer when the issue comes up, the screen blinks, then the CPU usage goes up and X stops responding. Some log bits:

dmesg:
....
[526179.772020] NVRM: Xid (0001:00): 8, Channel 00000000
[526187.923984] Clocksource tsc unstable (delta = 4686847433 ns)
[526195.932026] NVRM: Xid (0001:00): 8, Channel 00000000
[526207.944026] NVRM: Xid (0001:00): 8, Channel 00000000
[526219.956030] NVRM: Xid (0001:00): 8, Channel 00000000
[526231.972025] NVRM: Xid (0001:00): 8, Channel 00000000
[526243.984030] NVRM: Xid (0001:00): 8, Channel 00000000
[526255.996025] NVRM: Xid (0001:00): 8, Channel 00000000
[526268.008028] NVRM: Xid (0001:00): 8, Channel 00000000

$ sensors -f
k8temp-pci-00c3
Adapter: PCI adapter
Core0 Temp: +134.6°F

GKrellM showed the frozen state with:
Vcor1 = 1.50
+3.3V = 3.33
+12V = 11.3
-12V = 2.11
-5V = 5.10
V5SB = 5.54
VBat = 3.17

I was able to use Terminal very slowly via an existing SSH2 connection:

$ sensors -f
w83697hf-isa-0290
Adapter: ISA adapter
in0:    +1.50 V  (min = +0.19 V, max = +0.13 V)   ALARM
in2:    +3.33 V  (min = +0.43 V, max = +0.02 V)   ALARM
in3:    +3.01 V  (min = +0.02 V, max = +0.13 V)   ALARM
in4:    +2.96 V  (min = +0.06 V, max = +2.86 V)   ALARM
in5:    +3.30 V  (min = +0.06 V, max = +2.24 V)   ALARM
in6:    +4.08 V  (min = +2.56 V, max = +0.00 V)   ALARM
in7:    +3.30 V  (min = +0.08 V, max = +0.03 V)   ALARM
in8:    +3.17 V  (min = +0.00 V, max = +1.28 V)   ALARM
fan1:      0 RPM  (min =   73 RPM, div = 128)   ALARM
fan2:   2410 RPM  (min = 2109 RPM, div = 4)
temp1:  +91.4°F  (high = +172.4°F, hyst = +105.8°F)   sensor = thermistor
temp2: +128.3°F  (high = +176.0°F, hyst = +167.0°F)   sensor = thermistor
beep_enable:enabled

Do those voltage readings look correct? This is with a new Antec Basiq BP550 Plus (550W continuous power, ATX12V V2.2, modular, active PFC) power supply, too, unless it is defective. Or maybe the GeForce FX is bad now?

$ top
top - 22:44:31 up 6 days,  2:14,  3 users,  load average: 7.72, 4.93, 2.32
Tasks: 142 total,   3 running, 139 sleeping,   0 stopped,   0 zombie
Cpu0  :  4.8%us,  1.4%sy,  0.7%ni, 92.2%id,  0.8%wa,  0.1%hi,  0.1%si,  0.0%st
Mem:   2594748k total,  1474872k used,  1119876k free,   227992k buffers
Swap:  2361512k total,     5820k used,  2355692k free,   761668k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
  565 root      20   0  101m  66m 5420 R 99.9  2.6  59:05.45  Xorg
14954 ant       20   0  2468 1180  892 R  0.3  0.0   0:00.02  top
    1 root      20   0  2036  368  316 S  0.0  0.0   0:02.06  init
....

I looked at the end of my ~/.xsession-errors file:

Xsession: X session started for ant at Wed Jan 13 06:24:06 PST 2010
startkde: Starting up...
kbuildsycoca running...
/tmp/kde-ant/kcminitizrDCa.tmp:1:2: error: invalid preprocessing directive #http
Gtk-Message: Failed to load module "canberra-gtk-module": libcanberra-gtk-module.so: cannot open shared object file: No such file or directory
....
(seamonkey-bin:26879): Gdk-WARNING **: XID collision, trouble ahead
....
X Error: BadWindow (invalid Window parameter) 3
  Major opcode: 19
  Minor opcode: 0
  Resource id: 0x1a6a36e
X Error: BadWindow (invalid Window parameter) 3
  Major opcode: 19
  Minor opcode: 0
  Resource id: 0x3400008
X Error: BadWindow (invalid Window parameter) 3
  Major opcode: 19
  Minor opcode: 0
  Resource id: 0x3017201
X Error: BadWindow (invalid Window parameter) 3
  Major opcode: 19
  Minor opcode: 0
  Resource id: 0x3000024
kwin: X_SetInputFocus(0x282d216): BadMatch (invalid parameter attributes)
X Error: BadWindow (invalid Window parameter) 3
  Major opcode: 19
  Minor opcode: 0
  Resource id: 0x3200008

I saw a bunch of "(seamonkey-bin:#): Gdk-WARNING **: XID collision, trouble ahead" lines. A quick Google search showed Firefox users getting them too, so I assume they are unrelated to my crashes?
http://pastie.org/782807 for /var/log/X11/Xorg.0.log since the forum said my reply was too long. :P

I just tried another idea: I uninstalled the NVIDIA drivers with /usr/bin/nvidia-uninstall (I had never done that in the past), recompiled, reinstalled, and restarted X. I wonder if that will fix my issue.
-- 
"All the best work is done the way that ants do things -- by tiny but untiring and regular additions." --Lafcadio Hearn
  /\___/\
 / /\ /\ \     Phil./Ant @ http://antfarm.ma.cx (Personal Web Site)
| |o   o| |    Ant's Quality Foraged Links: http://aqfl.net
   \ _ /       Nuke ANT from e-mail address: philpi(a)earthlink.netANT
    ( )        or ANTant(a)zimage.com
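One rough way to approach the "do those readings look correct?" question is to compare the GKrellM values from the post against nominal rail tolerances. The sketch below assumes the usual ATX12V limits of +/-5% on the positive rails and +/-10% on -12V; that figure is brought in here as an assumption and is not stated anywhere in the thread. It also takes the sensor labels at face value. Motherboard monitoring chips often need board-specific scaling, and several of the w83697hf limits above have min greater than max, which suggests the sensor configuration was never tuned, so an odd value such as -12V = 2.11 may be a mislabelled or unscaled input rather than evidence of a bad power supply. The helper names are hypothetical.

```python
# Nominal voltage and fractional tolerance for each rail.
# ASSUMPTION: ATX12V-style limits (+/-5% positive rails, +/-10% for -12V).
RAIL_LIMITS = {
    "+3.3V": (3.3, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "V5SB": (5.0, 0.05),
}

# Readings copied from the GKrellM values quoted in the post.
GKRELLM_READINGS = {"+3.3V": 3.33, "+12V": 11.3, "-12V": 2.11, "V5SB": 5.54}


def rail_in_tolerance(name, measured):
    """Return True if the reading lies within nominal +/- tolerance."""
    nominal, tolerance = RAIL_LIMITS[name]
    margin = abs(nominal) * tolerance
    return nominal - margin <= measured <= nominal + margin


def fahrenheit_to_celsius(deg_f):
    """Convert a `sensors -f` temperature back to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    for rail, volts in GKRELLM_READINGS.items():
        verdict = "ok" if rail_in_tolerance(rail, volts) else "outside assumed tolerance"
        print(f"{rail}: {volts} V -> {verdict}")
    print(f"Core0: {fahrenheit_to_celsius(134.6):.1f} C")
```

Under these assumptions only the +3.3V reading falls inside its window, and the Core0 temperature of 134.6°F works out to 57°C. Readings taken with a multimeter at the power supply connectors, or from the BIOS hardware monitor, would be a more trustworthy basis for judging the supply than uncalibrated sensor output.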