From: Ant on 25 Apr 2010 02:11
> The crashes seem to happen during idle time. I do
> not use AMD's Cool'n'Quiet and PowerNow-K8.

FYI. For the first time, I got a kernel panic while I was using my computer, mostly surfing the Web in Mozilla's SeaMonkey v2.0.4. So, it is not tied to idle time after all.
From: Ant on 25 Apr 2010 08:37
On 4/24/2010 11:11 PM PT, Ant typed:

>> The crashes seem to happen during idle time. I do
>> not use AMD's Cool'n'Quiet and PowerNow-K8.
>
> FYI. For the first time, I got a kernel panic when I was using my computer.
> Mostly, surfing the Web in Mozilla's SeaMonkey v2.0.4. So, it is not
> tied to idle times then.

And another. Grr.
--
"Have I told you how much I like ants, huh? Especially fried in a subtle blend of mech fluid and grated gears?" --Rampage to Inferno, "Transmutate" in Transformers (Beast Wars)
   /\___/\     Phil./Ant @ http://antfarm.ma.cx (Personal Web Site)
  / /\ /\ \    Ant's Quality Foraged Links: http://aqfl.net
 | |o   o| |
    \ _ /      If crediting, then use Ant nickname and AQFL URL/link.
     ( )       If e-mailing, then axe ANT from its address if needed.
Ant is currently not listening to any songs on this computer.
From: Yousuf Khan on 26 Apr 2010 11:10
Ant wrote:
> On 4/24/2010 11:11 PM PT, Ant typed:
>
>>> The crashes seem to happen during idle time. I do
>>> not use AMD's Cool'n'Quiet and PowerNow-K8.
>>
>> FYI. For the first time, I got a kernel panic when I was using my computer.
>> Mostly, surfing the Web in Mozilla's SeaMonkey v2.0.4. So, it is not
>> tied to idle times then.
>
> And another. Grr.

It's probably getting worse. Might be time to think about replacement.

Yousuf Khan
From: Ant on 27 Apr 2010 09:09
On 4/26/2010 8:10 AM PT, Yousuf Khan typed:

> It's probably getting worse. Might be time to think about replacement.

Yeah, probably at the end of this year when I update my newer Windows box's hardware. If it fails completely before then, I will just go back to the leftover single-core Athlon 64 system I have here.

BTW, I still cannot reproduce these machine errors and kernel panics outside of my 2005 Debian installation: an Ubuntu liveCD ran for 15 hours of mixed usage and idling without any. I wonder if my old Debian installation is causing it instead of the hardware, but that doesn't explain why it was stable before the PSU, video card, RAM, and other failures a few months ago. I probably need to do a full clean reinstall and reconfigure from scratch, which I don't have time for these days. I will probably save that for when I swap/upgrade my hardware later on. I just want to reproduce this outside of my old Debian installation! Grr!! :(

Here's something more interesting. After I booted back into old Debian, with over 1.5 days of uptime, I haven't gotten any new kernel panics since 4/25/2010 around 4 AM PDT (I don't remember the exact minute) or machine errors since 4/21/2010 3:26 AM PDT so far... Again, the issues are not easily reproducible. They come and go! No specific patterns now (not related to temperatures, idle times, etc.). :(
--
"Yeah, what's left of it. I was in the militia -- national guard... That's good! Wasn't any war any more than there's war between men and ants." --stranger; "And we're eat-able ants. I found that out... What will they do with us?" --Pierson from H.G. Wells' The War of the Worlds
From: Ant on 30 Apr 2010 17:39
On Mar 16, 1:02 pm, ANT...(a)zimage.com wrote:
> >> Having a better look through your logs, I see this addr is
> >> very common (almost all errs are at this addr). Aren't
> >> you curious about the instruction that produced the errors?
> >> /boot/System.map should contain the addr of all kernel fns,
> >> and there should be some way to lookup modules.
>
> > I did a "cat /var/log/messages |grep ADDR" and found these addresses:
> > c104e3f0
> > c106e8c0
> > c11b6ff0 (most common)
>
> > But none of them matched to /boot/System.map-2.6.32-trunk-686. Here are
> > close addresses around them for each one:
>
> > c104e2f9 T tick_handle_periodic
> > c104e360 T tick_get_broadcast_device
>
> > c1063e1b t stop_cpu
> > c1063ec6 T stop_machine_destroy
>
> > c11b6fb8 T acpi_pm_read_verified
> > c11b6ffc t acpi_pm_read
>
> Since I did a Kernel upgrade (2.6.32-3 from -2 trunk) yesterday morning,
> I noticed a new address in my /var/log/messages (only one so far):
> Mar 16 05:41:16 foobar mcelog: HARDWARE ERROR. This is *NOT* a software problem!
> Mar 16 05:41:16 foobar mcelog: Please contact your hardware vendor
> Mar 16 05:41:16 foobar mcelog: MCE 0
> Mar 16 05:41:16 foobar mcelog: CPU 1 1 instruction cache
> Mar 16 05:41:16 foobar mcelog: ADDR c104e570
> Mar 16 05:41:16 foobar mcelog: TIME 1268743276 Tue Mar 16 05:41:16 2010
> Mar 16 05:41:16 foobar mcelog: TLB parity error in virtual array
> Mar 16 05:41:16 foobar mcelog: TLB error 'instruction transaction, level 1'
> Mar 16 05:41:16 foobar mcelog: STATUS 9400000000010011 MCGSTATUS 0
> Mar 16 05:41:16 foobar mcelog: MCGCAP 105 APICID 1 SOCKETID 0
> Mar 16 05:41:16 foobar mcelog: CPUID Vendor AMD Family 15 Model 43
>
> # ls -all /boot/System.map-2.6.32-3-686
> -rw-r--r-- 1 root root 1259340 2010-02-25 01:00 /boot/System.map-2.6.32-3-686
>
> I am going to assume contents changed in both Kernel and the system.map. I did a look up to match that c104e570 address.
> Closest addresses were:
> # cat /boot/System.map-2.6.32-3-686 |grep c104e
> c104e07d t tick_notify
> c104e374 t tick_periodic
> c104e3dd T tick_handle_periodic
> c104e444 T tick_get_broadcast_device
> c104e44a T tick_get_broadcast_mask
> c104e450 T tick_is_broadcast_device
> c104e464 T tick_set_periodic_handler
> c104e477 T tick_get_broadcast_oneshot_mask
> c104e47d T tick_broadcast_oneshot_active
> c104e48a T tick_shutdown_broadcast_oneshot
> c104e4ac T tick_check_oneshot_broadcast
> c104e4d5 T tick_resume_broadcast_oneshot
> c104e4e2 T tick_broadcast_setup_oneshot
> c104e5ae T tick_broadcast_switch_to_oneshot
> c104e5e0 t tick_do_broadcast
> c104e634 t tick_handle_oneshot_broadcast
> c104e71d t tick_do_periodic_broadcast
> c104e74a T tick_broadcast_oneshot_control
> c104e82c T tick_resume_broadcast
> c104e8a3 T tick_device_uses_broadcast
> c104e91b T tick_suspend_broadcast
> c104e943 T tick_shutdown_broadcast
> c104e989 t tick_handle_periodic_broadcast
> c104e9ce T tick_broadcast_on_off
> c104eb0e T tick_check_broadcast_device
> c104eb60 T tick_oneshot_mode_active
> c104eb96 T tick_switch_to_oneshot
> c104ec1e T tick_init_highres
> c104ec28 T tick_dev_program_event
> c104eca9 T tick_setup_oneshot
> c104ecd9 T tick_program_event
> c104ecfc T tick_resume_oneshot
> c104ed24 T tick_get_tick_sched
> c104ed33 T tick_nohz_get_sleep_length
> c104ed4c T tick_oneshot_notify
> c104ed63 t tick_init_jiffy_update
> c104edae T tick_check_oneshot_change
> c104eea1 t tick_do_update_jiffies64
> c104ef87 t tick_nohz_handler

About 1.5 months later, I compared the last two weeks' logs from two different 2.6.32 i686 kernel packages (-3 and -4).
-3:
Apr 20 04:13:52 mcelog: ADDR c104e500
Apr 14 01:36:16 mcelog: ADDR c104e530
Apr 16 06:03:52 mcelog: ADDR c104e540
Apr 20 02:51:22 mcelog: ADDR c104e570
/boot/System.map-2.6.32-3-686 showed:
c104e4e2 T tick_broadcast_setup_oneshot
c104e5ae T tick_broadcast_switch_to_oneshot

Apr 13 23:58:46 mcelog: ADDR c104f2c0
/boot/System.map-2.6.32-4-686 showed:
c104f2bb T tick_check_idle
c104f32f T tick_nohz_restart_sched_tick

Most /var/log/messages' addresses were at c104e570 for Kernel 2.6.32-3.

-4 has four days and 21 hours of uptime after upgrading the kernel and rebooting. So far, only two machine errors and no kernel panics:
Apr 27 09:00:20 mcelog: ADDR c1046d30
/boot/System.map-2.6.32-4-686 showed:
c1046cae T hrtimer_interrupt
c1046e08 t __hrtimer_peek_ahead_timers

Apr 30 07:17:50 mcelog: ADDR c106ee80
/boot/System.map-2.6.32-4-686 showed:
c106ee3c T rcu_irq_enter
c106ee88 T rcu_nmi_exit

Completely different now. Weird.
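
For anyone repeating the lookup above: grepping System.map for an address prefix only lists nearby symbols, and the logged ADDR values will rarely equal a symbol's start address. Below is a minimal Python sketch (not from the thread) that picks the symbol whose start address is the closest one at or below a given ADDR. The helper names load_symbols and enclosing_symbol are made up for illustration, and the sketch assumes the mcelog ADDR is a kernel virtual address that can be compared directly against System.map entries, which may not hold for every machine-check record.

```python
import bisect


def load_symbols(lines):
    """Parse System.map-style lines ("c104e4e2 T name") into a sorted list
    of (start_address, name) tuples. Hypothetical helper for illustration."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def enclosing_symbol(symbols, addr):
    """Return (name, offset) for the last symbol that starts at or before
    addr, or None if addr lies below every known symbol."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


# Three entries copied from the System.map-2.6.32-3-686 listing in the post.
SAMPLE = """\
c104e4d5 T tick_resume_broadcast_oneshot
c104e4e2 T tick_broadcast_setup_oneshot
c104e5ae T tick_broadcast_switch_to_oneshot
""".splitlines()

symbols = load_symbols(SAMPLE)

# The four ADDR values logged under the -3 kernel.
for addr in (0xC104E500, 0xC104E530, 0xC104E540, 0xC104E570):
    name, offset = enclosing_symbol(symbols, addr)
    print(f"{addr:x} -> {name}+0x{offset:x}")
```

To run it against a real map, pass an open file instead of SAMPLE, e.g. load_symbols(open("/boot/System.map-2.6.32-3-686")). With the sample entries, all four -3 addresses fall between tick_broadcast_setup_oneshot and the next symbol, which matches the manual lookup in the post. System.map covers only the kernel image, so addresses inside loadable modules would need a different source.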