From: Linus Torvalds on


On Thu, 10 May 2007, Pavel Machek wrote:
>
> Also notice that current cpus were not designed to work 300 years.
> When we have hw designed for 50 years+, we can start to worry.

Indeed. CPU manufacturers don't seem to talk about it very much, and
searching for it with google on intel.com comes up with

The failure rate and Mean Time Between Failure (MTBF) data is not
currently available on our website. You may contact Intel®
Customer Support for this information.

which seems to be just a fancy way of saying "we don't actually want to
talk about it". Probably not because it's actually all that bad, but
simply because people don't think about it, and there's no reason a CPU
manufacturer would *want* people to think about it.

But if you wondered why server CPUs usually run at a lower frequency,
it's because of MTBF issues. I think a desktop CPU is usually specced to
run for 5 years (and that's expecting that it's turned off or at least
idle much of the time), while a server CPU is expected to last longer and
be active a much bigger percentage of time.

("Active" == "heat" == "more damage due to atom migration etc". Which is
part of why you're not supposed to overclock stuff: it may well work well
for you, but for all you know it will cut your expected CPU life by 90%).

Of course, all the other components have an MTBF too (and I'd generally
expect a power supply component to fail long before the CPU does). And
yes, some machines will last for decades. But I suspect we've all heard of
machines that just don't work reliably any more.
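
As a rough illustration of why the weakest component dominates: assuming
each part fails independently at a constant rate (a simplification), the
failure rates add, so the MTBF of the whole box is lower than that of any
single part. The component figures below are made-up placeholders, not
vendor data, and system_mtbf is a hypothetical helper name.

```python
def system_mtbf(component_mtbfs):
    """MTBF of a series system, i.e. one that fails when any part fails.

    Assumes independent parts with constant failure rates, so the
    individual failure rates (1 / MTBF) simply add up.
    """
    mtbfs = list(component_mtbfs)
    if not mtbfs:
        raise ValueError("need at least one component")
    total_rate = sum(1.0 / m for m in mtbfs)
    return 1.0 / total_rate


# Hypothetical MTBF figures in years, for illustration only.
parts = {"cpu": 50.0, "power_supply": 8.0, "disk": 10.0, "fans": 6.0}
print(f"whole machine: {system_mtbf(parts.values()):.1f} years")
```

Under these assumptions the long-lived CPU barely moves the result; the
short-lived parts set the figure.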

Linus
From: Pavel Machek on
Hi!

> > Also notice that current cpus were not designed to work 300 years.
> > When we have hw designed for 50 years+, we can start to worry.
>
> Indeed. CPU manufacturers don't seem to talk about it very much, and
> searching for it with google on intel.com comes up with
>
> The failure rate and Mean Time Between Failure (MTBF) data is not
> currently available on our website. You may contact Intel®
> Customer Support for this information.
>
> which seems to be just a fancy way of saying "we don't actually want to
> talk about it". Probably not because it's actually all that bad, but
> simply because people don't think about it, and there's no reason a CPU
> manufacturer would *want* people to think about it.
>
> But if you wondered why server CPUs usually run at a lower frequency,
> it's because of MTBF issues. I think a desktop CPU is usually specced to
> run for 5 years (and that's expecting that it's turned off or at least
> idle much of the time), while a server CPU is expected to last longer and
> be active a much bigger percentage of time.
>
> ("Active" == "heat" == "more damage due to atom migration etc". Which is
> part of why you're not supposed to overclock stuff: it may well work well
> for you, but for all you know it will cut your expected CPU life by 90%).

Actually, when I talked with AMD, they told me that cpus should last
10 years *at their max specced temperature*... which is 95 Celsius. So
overclocking is not that evil, according to my info.

(That would mean way more than 10 years if you use your cpu
'normally'.)
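
A back-of-the-envelope sketch of that reasoning, using the Arrhenius model
that is commonly applied to temperature-driven wear-out. The activation
energy of 0.7 eV and the "normal" temperatures are assumptions picked for
illustration, not numbers from AMD, and acceleration_factor is a
hypothetical helper name.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c, t_stress_c, activation_energy_ev=0.7):
    """Arrhenius ratio of lifetime at t_use_c to lifetime at t_stress_c.

    activation_energy_ev is an assumed value, not a vendor figure.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


rated_years = 10.0  # the "10 years at 95 Celsius" figure quoted above
for t_use in (95.0, 75.0, 60.0):  # use temperatures chosen for illustration
    years = rated_years * acceleration_factor(t_use, 95.0)
    print(f"{t_use:.0f} C: about {years:.0f} years")
```

With these assumed inputs the estimate grows quickly as the temperature
drops, but the result is very sensitive to the activation energy chosen.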

But I guess capacitors from cpu power supply will hate you running cpu
at 95C...
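
To put a rough number on the capacitor worry: a common rule of thumb for
aluminium electrolytic capacitors is that life doubles for every 10 degrees
Celsius below the rated temperature. The rating of 2000 hours at 105 C is
a hypothetical part, and the temperatures are those of the capacitor
itself, not the CPU die.

```python
def capacitor_life_hours(rated_hours, rated_temp_c, actual_temp_c):
    """10-degree rule of thumb: life doubles per 10 C below rated temperature."""
    return rated_hours * 2.0 ** ((rated_temp_c - actual_temp_c) / 10.0)


HOURS_PER_YEAR = 24 * 365

# Hypothetical part: rated for 2000 hours at 105 C.
for temp_c in (95.0, 60.0, 40.0):
    hours = capacitor_life_hours(2000.0, 105.0, temp_c)
    print(f"{temp_c:.0f} C: about {hours / HOURS_PER_YEAR:.1f} years")
```

This is only a first-order estimate; ripple current and part quality
matter as well.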
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Willy Tarreau on
On Fri, May 11, 2007 at 07:18:25PM +0000, Pavel Machek wrote:
> Hi!
>
> > > Also notice that current cpus were not designed to work 300 years.
> > > When we have hw designed for 50 years+, we can start to worry.
> >
> > Indeed. CPU manufacturers don't seem to talk about it very much, and
> > searching for it with google on intel.com comes up with
> >
> > The failure rate and Mean Time Between Failure (MTBF) data is not
> > currently available on our website. You may contact Intel®
> > Customer Support for this information.
> >
> > which seems to be just a fancy way of saying "we don't actually want to
> > talk about it". Probably not because it's actually all that bad, but
> > simply because people don't think about it, and there's no reason a CPU
> > manufacturer would *want* people to think about it.
> >
> > But if you wondered why server CPUs usually run at a lower frequency,
> > it's because of MTBF issues. I think a desktop CPU is usually specced to
> > run for 5 years (and that's expecting that it's turned off or at least
> > idle much of the time), while a server CPU is expected to last longer and
> > be active a much bigger percentage of time.
> >
> > ("Active" == "heat" == "more damage due to atom migration etc". Which is
> > part of why you're not supposed to overclock stuff: it may well work well
> > for you, but for all you know it will cut your expected CPU life by 90%).
>
> Actually, when I talked with AMD, they told me that cpus should last
> 10 years *at their max specced temperature*... which is 95 Celsius. So
> overclocking is not that evil, according to my info.

I tend to believe that. I've slowed down the fans on my dual Athlon XP
to silence them, and I found the system to be (apparently) stable up to
96 deg Celsius with the fans unplugged. So I regulated them to keep the
temperature below 90 deg as a safety margin, and the CPUs seem to be as
happy as my ears. They've been like that for the last 3-4 years; I don't
remember exactly.

On a related note, older technology is less sensitive. My old VAX VLC4000
from 1991, which received a small heatsink in exchange for its noisy fans,
is still doing well in a closed place. I'm more worried about the EPROMs
which store the boot code. Also, I think that the RS6000 built on a
half-micron process, which runs the Mars rovers, might run for hundreds of
years at 30 MHz.

> (That would mean way more than 10 years if you use your cpu
> 'normally'.)
>
> But I guess capacitors from cpu power supply will hate you running cpu
> at 95C...

Even at 60, many of them die within a few years. But I think we're
"slightly" off-topic now...

> Pavel

Cheers
Willy

From: Kevin Bowling on
On 5/11/07, Willy Tarreau <w(a)1wt.eu> wrote:
> On Fri, May 11, 2007 at 07:18:25PM +0000, Pavel Machek wrote:
> > Hi!
> >
> > > > Also notice that current cpus were not designed to work 300 years.
> > > > When we have hw designed for 50 years+, we can start to worry.
>
> On a related note, older technology is less sensitive. My old VAX VLC4000
> from 1991, which received a small heatsink in exchange for its noisy fans,
> is still doing well in a closed place. I'm more worried about the EPROMs
> which store the boot code. Also, I think that the RS6000 built on a
> half-micron process, which runs the Mars rovers, might run for hundreds of
> years at 30 MHz.

Not your typical RS/6000. http://en.wikipedia.org/wiki/RAD6000

Without a doubt, decently built computers will easily last at least 10
years. From what I can tell, the issue here was simply OS runtime?