Prev: Bug in current -git tree causing dbus and gnome to chew up cpu time
Next: Introduce O_CLOEXEC (take >2)
From: Linus Torvalds on 11 May 2007 13:00 On Thu, 10 May 2007, Pavel Machek wrote: > > Also notice that current cpus were not designed to work 300 years. > When we have hw designed for 50 years+, we can start to worry. Indeed. CPU manufacturers don't seem to talk about it very much, and searching for it with google on intel.com comes up with The failure rate and Mean Time Between Failure (MTBF) data is not currently available on our website. You may contact Intel� Customer Support for this information. which seems to be just a fancy way of saying "we don't actually want to talk about it". Probably not because it's actually all that bad, but simply because people don't think about it, and there's no reason a CPU manufacturer would *want* people to think about it. But if you wondered why server CPU's usually run at a lower frequency, it's because of MTBF issues. I think a desktop CPU is usually specced to run for 5 years (and that's expecting that it's turned off or at least idle much of the time), while a server CPU is expected to last longer and be active a much bigger percentage of time. ("Active" == "heat" == "more damage due to atom migration etc". Which is part of why you're not supposed to overclock stuff: it may well work well for you, but for all you know it will cut your expected CPU life by 90%). Of course, all the other components have a MTBF too (and I'd generally expect a power supply component to fail long before the CPU does). And yes, some machines will last for decades. But I suspect we've all heard of machines that just don't work reliably any more. Linus
From: Pavel Machek on 11 May 2007 15:30 Hi! > > Also notice that current cpus were not designed to work 300 years. > > When we have hw designed for 50 years+, we can start to worry. > > Indeed. CPU manufacturers don't seem to talk about it very much, and > searching for it with google on intel.com comes up with > > The failure rate and Mean Time Between Failure (MTBF) data is not > currently available on our website. You may contact Intel? > Customer Support for this information. > > which seems to be just a fancy way of saying "we don't actually want to > talk about it". Probably not because it's actually all that bad, but > simply because people don't think about it, and there's no reason a CPU > manufacturer would *want* people to think about it. > > But if you wondered why server CPU's usually run at a lower frequency, > it's because of MTBF issues. I think a desktop CPU is usually specced to > run for 5 years (and that's expecting that it's turned off or at least > idle much of the time), while a server CPU is expected to last longer and > be active a much bigger percentage of time. > > ("Active" == "heat" == "more damage due to atom migration etc". Which is > part of why you're not supposed to overclock stuff: it may well work well > for you, but for all you know it will cut your expected CPU life by 90%). Actually, when I talked with AMD, they told me that cpus should last 10 years *at their max specced temperature*... which is 95Celsius. So overclocking is not that evil, according to my info. (That would mean way more than 10 years if you use your cpu 'normally'.) But I guess capacitors from cpu power supply will hate you running cpu at 95C... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Willy Tarreau on 11 May 2007 15:40 On Fri, May 11, 2007 at 07:18:25PM +0000, Pavel Machek wrote: > Hi! > > > > Also notice that current cpus were not designed to work 300 years. > > > When we have hw designed for 50 years+, we can start to worry. > > > > Indeed. CPU manufacturers don't seem to talk about it very much, and > > searching for it with google on intel.com comes up with > > > > The failure rate and Mean Time Between Failure (MTBF) data is not > > currently available on our website. You may contact Intel? > > Customer Support for this information. > > > > which seems to be just a fancy way of saying "we don't actually want to > > talk about it". Probably not because it's actually all that bad, but > > simply because people don't think about it, and there's no reason a CPU > > manufacturer would *want* people to think about it. > > > > But if you wondered why server CPU's usually run at a lower frequency, > > it's because of MTBF issues. I think a desktop CPU is usually specced to > > run for 5 years (and that's expecting that it's turned off or at least > > idle much of the time), while a server CPU is expected to last longer and > > be active a much bigger percentage of time. > > > > ("Active" == "heat" == "more damage due to atom migration etc". Which is > > part of why you're not supposed to overclock stuff: it may well work well > > for you, but for all you know it will cut your expected CPU life by 90%). > > Actually, when I talked with AMD, they told me that cpus should last > 10 years *at their max specced temperature*... which is 95Celsius. So > overclocking is not that evil, according to my info. I tend to believe that. I've slowed down the FANs on my dual-athlon XP to silent them, and I found the system to be (apparently) stable till 96 deg Celsius with FANs unplugged. So I regulated them in order to maintain a temperature below 90 deg for a safety margin, and they seem to be as happy as my ears. They've been like that for the last 3-4 years, I don't remember. On a related note, older technology is less sensible. My old VAX VLC4000 from 1991 which received a small heatsink in exchange for its noisy fans is still doing well in an closed place. I'm more worried for the EPROMs which store the boot code. Also, I think that the RS6000 processed at half-micron which run the Mars rovers might run for hundreds of years at 30 MHz. > (That would mean way more than 10 years if you use your cpu > 'normally'.) > > But I guess capacitors from cpu power supply will hate you running cpu > at 95C... even at 60, many of them die within a few years. Bu I think we're "slightly" off-topic now... > Pavel Cheers Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Kevin Bowling on 11 May 2007 17:00
On 5/11/07, Willy Tarreau <w(a)1wt.eu> wrote: > On Fri, May 11, 2007 at 07:18:25PM +0000, Pavel Machek wrote: > > Hi! > > > > > > Also notice that current cpus were not designed to work 300 years. > > > > When we have hw designed for 50 years+, we can start to worry. > > On a related note, older technology is less sensible. My old VAX VLC4000 > from 1991 which received a small heatsink in exchange for its noisy fans > is still doing well in an closed place. I'm more worried for the EPROMs > which store the boot code. Also, I think that the RS6000 processed at > half-micron which run the Mars rovers might run for hundreds of years > at 30 MHz. Not your typical RS/6000. http://en.wikipedia.org/wiki/RAD6000 Without a doubt, decently built computers will easily last at least 10 years. From what I can tell, the issue here was simply OS runtime? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |