From: Cydrome Leader on
Andrew Gabriel <andrew(a)cucumber.demon.co.uk> wrote:
> In article <hv1200$hda$5(a)reader1.panix.com>,
> Cydrome Leader <presence(a)MUNGEpanix.com> writes:
>> Andrew Gabriel <andrew(a)cucumber.demon.co.uk> wrote:
>>> In article <huru7f$ing$1(a)reader1.panix.com>,
>>> Cydrome Leader <presence(a)MUNGEpanix.com> writes:
>>>>
>>>> I've seen Xeon processors (really cores) fail in Solaris before and in
>>>> real life there's nothing wrong at all with the CPU. For Intel hardware
>>>> just rebooting seems to be the fix. I suspect it's some sort of software
>>>> issue.
>>>
>>> Blimey. We go to extraordinary effort to retrieve and decode all the Intel
>>> chip telemetry (which Intel tell me no other OS has managed to do to
>>> anywhere near the same degree) to ensure you don't get any data corruption
>>> when parts of chips/busses/memory/etc detect error situations, as you'd
>>> expect from an Enterprise grade OS. Then when it happens, someone says
>>>
>>> "I suspect it's some sort of software issue."
>>>
>>> ;-)
>>
>> You work for Sun?
>
> Yes, well Oracle now, although I don't speak for them.
>
>> While I agree a machine with a nonrecoverable fault should just crash, I
>> will point out that writing software to just crash a machine over and over
>> again without any meaningful error output is in fact a software issue as
>> well.
>
> I agree. The fact that Solaris managed to record the necessary chip
> failure telemetry after a hardware failure which hit the system hard
> enough for it to be unable to dump and unable to recover even after
> a reset is quite remarkable. I don't think [m]any other OS's would

Is "necessary chip failure telemetry" data that can only be decoded by
posting to a tech group on Usenet and finding a Sun employee?

Still, it's some PC platform in this case, so I don't really expect awesome
diagnostics or failure recovery.

I still like the older RS/6000s that would log there was a power supply
fault and commit it to disk if you just pulled the plugs on the server.

That's impressive, and new stuff from Sun still can't pull that off.