From: nmm1 on 6 Oct 2009 05:03

In article <7ivv9cF335jhvU1(a)mid.individual.net>,
Del Cecchi <delcecchinospamofthenorth(a)gmail.com> wrote:
>
>The Origin didn't have a service processor to handle things like power
>on and off? I am shocked and appalled.

Yes, it did. That wasn't the problem.

What I did was hammer it hard enough that the CPUs jammed solid in
the firmware, which was then no longer listening to the NMI channel.
It only happened a couple of times - normally, powering on and off
via the console worked.

Incidentally, I did something very similar to an IBM SP3, for very
different reasons. I am pretty sure that was a straight misdesign
in the controller software. That was very irritating, because I
wasn't stress-testing it at the time.

As far as I recall, I never managed it on the Hitachi SR2201 or
Sun F15K.

Regards,
Nick Maclaren.

From: Del Cecchi on 6 Oct 2009 23:05

nmm1(a)cam.ac.uk wrote:
> In article <7ivv9cF335jhvU1(a)mid.individual.net>,
> Del Cecchi <delcecchinospamofthenorth(a)gmail.com> wrote:
>> The Origin didn't have a service processor to handle things like power
>> on and off? I am shocked and appalled.
>
> Yes, it did. That wasn't the problem.
>
> What I did was hammer it hard enough that the CPUs jammed solid in
> the firmware, which was then no longer listening to the NMI channel.
> It only happened a couple of times - normally, powering on and off
> via the console worked.
>
> Incidentally, I did something very similar to an IBM SP3, for very
> different reasons. I am pretty sure that was a straight misdesign
> in the controller software. That was very irritating, because I
> wasn't stress-testing it at the time.
>
> As far as I recall, I never managed it on the Hitachi SR2201 or
> Sun F15K.
>
>
> Regards,
> Nick Maclaren.

I think the modern thing to do is have the service processor interface
to the switches and control IPL and power on and off. So it shouldn't
matter if the main processor is wedged or vaporized. The service
processor can snapshot stuff by scanning the LSSD registers and handle
the power stuff. NMI, whatever. Turn the power off, or scan the
appropriate data in (equivalent of POR), and you are good to go.

I guess someone was trying to save a few bucks or something.

del

From: nmm1 on 7 Oct 2009 03:07

In article <7j2eqjF33tntoU1(a)mid.individual.net>,
Del Cecchi <delcecchinospamofthenorth(a)gmail.com> wrote:
>
>I think the modern thing to do is have the service processor interface
>to the switches and control IPL and power on and off. So it shouldn't
>matter if the main processor is wedged or vaporized. the service
>processor can snapshot stuff by scanning the LSSD registers and handle
>the power stuff. NMI, whatever. turn the power off, or scan the
>appropriate data in (equivalent of POR) and you are good to go.

The Origin was some time ago now. It is also possible that, as with
the similar event on the SP3, I had managed to lock up the on-rack
service processor (assuming there was one).

A very common low-level hardware misdesign is for a 'message' between
two chips to hang, uninterruptibly, if the other chip is dead in the
water. That phenomenon may (or may not) have been involved in either
or both cases. As you can imagine, I didn't waste time trying to find
out exactly what had happened :-)

Regards,
Nick Maclaren.