From: Egrama on
Hi guys,

On a T5220 and a T5240 I noticed this strange behaviour: sometimes the
machine freezes for a couple of seconds and then continues
working as if nothing happened. I noticed this because we are running
a realtime application, and 2-second delays in processing
trigger alarms.
CPU load on the machine is around 30%, and memory usage is about the same.
I would have said this is not a system-related problem, but I saw the
problem first hand when my terminal just hung and then the application
alerts came.
Has anybody experienced anything similar? I have no errors whatsoever
in the system logs.
Any idea how to investigate this without a major performance impact?

Thanks,
Emil
From: Richard B. Gilbert on
Egrama wrote:
> Hi guys,
>
> On a T5220 and a T5240 I noticed this strange behavior: sometimes the
> machine is freezing for a couple of seconds and then continues
> working as if nothing happened. I noticed this because we are running
> some realtime application and 2 seconds delays in processing
> triggers alarms.
> The machine CPU load is around 30% and also the memory.

Remember that the "CPU load" is an average over time. There is nothing
in that "30%" that precludes the CPU from being 100% busy for a few seconds.
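
To illustrate the point about averages, here is a minimal sketch in Python. The per-second utilisation samples are made up (they are not measurements from the T5220 or T5240), and `summarize` and `busy_threshold` are hypothetical names chosen for this example. The samples are picked so that the average is 30% even though the CPU is pegged for five consecutive seconds.

```python
# Made-up per-second CPU utilisation samples (percent) over 40 seconds:
# mostly 20% busy, with one burst of five seconds at 100%.
samples = [20.0] * 20 + [100.0] * 5 + [20.0] * 15


def summarize(samples, busy_threshold=95.0):
    """Return (mean utilisation, longest run of samples at or above busy_threshold)."""
    mean = sum(samples) / len(samples)
    longest = run = 0
    for s in samples:
        # Extend the current run while the CPU is saturated, otherwise reset it.
        run = run + 1 if s >= busy_threshold else 0
        longest = max(longest, run)
    return mean, longest


mean, longest = summarize(samples)
print(f"mean={mean}% longest_busy_run={longest}s")
```

The same idea applies to real data: look at short-interval samples around the time of a freeze rather than at the long-term average.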

ISTR something about "real time" priorities. I never had cause to use
them but: see
http://www.princeton.edu/~unix/Solaris/troubleshoot/schedule.html

or

Google!
From: Chris on
On Jan 22, 1:20 pm, Drazen Kacar <d...(a)fly.srk.fer.hr> wrote:
> Egrama wrote:
> >  On a T5220 and a T5240 I noticed this strange behaviour: sometimes the
> >  machine is freezing for a couple of seconds and then continues
> >  working as if nothing happened. I noticed this because we are running
> >  some realtime application and 2 seconds delays in processing
> >  triggers alarms.
> >  The machine CPU load is around 30% and also the memory.
> >  I would say this is not a system related problem, but I noticed the
> >  problem first hand when my terminal just hung and then the application
> >  alerts came.
> >  Has anybody experienced anything similar? I have no errors whatsoever
> >  in the system logs.....

I have a call open with Sun support on that very issue. I've been
through several tech support people and haven't nailed it down yet.
However, it seems clear to me that it is a network interface problem.
There was a patch posted in mid December for a bug in the e1000g
driver; the patch notes said the chipset would freeze up under certain
circumstances. That patch didn't fix my problem. What I see is that
anything using the network interface is momentarily unreachable,
including a StorageTek 2510 iSCSI array directly attached to e1000g2. I
see freeze-ups in ssh terminal connections, I see alerts from mon, which
is polling ping, http, https, drupal, etc., and I see alerts from Common
Array Manager claiming to have lost iSCSI connections. Meanwhile, the
serial connection to the console via ILOM is just dandy, and the
system seems perfectly responsive when looked at through that
connection. The load on the system is a minuscule fraction of what it
should be capable of. We haven't even ramped it up yet. It is supposed
to take over from an E250 but, at the moment, we are more comfortable
leaving a lot of our stuff on the E250, which, in principle, ought to
be 100 times slower or more.
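
One low-overhead way to tell a host-wide stall from a network-only one is to record a local heartbeat and look for gaps afterwards. The following is only a sketch: `find_stalls`, `record_heartbeats`, and the interval and tolerance values are names and numbers invented for this example, not part of any Sun tool. If the heartbeats show no gap at the moment an ssh session froze, the host itself kept running and the network path is the more likely suspect.

```python
import time


def find_stalls(timestamps, interval, tolerance):
    """Return (start, gap) pairs where two consecutive heartbeats are
    more than interval + tolerance seconds apart."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, gap))
    return stalls


def record_heartbeats(count, interval):
    """Collect `count` timestamps, sleeping `interval` seconds between them.
    time.monotonic() is used so clock adjustments do not show up as gaps."""
    stamps = []
    for _ in range(count):
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps


if __name__ == "__main__":
    # Short demo run; raise `count` to cover the period of interest.
    beats = record_heartbeats(count=20, interval=0.05)
    for start, gap in find_stalls(beats, interval=0.05, tolerance=0.5):
        print(f"stall of {gap:.2f}s after t={start:.2f}")
```

The loop sleeps almost all of the time, so it should cost very little CPU. Running it from the serial console rather than over ssh keeps the measurement independent of the network interface.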

From: Michael Laajanen on
Hi,

Egrama wrote:
> Hi guys,
>
> On a T5220 and a T5240 I noticed this strange behaviour: sometimes the
> machine is freezing for a couple of seconds and then continues
> working as if nothing happened. I noticed this because we are running
> some realtime application and 2 seconds delays in processing
> triggers alarms.
> The machine CPU load is around 30% and also the memory.
> I would say this is not a system related problem, but I noticed the
> problem first hand when my terminal just hung and then the application
> alerts came.
> Has anybody experienced anything similar? I have no errors whatsoever
> in the system logs.....
> Any idea how to investigate this without a major performance impact ?
>
> Thanks,
> Emil
I don't know much about this except from realtime apps :) but it sounds
like something is going to sleep, or like a garbage collection, although
at the OS level I don't think it's a garbage collection :) Could it be a
disk that has powered down?

/michael
From: Horst Scheuermann on
On Fri, 22 Jan 2010 14:01:30 -0800, Chris wrote:

> On Jan 22, 1:20 pm, Drazen Kacar <d...(a)fly.srk.fer.hr> wrote:
>> Egrama wrote:
>> >  On a T5220 and a T5240 I noticed this strange behaviour: sometimes the
>> >  machine is freezing for a couple of seconds and then continues
>> >  working as if nothing happened. I noticed this because we are running
>> >  some realtime application and 2 seconds delays in processing
>> >  triggers alarms.
>> >  The machine CPU load is around 30% and also the memory.
>> >  I would say this is not a system related problem, but I noticed the
>> >  problem first hand when my terminal just hung and then the application
>> >  alerts came.
>> >  Has anybody experienced anything similar? I have no errors whatsoever
>> >  in the system logs.....
>
> I have a call open to Sun support on that very issue. I've been
> through several tech support people and haven't nailed it down yet.
> However, it seems clear to me that it is a network interface problem.
> There was a patch posted in mid December for a bug in the e1000g
> driver that said the chipset would freeze up under certain
> circumstances. That patch didn't fix my problem. What I see is that
> anything using the network interface is momentarily unreachable,
> including a StorageTek 2510 iSCSI array direct attached to e1000g2. I
> see freezeups in ssh terminal connections, I see alerts from mon which
> is poking ping, http, https, drupal, etc, and I see alerts from Common
> Array Manager claiming to have lost iSCSI connections. Meanwhile
> serial connection to the console via ILOM is just dandy, and the
> system seems perfectly responsive when looked at through that
> connection. The load on the system is a minuscule fraction of what it
> should be capable of. We haven't even ramped it up yet. It is supposed
> to take over from an E250, but, at the moment, we are more comfortable
> leaving a lot of our stuff on the E250, which, in principle, ought to
> be 100 times slower or more.

We had similar problems with an X4500; patch 141445-09 seems to help.

--
11th Commandment: When you hear a bicycle bell, turn around, open your
mouth, nose and eyes wide, but under no circumstances step aside.