From: Merciadri Luca on
Hi,

When a computer stays turned on for a long time, some problems
could arise. I have the following questions:

1. What typically makes a computer (running Linux) go down (apart from
power problems)?
2. What are Debian/kernel's adaptations to prevent such problems from
arising?

Thanks.

--
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.




From: Bob Proulx on
Merciadri Luca wrote:
> When a computer stays turned on for a long time, some problems
> could arise. I have the following questions:
>
> 1. What typically makes a computer (running Linux) go down (apart from
> power problems)?

One possibility is soft memory errors in the RAM. Using ECC RAM
reduces the likelihood to a vanishingly small probability and has long
been the normal hardware for high quality systems. But cheaper
commodity desktop hardware designed to run a well known commercial OS
these days uses cheaper non-ECC RAM, since it doesn't make sense to be
more reliable than the target OS. Linux tends to expose these systems
since it is very reliable in and of itself.
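
On machines that do have ECC RAM and a kernel EDAC driver loaded, the
corrected and uncorrected error counters are usually visible under
/sys/devices/system/edac/mc. A minimal Python sketch of reading them,
assuming that sysfs layout; the function name and the `root` parameter are
made up for illustration:

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters.

    Returns an empty dict when the directory is missing, for example when
    no EDAC driver is loaded or the machine has non-ECC RAM.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts
```

A corrected count that keeps growing on one controller is a hint that a
memory module is on its way out, even though nothing has crashed yet.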

Another possibility is a kernel bug, either in the main kernel core or
in a device driver. Device drivers have a higher incidence of bugs,
especially for niche, low-use hardware. Popular devices are
better tested than hardware that only three people in the world use.
But kernel bugs in the public code are rare. When found, those are
fixed pretty quickly.

A third, more common case is when closed source proprietary drivers are
loaded into the kernel, such as a vendor's graphics driver. Because
those are closed source there isn't any public review. Bugs there are
frustrating to all since they can't be debugged by anyone but the
vendor and the vendor doesn't usually have access to your system nor
motivation to debug your system. For high reliability you should
avoid loading any closed source proprietary driver, unless you
yourself are the author of it.
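
The kernel records whether such a module has been loaded in its taint
flags, readable from /proc/sys/kernel/tainted. A small Python sketch that
decodes a few of those bits; only a subset of the flags is listed, and the
function names are illustrative:

```python
# Subset of the kernel taint bits (bit number -> meaning).
TAINT_FLAGS = {
    0: "proprietary module was loaded",
    1: "module was force-loaded",
    12: "out-of-tree module was loaded",
    13: "unsigned module was loaded",
}


def decode_taint(value):
    """Return descriptions of the known taint bits set in `value`."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items())
            if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value as an integer."""
    with open(path) as fh:
        return int(fh.read().strip())
```

A value of 0 means the kernel is untainted. On a real system,
`decode_taint(read_taint())` lists which of the flags above are set.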

> 2. What are Debian/kernel's adaptations to prevent such problems from
> arising?

Debian adds few patches over and above the standard stock Linux
kernel. Basically in this area (as far as I know) you are getting the
stock upstream Linux kernel capability.

Bob

From: Merciadri Luca on
Bob Proulx wrote:
> Merciadri Luca wrote:
>
>> When a computer stays turned on for a long time, some problems
>> could arise. I have the following questions:
>>
>> 1. What typically makes a computer (running Linux) go down (apart from
>> power problems)?
>>
>
> One possibility is soft memory errors in the RAM. Using ECC RAM
> reduces the likelihood to a vanishingly small probability and has long
> been the normal hardware for high quality systems. But cheaper
> commodity desktop hardware designed to run a well known commercial OS
> these days uses cheaper non-ECC RAM, since it doesn't make sense to be
> more reliable than the target OS. Linux tends to expose these systems
> since it is very reliable in and of itself.
>
Thanks.
> Another possibility is a kernel bug, either in the main kernel core or
> in a device driver. Device drivers have a higher incidence of bugs,
> especially for niche, low-use hardware. Popular devices are
> better tested than hardware that only three people in the world use.
> But kernel bugs in the public code are rare. When found, those are
> fixed pretty quickly.
>
Okay.
> A third, more common case is when closed source proprietary drivers are
> loaded into the kernel, such as a vendor's graphics driver. Because
> those are closed source there isn't any public review. Bugs there are
> frustrating to all since they can't be debugged by anyone but the
> vendor and the vendor doesn't usually have access to your system nor
> motivation to debug your system. For high reliability you should
> avoid loading any closed source proprietary driver, unless you
> yourself are the author of it.
>
Sure.
>> 2. What are Debian/kernel's adaptations to prevent such problems from
>> arising?
>>
> Debian adds few patches over and above the standard stock Linux
> kernel. Basically in this area (as far as I know) you are getting the
> stock upstream Linux kernel capability.
>
Okay. Well, on the computer which `should never fail,' I'm using Silicon
Labs' drivers for a Davis Vantage Pro 2 Wireless station (which is
connected to the computer by USB). I'm not sure whether they are the CP210x
drivers (because this is not really a UART, at least to me), but they seem
to be well integrated into the kernel. I should not worry like
this, but for a computer which needs to be turned on 24/7,
it's an important thing because, for meteorological data capture, we
depend entirely on the computer's `good-will.'
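
One way to check which driver is actually in use is to look for the module
name in /proc/modules. A rough Python sketch, assuming the driver is built
as a loadable module (a driver compiled into the kernel will not show up
there) and that the in-kernel Silicon Labs driver is the one named `cp210x`;
the helper names are invented for the example:

```python
def loaded_modules(path="/proc/modules"):
    """Return the set of loaded kernel module names.

    The module name is the first whitespace-separated field on each line.
    """
    with open(path) as fh:
        return {line.split()[0] for line in fh if line.strip()}


def driver_loaded(name, path="/proc/modules"):
    """Report whether the module `name` appears in the module list."""
    # /proc/modules shows underscores even when the module file name has dashes.
    return name.replace("-", "_") in loaded_modules(path)
```

With the station plugged in, `driver_loaded("cp210x")` returning True would
suggest the stock kernel driver has claimed the device.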

--
Merciadri Luca

From: Bob Proulx on
Merciadri Luca wrote:
> to be well integrated into the kernel. I should not worry like
> this, but for a computer which needs to be turned on 24/7,
> it's an important thing because, for meteorological data capture, we
> depend entirely on the computer's `good-will.'

I have personally seen Linux based systems run for several years
without a reboot. Of course that isn't recommended, since security
vulnerabilities are usually patched at least a few times a year and
security upgrades should be installed and the machine rebooted to
activate them. But just the same, the old "uptime wars" have been around
for a while and machines running for years without a reboot are not that
unusual.

If uptime is critical, critical, critical then you would need to plan
for it and use more redundancy. But for most applications it is good
enough to plan scheduled reboots for installations of security
upgrades. It is all your choice.
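
As a rough aid for planning those reboots, a script can report the current
uptime and whether an upgrade has asked for a reboot. A minimal Python
sketch, assuming /proc/uptime is available and that installed packages
create the flag file /var/run/reboot-required after an upgrade (not every
system or package does); the function names are illustrative:

```python
import os


def uptime_days(path="/proc/uptime"):
    """Return the system uptime in days.

    The first field of /proc/uptime is the uptime in seconds.
    """
    with open(path) as fh:
        seconds = float(fh.read().split()[0])
    return seconds / 86400.0


def reboot_pending(flag="/var/run/reboot-required"):
    """Report whether the reboot-required flag file exists."""
    return os.path.exists(flag)
```

Run from cron, something like this could send a mail when
`reboot_pending()` is true so the reboot can be scheduled for a quiet
moment.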

Bob

From: Alan Chandler on
On 05/07/10 22:22, Merciadri Luca wrote:
> Hi,
>
> When a computer stays turned on for a long time, some problems
> could arise. I have the following questions:
>
> 1. What typically makes a computer (running Linux) go down (apart from
> power problems)?
> 2. What are Debian/kernel's adaptations to prevent such problems from
> arising?
>
> Thanks.
>


I have had a Debian based server at home running 24/7 for about 7 or 8
years. Living near London, we seem to have a reasonably stable
electricity supply (I don't do anything special) and I have had uptimes
of nearly a year, with the only downtime in the year being when my wife
made me turn it off as we went on holiday.

Of course upgrades to the kernel have required bringing it down, and
more recently I have had several upgrades of hardware to increase disk
space and to increase performance with a faster processor so I could
build in mythtv.

From November 2009 to 10 days ago I went through a phase of using a
Sheeva plug computer as the server (in order to use less electricity),
but I am sad to say its (hardware) reliability wasn't good and I have
now abandoned that. Since I put the old server back, it has been running
24/7 for the last 10 days.

It runs exim, apache and mythtv as the three key applications,
supported by both mysql and postgres, but it is also a git server, dns
server (along with dhcp and tftp, using dnsmasq), time server and
firewall/router/internet gateway, and it automatically handles most of
the backups via cron (when it is master of the process) or via rsyncd
(when it is the slave). I have never seen a memory leak or anything else
that has required attention.


--
Alan Chandler
http://www.chandlerfamily.org.uk


Archive: http://lists.debian.org/4C32D150.70601(a)chandlerfamily.org.uk