From: Paul E Condon on
On 20100410_092044, Clive McBarton wrote:
> Paul E Condon wrote:
> > dumpe2fs -b <device> is supposed to print the bad blocks that have
> > been marked on a device. When I run it, it prints nothing. I find it
> > hard to believe that a 500GB HD contains ZERO bad blocks.
>
> Every HD that is even remotely close to being usable will always have
> zero bad blocks when seen from outside the HD. All HDs have error
> recognition and error correction and automatic replacement of faulty
> sectors with spare ones. A HD will only show bad blocks after all of its
> remapping area is used, at which point it is far beyond being usable.
>
> In other words, scanning for bad blocks on a HD cannot work.

Thanks Clive. Your post has been invaluable in fixing some faulty thinking
on my part, and in provoking other useful posts. But I want more ...

The errors that I am experiencing are all similar. The first
indication of a problem is a message from the kernel (I think). An
example is:

kernel: [78454.939948] journal commit I/O error

This appears on all xterm windows on the affected machine. On the
xterm that is controlling a process that is using one of the USB
drives, there follows a long sequence of error messages about the
drive being read-only, which stops after a while. Or sometimes I stop
it by typing ^C on that xterm.
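A minimal sketch for pulling these messages out of a saved log, assuming
the lines look like the example above and that the text is available in
a file or a pasted buffer. The keyword list and the function name
find_disk_errors are illustrative choices, not taken from any tool.

```python
import re

# Matches syslog-style kernel lines such as
#   kernel: [78454.939948] journal commit I/O error
KERNEL_LINE = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s*(?P<msg>.*)")

# Substrings treated as interesting here; this list is an assumption.
KEYWORDS = ("I/O error", "read-only", "USB disconnect")


def find_disk_errors(lines):
    """Return (uptime_seconds, message) for kernel lines that mention a keyword."""
    hits = []
    for line in lines:
        match = KERNEL_LINE.search(line)
        if match is None:
            continue
        msg = match.group("msg")
        if any(key.lower() in msg.lower() for key in KEYWORDS):
            hits.append((float(match.group("uptime")), msg))
    return hits


if __name__ == "__main__":
    sample = [
        "kernel: [78454.939948] journal commit I/O error",
        # The next line is made up for the example and should not match.
        "kernel: [78455.001200] some unrelated message",
    ]
    for uptime, msg in find_disk_errors(sample):
        print(f"{uptime:.6f}s after boot: {msg}")
```

Sorting the hits by the uptime value shows which message came first,
which may help tell a failing disk from a USB bus that dropped out.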

When this happens, all the USB drives (3 of them) disappear from
/dev/disk/by-label (they are all labeled by me). I have not
discovered any way to make them re-appear, short of rebooting the
computer. After reboot, I run e2fsck on all of them, and always get a
longish delay on each while e2fsck replays (or whatever) the
journal. This can take a few seconds or up to half a minute. Then
I manually mount them using pmount, and all data up to the point
where the crash happened seems to be present.
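A small sketch for checking which labels are currently visible, assuming
the usual udev layout where each entry under /dev/disk/by-label is a
symlink to the device node. The function name labelled_devices is
hypothetical.

```python
import os


def labelled_devices(by_label_dir="/dev/disk/by-label"):
    """Map each filesystem label to the device node its symlink resolves to."""
    if not os.path.isdir(by_label_dir):
        # The directory is absent when no labelled filesystem is visible.
        return {}
    devices = {}
    for label in sorted(os.listdir(by_label_dir)):
        link = os.path.join(by_label_dir, label)
        devices[label] = os.path.realpath(link)
    return devices


if __name__ == "__main__":
    for label, node in labelled_devices().items():
        print(f"{label} -> {node}")
```

Running it before and after a failure shows which labels vanished and
which device nodes they had pointed to.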

I have installed smartmontools, but I think there is some
incompatibility between the installed version and the installed
docs. The README.Debian makes reference to editing some lines in the
config file that are not present in the default, package installed,
config file. There is (apparently) some incompatibility between using
the daemon and using smartctl. The problem host is running Lenny, but
the docs seem to be the same as on a different host that is running
Squeeze.

I would very much appreciate some help in understanding the docs.
What is a safe thing for a newbie to type as a first command to
smartctl?

--
Paul E Condon
pecondon(a)mesanetworks.net


--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/20100410222445.GC5664(a)big.lan.gnu
From: Paul E Condon on
On 20100410_162445, Paul E Condon wrote:
> [...]
> I would very much appreciate some help in understanding the docs.
> What is a safe thing for a newbie to type as a first command to
> smartctl?

I'm answering my own post in order to bring some closure on this issue.
If anyone has suggestions, please come forward. But here is where things
stand with me:

I got a little less timid and tried running smartctl even though I was
quite unsure of what to expect. It ran. Each of the three USB HDs gave
somewhat different output, but none gave output that claimed there was
a working SMART on the drive. These drives are Western Digital (WD).
The WD web site mentions SMART and also uses the words Smart Drive to
mean something else that is a proprietary marketing thing, AFAICT. I
was unable to find a list of part #s for drives that support S.M.A.R.T.

I think I should be in the market for a better class of drives, but not
this weekend. Thanks for the help.

--
Paul E Condon
pecondon(a)mesanetworks.net


--
Archive: http://lists.debian.org/20100411040606.GC10774(a)big.lan.gnu
From: Celejar on
On Sat, 10 Apr 2010 22:06:06 -0600
Paul E Condon <pecondon(a)mesanetworks.net> wrote:

....

> I got a little less timid and tried running smartctl even though I was
> quite unsure of what to expect. It ran. Each of the three USB HD gave
> somewhat different output, but none gave output that claimed there was
> a working SMART on the drive. These drives are Western Digital (WD).
> The WD web site mentions SMART and also uses the words Smart Drive to
> mean something else that is a proprietary marketing thing, AFAICT. I
> was unable to find a list of part #s for drives that support S.M.A.R.T.
>
> I think I should be in the market for a better class of drives, but not
> this weekend. Thanks for the help.

My understanding is that S.M.A.R.T. doesn't generally work over USB.
As Wikipedia puts it:

For example, few external drives connected via USB and Firewire
correctly send S.M.A.R.T. data over those interfaces.

http://en.wikipedia.org/wiki/S.M.A.R.T.#Standards_and_implementation
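A hedged sketch of what one might try anyway: smartctl's -i and -H
options only read information from the drive, and -d sat asks it to use
SCSI-to-ATA Translation, which some USB bridges pass through. Whether
these particular enclosures do is an assumption this thread does not
settle. The helper names and the /dev/sdX placeholder are hypothetical.

```python
import shutil
import subprocess
import sys


def smartctl_probes(device):
    """Build read-only smartctl queries: plain first, then with SAT pass-through."""
    return [
        ["smartctl", "-i", device],               # identity and SMART support info
        ["smartctl", "-H", device],               # overall health self-assessment
        ["smartctl", "-d", "sat", "-i", device],  # same queries via SAT pass-through
        ["smartctl", "-d", "sat", "-H", device],
    ]


def run_probes(device):
    """Run each probe and print its output; needs root and smartmontools."""
    if shutil.which("smartctl") is None:
        print("smartctl not found in PATH")
        return
    for argv in smartctl_probes(device):
        print("$", " ".join(argv))
        result = subprocess.run(argv, capture_output=True, text=True)
        print(result.stdout or result.stderr)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        run_probes(sys.argv[1])
    else:
        # Without a device argument, only show the commands.
        for argv in smartctl_probes("/dev/sdX"):
            print(" ".join(argv))
```

If the plain queries report no SMART support but the -d sat ones return
drive details, the bridge is translating and the drive itself is fine.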

Celejar
--
foffl.sourceforge.net - Feeds OFFLine, an offline RSS/Atom aggregator
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator


--
Archive: http://lists.debian.org/20100411002510.020a145a.celejar(a)gmail.com
From: Paul E Condon on
On 20100411_002510, Celejar wrote:
> On Sat, 10 Apr 2010 22:06:06 -0600
> Paul E Condon <pecondon(a)mesanetworks.net> wrote:
>
> ...
>
> My understanding is that S.M.A.R.T. doesn't generally work over USB.
> As Wikipedia puts it:
>
> For example, few external drives connected via USB and Firewire
> correctly send S.M.A.R.T. data over those interfaces.
>
> http://en.wikipedia.org/wiki/S.M.A.R.T.#Standards_and_implementation
>
> Celejar

So, the fact that my WD drives don't play well with S.M.A.R.T. doesn't
make them special, and I should not spend much, if any, time looking
for a USB solution. What other options are there for external HDs?

--
Paul E Condon
pecondon(a)mesanetworks.net


--
Archive: http://lists.debian.org/20100411044157.GD10774(a)big.lan.gnu
From: Celejar on
On Sat, 10 Apr 2010 22:41:57 -0600
Paul E Condon <pecondon(a)mesanetworks.net> wrote:

....

> So, the fact that my WD drives don't play well with S.M.A.R.T doesn't
> make them special, and I should not spend much, if any, time looking
> for a USB solution. What other options are there for external HD?

I'm sorry, I don't really know much about this stuff. I'm just
repeating what I've heard (and seen, in my very limited experience).

Celejar


--
Archive: http://lists.debian.org/20100411005504.af4f8227.celejar(a)gmail.com