From: Paul E Condon on 10 Apr 2010 18:30

On 20100410_092044, Clive McBarton wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Paul E Condon wrote:
> > dumpe2fs -b <device> is supposed to print the bad blocks that have
> > been marked on a device. When I run it, it prints nothing. I find it
> > hard to believe that a 500GB HD contains ZERO bad blocks.
>
> Every HD that is even remotely close to being usable will always have
> zero bad blocks when seen from outside the HD. All HDs have error
> recognition and error correction and automatic replacement of faulty
> sectors with spare ones. A HD will only show bad blocks after all of its
> remapping area is used, at which point it is far beyond being usable.
>
> In other words, scanning for bad blocks on a HD cannot work.

Thanks Clive. Your post has been invaluable in fixing some faulty thinking
on my part, and in provoking other useful posts. But I want more ...

The errors that I am experiencing are all similar. The first
indication of a problem is a message from the kernel (I think). An
example is:

kernel: [78454.939948] journal commit I/O error

This appears on all xterm windows on the affected machine. On the
xterm that is controlling a process that is using one of the USB
drives, there follows a long sequence of error messages about the
drive being read-only, which stops after a while. Or sometimes I stop
it by typing ^C on that xterm.

When this happens, all the USB drives (3 of them) disappear from
/dev/disk/by-label (they are all labeled by me). I have not
discovered any way to make them reappear, short of rebooting the
computer. After reboot, I run e2fsck on all of them, and always get a
longish delay on each while e2fsck commits (or whatever) the
journal. This can take a few seconds or up to half a minute. Then
I manually mount them using pmount, and all data up to the point
where the crash happened seems to be present.

I have installed smartmontools, but I think there is some
incompatibility between the installed version and the installed
docs. The README.Debian makes reference to editing some lines in the
config file that are not present in the default, package-installed
config file. There is (apparently) some incompatibility between using
the daemon and using smartctl. The problem host is running Lenny, but
the docs seem to be the same as on a different host that is running
Squeeze.

I would very much appreciate some help in understanding the docs.
What is a safe thing for a newbie to type as a first command to
smartctl?

--
Paul E Condon
pecondon(a)mesanetworks.net

--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/20100410222445.GC5664(a)big.lan.gnu
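
As a sketch of an answer to the closing question: the smartctl options -i, -H and -a only read and report. They do not start self-tests or change drive settings. The helper name smart_first_look is made up for this illustration, and /dev/sdX is a placeholder for the real device name. The helper prints the commands instead of running them, so nothing touches a drive until a line is copied and run as root.

```shell
#!/bin/sh
# Hypothetical helper: print read-only smartctl commands for one device.
# It only echoes text; it never runs smartctl itself.
# /dev/sdX is a placeholder; `ls -l /dev/disk/by-label` shows which
# device node each labeled filesystem lives on.
smart_first_look() {
    dev="${1:-/dev/sdX}"
    echo "smartctl -i $dev"   # identity: model, serial number, SMART support
    echo "smartctl -H $dev"   # the drive's overall health self-assessment
    echo "smartctl -a $dev"   # all SMART information the drive reports
}

smart_first_look /dev/sdb
```

Starting with -i seems the gentlest choice, since it reports whether the drive claims SMART support at all before any attributes are requested.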
From: Paul E Condon on 11 Apr 2010 00:10

On 20100410_162445, Paul E Condon wrote:
...
> I would very much appreciate some help in understanding the docs.
> What is a safe thing for a newbie to type as a first command to
> smartctl?

I'm answering my own post in order to bring some closure on this
issue. If anyone has suggestions, please come forward. But here is
where things stand with me:

I got a little less timid and tried running smartctl even though I was
quite unsure of what to expect. It ran. Each of the three USB HDs gave
somewhat different output, but none gave output that claimed there was
a working SMART on the drive. These drives are Western Digital (WD).
The WD web site mentions SMART and also uses the words Smart Drive to
mean something else that is a proprietary marketing thing, AFAICT. I
was unable to find a list of part #s for drives that support S.M.A.R.T.

I think I should be in the market for a better class of drives, but not
this weekend. Thanks for the help.

--
Paul E Condon
pecondon(a)mesanetworks.net

Archive: http://lists.debian.org/20100411040606.GC10774(a)big.lan.gnu
From: Celejar on 11 Apr 2010 00:30

On Sat, 10 Apr 2010 22:06:06 -0600
Paul E Condon <pecondon(a)mesanetworks.net> wrote:

...

> I got a little less timid and tried running smartctl even though I was
> quite unsure of what to expect. It ran. Each of the three USB HD gave
> somewhat different output, but none gave output that claimed there was
> a working SMART on the drive. These drives are Western Digital (WD).
> The WD web site mentions SMART and also uses the words Smart Drive to
> mean something else that is a proprietary marketing thing, AFAICT. I
> was unable to find a list of part #s for drives that support S.M.A.R.T.
>
> I think I should be in the market for a better class of drives, but not
> this weekend. Thanks for the help.

My understanding is that S.M.A.R.T. doesn't generally work over USB.
As Wikipedia puts it:

    For example, few external drives connected via USB and Firewire
    correctly send S.M.A.R.T. data over those interfaces.

http://en.wikipedia.org/wiki/S.M.A.R.T.#Standards_and_implementation

Celejar
--
foffl.sourceforge.net - Feeds OFFLine, an offline RSS/Atom aggregator
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator

Archive: http://lists.debian.org/20100411002510.020a145a.celejar(a)gmail.com
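
Whether S.M.A.R.T. data gets through depends on the USB-to-ATA bridge chip inside the enclosure. One thing that may be worth trying, as a sketch only: smartctl's -d sat option asks it to use SCSI-to-ATA Translation pass-through, which some bridges support. The helper name try_sat is hypothetical and /dev/sdX is a placeholder. Nothing in this thread establishes that these particular WD enclosures, or the smartmontools version in Lenny, will cooperate.

```shell
#!/bin/sh
# Hypothetical helper: ask smartctl for drive identity via SAT pass-through.
# -i is read-only; -d sat selects the SCSI-to-ATA Translation device type.
try_sat() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: try_sat /dev/sdX" >&2
        return 2
    fi
    # smartctl usually lives in /usr/sbin, which may not be on a normal
    # user's PATH, so check before calling it.
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found in PATH" >&2
        return 1
    fi
    smartctl -d sat -i "$dev"
}

# Example (run as root, with the real device name):
#   try_sat /dev/sdX
```

If the bridge does not pass the commands through, smartctl should simply report an error and the drive is left as it was.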
From: Paul E Condon on 11 Apr 2010 00:50

On 20100411_002510, Celejar wrote:
> On Sat, 10 Apr 2010 22:06:06 -0600
> Paul E Condon <pecondon(a)mesanetworks.net> wrote:
>
> ...
>
> My understanding is that S.M.A.R.T. doesn't generally work over USB.
> As Wikipedia puts it:
>
>     For example, few external drives connected via USB and Firewire
>     correctly send S.M.A.R.T. data over those interfaces.
>
> http://en.wikipedia.org/wiki/S.M.A.R.T.#Standards_and_implementation
>
> Celejar

So, the fact that my WD drives don't play well with S.M.A.R.T. doesn't
make them special, and I should not spend much, if any, time looking
for a USB solution. What other options are there for an external HD?

--
Paul E Condon
pecondon(a)mesanetworks.net

Archive: http://lists.debian.org/20100411044157.GD10774(a)big.lan.gnu
From: Celejar on 11 Apr 2010 01:00

On Sat, 10 Apr 2010 22:41:57 -0600
Paul E Condon <pecondon(a)mesanetworks.net> wrote:

...

> So, the fact that my WD drives don't play well with S.M.A.R.T. doesn't
> make them special, and I should not spend much, if any, time looking
> for a USB solution. What other options are there for external HD?

I'm sorry, I don't really know much about this stuff. I'm just
repeating what I've heard (and seen, in my very limited experience).

Celejar
--
foffl.sourceforge.net - Feeds OFFLine, an offline RSS/Atom aggregator
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator

Archive: http://lists.debian.org/20100411005504.af4f8227.celejar(a)gmail.com