From: Piotr Szymański on 1 Nov 2009 08:50

Hi All,

I have two Seagate Barracuda 7200.12 1 TB (ST31000528AS) drives in a Linux software RAID-1 configuration. Today I've got a notification from smartd that one of the drives (sda) is failing:

Device: /dev/sda, ATA error count increased from 0 to 6

Some other log messages (like: "ata1.00: cmd ... Emask 0x409 (media error)", "end_request: I/O error, dev sda, sector 39072000") and the disk's SMART error log seem to confirm that the disk is dying. My problem is that I'm seeing SMART warnings about the other drive too:

smartd[5845]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 108 to 117

Below is the listing of SMART attributes for the good drive (smartctl -A /dev/sdb):

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   113   099   006    Pre-fail  Always       -       52634145
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       56
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       24
  7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail  Always       -       35530576
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3861
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       56
183 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   099   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
190 Airflow_Temperature_Cel 0x0022   067   059   045    Old_age   Always       -       33 (Lifetime Min/Max 32/41)
194 Temperature_Celsius     0x0022   033   041   000    Old_age   Always       -       33 (0 19 0 0)
195 Hardware_ECC_Recovered  0x001a   036   015   000    Old_age   Always       -       52634145
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       91955249811373
241 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       1261294398
242 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       1519044357

And here is the listing for the bad drive:

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   109   100   006    Pre-fail  Always       -       23028010
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       59
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       17
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       81078197
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3861
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       59
183 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   094   094   000    Old_age   Always       -       6
188 Unknown_Attribute       0x0032   100   096   000    Old_age   Always       -       26
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   070   062   045    Old_age   Always       -       30 (Lifetime Min/Max 29/38)
194 Temperature_Celsius     0x0022   030   040   000    Old_age   Always       -       30 (0 19 0 0)
195 Hardware_ECC_Recovered  0x001a   041   022   000    Old_age   Always       -       23028010
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       82240033787773
241 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       2371531202
242 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -       3348144171

Both have a nonzero Reallocated_Sector_Ct and Seek_Error_Rate.
I cannot run an extended SMART test on the drive as due to some firmware problem it doesn't move past 10% completion.

Do you think the other drive is failing also?

Thanks!

-- 
Peter Szymański <szyman(at)magres.net>
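Editorial sketch: the two listings above are easier to compare when the attributes most directly tied to media damage are pulled out side by side. The Python below is a minimal, hedged illustration, not a diagnostic tool. It parses rows in the `smartctl -A` layout shown in the post and reports which watched attributes have a nonzero raw value. The watchlist (IDs 5, 187, 197, 198) and the helper names `parse_attributes` and `nonzero_watchlist` are assumptions made for this example; the sample rows are copied from the post.

```python
import re

# Attribute IDs watched in this sketch. The selection is an assumption
# for illustration, not a vendor rule.
WATCHLIST = {5, 187, 197, 198}

# One table row: ID, name, flag, VALUE, WORST, THRESH, TYPE, UPDATED,
# WHEN_FAILED, then the leading integer of RAW_VALUE.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_id: fields} for every table row found in text."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[int(m.group(1))] = {
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(9)),
            }
    return attrs


def nonzero_watchlist(attrs):
    """Return {name: raw} for watched attributes with a nonzero raw value."""
    return {
        a["name"]: a["raw"]
        for attr_id, a in sorted(attrs.items())
        if attr_id in WATCHLIST and a["raw"] != 0
    }


# Rows copied from the post (sdb is the "good" drive, sda the "bad" one).
SDB = """
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       24
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

SDA = """
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       17
187 Reported_Uncorrect      0x0032   094   094   000    Old_age   Always       -       6
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for label, text in (("sdb", SDB), ("sda", SDA)):
        print(label, nonzero_watchlist(parse_attributes(text)))
```

On the posted data, both drives show reallocated sectors (24 on sdb, 17 on sda), while only sda reports uncorrectable errors (6), which matches the ATA error count smartd flagged.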
From: philo on 1 Nov 2009 11:11

Piotr Szymański wrote:
> Hi All,
>
> I have two Seagate Barracuda 7200.12 1 TB (ST31000528AS) drives in a
> Linux software RAID-1 configuration. Today I've got a notification from
> smartd that one of the drives (sda) is failing:
>
> [...]
>
> Do you think the other drive is failing also?
>
> Thanks!

Replace the drive at once! Do not fool with it any more.

Just because one drive is bad, it does not necessarily mean the other one is bad too.
From: root on 1 Nov 2009 12:24

Piotr Szymański <szyman(a)REMOVETHISmagres.net> wrote:
> Hi All,
>
> I have two Seagate Barracuda 7200.12 1 TB (ST31000528AS) drives in a
> Linux software RAID-1 configuration. Today I've got a notification from
> smartd that one of the drives (sda) is failing:
>

I had two of the 1 TB drives fail within a week of purchase. Send them back to Seagate for replacement. When you call Seagate they will warn you that they may reject your drive if you don't pack it correctly. I simply packed the first drive in the original box and returned it. They took it and returned the drive in a big box with lots of foam around the drive. I returned the second drive in the box they sent. Since then I have had no problems with the replacement drives. There is something rotten about the first 1 TB drives.

PS: if you opt for them to send you a drive before they get your drive, you will get hit with a $25 shipping charge. The UPS shipping for one drive is about $9.
From: philo on 1 Nov 2009 13:57

root wrote:
> I had two of the 1Tb drives fail within a week of purchase.
> Send them back to Seagate for replacement.
>
> [...]

It may be hard to get warranty service on a drive that has not yet failed, unless there's a known manufacturing defect, but it's worth checking into.
From: Joe on 1 Nov 2009 14:31

On 2009-11-01, philo <philo(a)privacy.invalid> wrote:
> It may be hard to warranty a drive that has not yet failed...
> unless there's a known manufacturing defect...
> but worth checking into

Not at all. Warranty covers SMART failures on every drive I've dealt with...

-- 
Joe - Linux User #449481/Ubuntu User #19733
joe at hits - buffalo dot com
"Hate is baggage, life is too short to go around pissed off all the time..." - Danny, American History X
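Editorial sketch: whether a drive counts as a "SMART failure" usually comes down to an attribute's normalized VALUE (or WORST) reaching its THRESH, which smartctl reports in the WHEN_FAILED column. The Python below is a hedged illustration of that comparison using rows copied from the sda listing earlier in the thread. The function names `attribute_state` and `tripped` are hypothetical, and treating a threshold of 0 as "no threshold" is an assumption about how such attributes are normally interpreted, not something stated in the thread.

```python
def attribute_state(value, worst, thresh):
    """Classify one attribute from its normalized VALUE, WORST and THRESH.

    Assumption for this sketch: a threshold of 0 means the attribute
    cannot trip, so it is reported as "no_threshold".
    """
    if thresh == 0:
        return "no_threshold"
    if value <= thresh:
        return "FAILING_NOW"
    if worst <= thresh:
        return "In_the_past"
    return "ok"


# (name, VALUE, WORST, THRESH) copied from the sda listing in the post.
SDA_ROWS = [
    ("Raw_Read_Error_Rate", 109, 100, 6),
    ("Spin_Up_Time", 95, 95, 0),
    ("Reallocated_Sector_Ct", 100, 100, 36),
    ("Seek_Error_Rate", 78, 60, 30),
    ("Spin_Retry_Count", 100, 100, 97),
    ("Reported_Uncorrect", 94, 94, 0),
]


def tripped(rows):
    """Return names of attributes that are failing now or failed in the past."""
    return [
        name
        for name, value, worst, thresh in rows
        if attribute_state(value, worst, thresh) in ("FAILING_NOW", "In_the_past")
    ]


if __name__ == "__main__":
    for name, value, worst, thresh in SDA_ROWS:
        print(f"{name:24s} {attribute_state(value, worst, thresh)}")
    print("tripped:", tripped(SDA_ROWS))
```

None of the sda rows listed here has reached its threshold, which is consistent with the WHEN_FAILED column showing "-" throughout the posted output. The evidence against sda is the ATA error log and the kernel I/O errors rather than a tripped attribute.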