From: Glenn English on 15 Apr 2010 22:20

SMART is warning a lot about a couple week old OCZ SSD. This is my first SSD, and I don't know if this is normal or I have a defective drive. Looks broken to me, but...

It's connected to an Asus AT3CG-I mobo.

Some output from webmin -- smart's writing in the logs quite a bit, too, about uncorrectable errors and such:

smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     OCZ-VERTEX
Serial Number:    26G39UFKZSO46VR1E334
Firmware Version: 1.5
User Capacity:    32,017,047,552 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Apr 15 20:02:18 2010 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   007   000   000    Old_age   Offline  In_the_past 0
  9 Power_On_Hours          0x0000   155   001   000    Old_age   Offline      -       0
 12 Power_Cycle_Count       0x0000   012   000   000    Old_age   Offline  In_the_past 0
184 Unknown_Attribute       0x0000   001   000   000    Old_age   Offline  In_the_past 0
195 Hardware_ECC_Recovered  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
196 Reallocated_Event_Count 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
197 Current_Pending_Sector  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
198 Offline_Uncorrectable   0x0000   070   114   000    Old_age   Offline      -       1529
199 UDMA_CRC_Error_Count    0x0000   210   179   000    Old_age   Offline      -       609
200 Multi_Zone_Error_Rate   0x0000   096   078   000    Old_age   Offline      -       17
201 Soft_Read_Error_Rate    0x0000   193   151   000    Old_age   Offline      -       19
202 TA_Increase_Count       0x0000   103   105   000    Old_age   Offline      -       0
203 Run_Out_Cancel          0x0000   102   105   000    Old_age   Offline      -       0
204 Shock_Count_Write_Opern 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
205 Shock_Rate_Write_Opern  0x0000   016   039   000    Old_age   Offline      -       0
206 Flying_Height           0x0000   001   000   000    Old_age   Offline  In_the_past 0
207 Spin_High_Current       0x0000   115   001   000    Old_age   Offline      -       0
208 Spin_Buzz               0x0000   115   000   000    Old_age   Offline  In_the_past 0
209 Offline_Seek_Performnce 0x0000   099   000   000    Old_age   Offline  In_the_past 0
211 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
212 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
213 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0

Warning: device does not support Error Logging
Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged

Warning! SMART Self-Test Log Structure error: invalid SMART checksum.
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

--
Glenn English
ghe(a)slsware.com

--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/7F8176EA-AA0A-4EC2-BEA1-F0441DE2141F(a)slsware.com
From: Stan Hoeppner on 16 Apr 2010 02:20

Glenn English put forth on 4/15/2010 9:10 PM:

> 195 Hardware_ECC_Recovered  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 196 Reallocated_Event_Count 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 197 Current_Pending_Sector  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 204 Shock_Count_Write_Opern 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 211 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 212 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 213 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0

Note particularly the last three. Smartctl doesn't know what those are, but it somehow knows they are failing? With a raw error count of zero, no less?

You don't have a bad SSD Glenn. Apparently the S.M.A.R.T data structures/attributes for SSD have yet to be standardized as they have been for mechanical disks. Many of the mechanical disk S.M.A.R.T attributes don't exist for SSD and vice versa. Newer versions of the Linux smart tools may fix this problem, or you could hack up some file tables yourself to get your tools to understand that OCZ SSD's S.M.A.R.T attributes.

Your SSD is fine. I think the Linux S.M.A.R.T tools aren't up to speed yet WRT SSDs.

--
Stan

Archive: http://lists.debian.org/4BC7FFDA.8010108(a)hardwarefreak.com
From: Jochen Schulz on 16 Apr 2010 02:30

Glenn English:
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x0000   007   000   000    Old_age   Offline  In_the_past 0
>   9 Power_On_Hours          0x0000   155   001   000    Old_age   Offline      -       0
>  12 Power_Cycle_Count       0x0000   012   000   000    Old_age   Offline  In_the_past 0
> 184 Unknown_Attribute       0x0000   001   000   000    Old_age   Offline  In_the_past 0
> 195 Hardware_ECC_Recovered  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 196 Reallocated_Event_Count 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 197 Current_Pending_Sector  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0

The question is whether these values have been like this from the start. I think it is a little bit unusual for the drive to have a threshold of zero for 195-197. And it makes me suspicious that not even attributes 9 and 12 are counted properly.

> 204 Shock_Count_Write_Opern 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
> 206 Flying_Height           0x0000   001   000   000    Old_age   Offline  In_the_past 0

I think both of these values don't matter to SSDs at all. They shouldn't care about shocks and don't have a head which could "fly high".

Conclusion: While the warnings indicate a broken disk, they look a bit fishy to me. It looks like the manufacturer used standard firmware from a regular hard drive and didn't adapt it to SSDs properly.

For comparison, here are *all* values my Intel X25m reports via SMART:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0000   100   000   000    Old_age   Offline      -       0
  4 Start_Stop_Count        0x0000   100   000   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0002   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       1703
 12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always       -       509
192 Power-Off_Retract_Count 0x0002   100   100   000    Old_age   Always       -       445
232 Available_Reservd_Space 0x0003   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0002   097   097   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0000   200   200   000    Old_age   Offline      -       45681
226 Load-in_Time            0x0002   255   000   000    Old_age   Always       -       0
227 Torq-amp_Count          0x0002   000   000   000    Old_age   Always       -       0
228 Power-off_Retract_Count 0x0002   000   000   000    Old_age   Always       -       0

I think I would try finding and using the manufacturer's diagnostic tool (which might be a pain without Windows) and see what it reports. I would expect it to report no problems.

J.
--
I am no longer prepared to give you the benefit of the doubt.
[Agree] [Disagree]
<http://www.slowlydownward.com/NODATA/data_enter2.html>
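
The zero-threshold observation can be checked against the listing itself. The sketch below is only an illustration: it assumes that this smartctl version fills WHEN_FAILED by comparing the normalized columns (FAILING_NOW when VALUE <= THRESH, In_the_past when WORST <= THRESH) and never looks at RAW_VALUE. That rule is an assumption inferred from the rows Glenn posted, not taken from the smartmontools source, and the names `ROW`, `when_failed` and `check` are made up for this example.

```python
import re

# One attribute row as printed in the listing: ten whitespace-separated columns.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s*$"
)


def when_failed(value, worst, thresh):
    """Assumed rule (not verified against smartmontools source)."""
    if value <= thresh:
        return "FAILING_NOW"
    if worst <= thresh:
        return "In_the_past"
    return "-"


def check(lines):
    """Return (id, name, reported, derived) for every row that parses."""
    results = []
    for line in lines:
        m = ROW.match(line)
        if not m:
            continue  # skip headers and anything that is not an attribute row
        value, worst, thresh = (int(m.group(i)) for i in (4, 5, 6))
        results.append(
            (int(m.group(1)), m.group(2), m.group(9),
             when_failed(value, worst, thresh))
        )
    return results


# A few rows copied from Glenn's listing.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x0000 007 000 000 Old_age Offline In_the_past 0
  9 Power_On_Hours          0x0000 155 001 000 Old_age Offline - 0
195 Hardware_ECC_Recovered  0x0000 000 000 000 Old_age Offline FAILING_NOW 0
198 Offline_Uncorrectable   0x0000 070 114 000 Old_age Offline - 1529
""".splitlines()

if __name__ == "__main__":
    for attr_id, name, reported, derived in check(SAMPLE):
        status = "match" if reported == derived else "differs"
        print(attr_id, name, reported, derived, status)
```

If the assumption holds, a drive that reports 000 for both VALUE and THRESH gets flagged regardless of its raw counter, which would explain FAILING_NOW next to a raw value of 0. The same rule would also flag several rows of the Intel listing (for example 227, with 000/000/000), which are shown as "-" there, so the tool that produced that listing presumably handles a zero threshold differently.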
From: Glenn English on 16 Apr 2010 10:10

On Apr 16, 2010, at 1:30 AM, Stan and Jochen and Camaleón wrote:

> ~ "Don't worry about it"

Thank you all very much. I'll just turn off smart; badblocks says it's OK too.

FWIW, it seems to be a nice little 'drive' for a GUI-less router. It's only 30G, and the software's using way less than 10% of it. Fast, too...

--
Glenn English
ghe(a)slsware.com

Archive: http://lists.debian.org/BFA09D1F-244A-4897-B172-5B5E31B0B69E(a)slsware.com
From: Stan Hoeppner on 16 Apr 2010 20:10

Glenn English put forth on 4/16/2010 9:00 AM:
>
> On Apr 16, 2010, at 1:30 AM, Stan and Jochen and Camaleón wrote:
>
>> ~ "Don't worry about it"
>
> Thank you all very much. I'll just turn off smart; badblocks says it's OK too.
>
> FWIW, it seems to be a nice little 'drive' for a GUI-less router. It's only 30G, and the software's using way less than 10% of it. Fast, too...

That's a perfect application for a small SSD. You probably could have even got by with an 8 Gig'er and saved some cash.

--
Stan

Archive: http://lists.debian.org/4BC8FBCD.5070001(a)hardwarefreak.com