From: Glenn English on
SMART is warning a lot about a couple-week-old OCZ SSD. This is my first SSD, and I don't know whether this is normal or I have a defective drive. Looks broken to me, but...

It's connected to an Asus AT3CG-I mobo.

Some output from webmin -- smart's writing in the logs quite a bit, too, about uncorrectable errors and such:

smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: OCZ-VERTEX
Serial Number: 26G39UFKZSO46VR1E334
Firmware Version: 1.5
User Capacity: 32,017,047,552 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Apr 15 20:02:18 2010 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   007   000   000    Old_age   Offline  In_the_past 0
  9 Power_On_Hours          0x0000   155   001   000    Old_age   Offline  -           0
 12 Power_Cycle_Count       0x0000   012   000   000    Old_age   Offline  In_the_past 0
184 Unknown_Attribute       0x0000   001   000   000    Old_age   Offline  In_the_past 0
195 Hardware_ECC_Recovered  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
196 Reallocated_Event_Count 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
197 Current_Pending_Sector  0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
198 Offline_Uncorrectable   0x0000   070   114   000    Old_age   Offline  -           1529
199 UDMA_CRC_Error_Count    0x0000   210   179   000    Old_age   Offline  -           609
200 Multi_Zone_Error_Rate   0x0000   096   078   000    Old_age   Offline  -           17
201 Soft_Read_Error_Rate    0x0000   193   151   000    Old_age   Offline  -           19
202 TA_Increase_Count       0x0000   103   105   000    Old_age   Offline  -           0
203 Run_Out_Cancel          0x0000   102   105   000    Old_age   Offline  -           0
204 Shock_Count_Write_Opern 0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
205 Shock_Rate_Write_Opern  0x0000   016   039   000    Old_age   Offline  -           0
206 Flying_Height           0x0000   001   000   000    Old_age   Offline  In_the_past 0
207 Spin_High_Current       0x0000   115   001   000    Old_age   Offline  -           0
208 Spin_Buzz               0x0000   115   000   000    Old_age   Offline  In_the_past 0
209 Offline_Seek_Performnce 0x0000   099   000   000    Old_age   Offline  In_the_past 0
211 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
212 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0
213 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline  FAILING_NOW 0

Warning: device does not support Error Logging
Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged

Warning! SMART Self-Test Log Structure error: invalid SMART checksum.
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

--
Glenn English
ghe(a)slsware.com




--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/7F8176EA-AA0A-4EC2-BEA1-F0441DE2141F(a)slsware.com
From: Stan Hoeppner on
Glenn English put forth on 4/15/2010 9:10 PM:
> 195 Hardware_ECC_Recovered 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 196 Reallocated_Event_Count 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 197 Current_Pending_Sector 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 204 Shock_Count_Write_Opern 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 211 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 212 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 213 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0

Note particularly the last three. Smartctl doesn't know what those are, but
it somehow knows they are failing? With a raw error count of zero, no less?

You don't have a bad SSD, Glenn. Apparently the S.M.A.R.T data
structures/attributes for SSDs have yet to be standardized as they have been
for mechanical disks. Many of the mechanical disk S.M.A.R.T attributes
don't exist for SSDs and vice versa.

Newer versions of the Linux smart tools may fix this problem, or you could
hack up some file tables yourself to get your tools to understand that OCZ
SSD's S.M.A.R.T attributes.

Your SSD is fine. I think the Linux S.M.A.R.T tools aren't up to speed yet
WRT SSDs.
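
A quick way to see why these warnings look like a reporting artifact: every
FAILING_NOW row has VALUE, THRESH and RAW_VALUE all at zero. The Python sketch
below checks that pattern against rows pasted from the listing. It assumes
smartctl flags an attribute when its normalized VALUE is at or below THRESH;
the helper names parse_row and is_degenerate_failure are made up for
illustration and are not part of smartmontools.

```python
# Sketch only: triage rows copied from the smartctl attribute listing.
ROWS = """\
195 Hardware_ECC_Recovered 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
196 Reallocated_Event_Count 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
197 Current_Pending_Sector 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
198 Offline_Uncorrectable 0x0000 070 114 000 Old_age Offline - 1529
204 Shock_Count_Write_Opern 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
211 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
212 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
213 Unknown_Attribute 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
"""


def parse_row(line):
    """Split one whitespace-separated attribute row into named fields."""
    f = line.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "when_failed": f[8],
        "raw": int(f[9]),
    }


def is_degenerate_failure(attr):
    """True when FAILING_NOW coincides with VALUE, THRESH and RAW all zero.

    Assumption: the flag comes from VALUE <= THRESH, so 0 <= 0 trips it
    even though nothing was actually counted.
    """
    return (
        attr["when_failed"] == "FAILING_NOW"
        and attr["value"] == 0
        and attr["thresh"] == 0
        and attr["raw"] == 0
    )


attrs = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
suspect_ids = [a["id"] for a in attrs if is_degenerate_failure(a)]
print(suspect_ids)  # [195, 196, 197, 204, 211, 212, 213]
```

Attribute 198 is left out because it has a nonzero raw count and is not
flagged, so it would need a separate look.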

--
Stan


--
Archive: http://lists.debian.org/4BC7FFDA.8010108(a)hardwarefreak.com
From: Jochen Schulz on
Glenn English:
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate 0x0000 007 000 000 Old_age Offline In_the_past 0
> 9 Power_On_Hours 0x0000 155 001 000 Old_age Offline - 0
> 12 Power_Cycle_Count 0x0000 012 000 000 Old_age Offline In_the_past 0
> 184 Unknown_Attribute 0x0000 001 000 000 Old_age Offline In_the_past 0
> 195 Hardware_ECC_Recovered 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 196 Reallocated_Event_Count 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 197 Current_Pending_Sector 0x0000 000 000 000 Old_age Offline FAILING_NOW 0

The question is whether these values have been like this from the start.
I think it is a little bit unusual for the drive to have a threshold of
zero for 195-197. And it makes me suspicious that not even attributes 9
and 12 are counted properly.
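
The point about attributes 9 and 12 can be checked mechanically. The sketch
below is only an illustration in Python: it compares the raw counters of the
OCZ rows quoted above with the corresponding rows of the Intel listing that
appears in this same message. The function names raw_values and uncounted are
invented for this example.

```python
# Sketch only: do the basic usage counters hold any data at all?
OCZ = """\
9 Power_On_Hours 0x0000 155 001 000 Old_age Offline - 0
12 Power_Cycle_Count 0x0000 012 000 000 Old_age Offline In_the_past 0
"""

INTEL = """\
9 Power_On_Hours 0x0002 100 100 000 Old_age Always - 1703
12 Power_Cycle_Count 0x0002 100 100 000 Old_age Always - 509
"""


def raw_values(table):
    """Map attribute ID -> RAW_VALUE for rows in the smartctl column layout."""
    out = {}
    for line in table.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit():
            out[int(fields[0])] = int(fields[9])
    return out


def uncounted(table, ids=(9, 12)):
    """Return the IDs whose raw counter is zero (or missing from the table)."""
    raws = raw_values(table)
    return [i for i in ids if raws.get(i, 0) == 0]


print(uncounted(OCZ))    # [9, 12]
print(uncounted(INTEL))  # []
```

A drive that has been powered on for a couple of weeks should not report zero
hours and zero power cycles, which is what makes the OCZ counters suspect.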

> 204 Shock_Count_Write_Opern 0x0000 000 000 000 Old_age Offline FAILING_NOW 0
> 206 Flying_Height 0x0000 001 000 000 Old_age Offline In_the_past 0

I don't think either of these values matters to SSDs at all. SSDs shouldn't
care about shocks and don't have a head which could "fly high".
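
To separate such rows from the rest, one could filter on the attribute names.
The Python sketch below does that for names copied from the listing. The
keyword list MECHANICAL_HINTS is a hypothetical heuristic chosen for this
example, not anything defined by smartmontools. Since the report says the
drive is not in smartctl's database, the names are presumably smartctl's
generic labels for those IDs rather than what OCZ means by them.

```python
# Sketch only: pick out attributes whose names describe moving parts.
# Hypothetical keyword list, chosen by hand for this example.
MECHANICAL_HINTS = ("Shock", "Flying", "Spin", "Seek")

# Names as printed in the smartctl listing for the OCZ drive.
NAMES = {
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
    204: "Shock_Count_Write_Opern",
    205: "Shock_Rate_Write_Opern",
    206: "Flying_Height",
    207: "Spin_High_Current",
    208: "Spin_Buzz",
    209: "Offline_Seek_Performnce",
}


def is_mechanical(name):
    """True if the attribute name contains one of the moving-part keywords."""
    return any(hint in name for hint in MECHANICAL_HINTS)


mechanical_ids = sorted(i for i, name in NAMES.items() if is_mechanical(name))
print(mechanical_ids)  # [204, 205, 206, 207, 208, 209]
```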

Conclusion: While the warnings indicate a broken disk, they look a bit
fishy to me. It looks like the manufacturer used standard firmware from a
regular hard drive and didn't adapt it to SSDs properly. For comparison,
here are *all* values my Intel X25m reports via SMART:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0000   100   000   000    Old_age   Offline  -           0
  4 Start_Stop_Count        0x0000   100   000   000    Old_age   Offline  -           0
  5 Reallocated_Sector_Ct   0x0002   100   100   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always   -           1703
 12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always   -           509
192 Power-Off_Retract_Count 0x0002   100   100   000    Old_age   Always   -           445
232 Available_Reservd_Space 0x0003   100   100   010    Pre-fail  Always   -           0
233 Media_Wearout_Indicator 0x0002   097   097   000    Old_age   Always   -           0
225 Load_Cycle_Count        0x0000   200   200   000    Old_age   Offline  -           45681
226 Load-in_Time            0x0002   255   000   000    Old_age   Always   -           0
227 Torq-amp_Count          0x0002   000   000   000    Old_age   Always   -           0
228 Power-off_Retract_Count 0x0002   000   000   000    Old_age   Always   -           0

I think I would try finding and using the manufacturer's diagnostic tool
(which might be a pain without Windows) and see what it reports. I would
expect it to report no problems.


J.
--
I am no longer prepared to give you the benefit of the doubt.
[Agree] [Disagree]
<http://www.slowlydownward.com/NODATA/data_enter2.html>
From: Glenn English on

On Apr 16, 2010, at 1:30 AM, Stan and Jochen and Camaleón wrote:

> ~ "Don't worry about it"

Thank you all very much. I'll just turn off smart; badblocks says it's OK too.

FWIW, it seems to be a nice little 'drive' for a GUI-less router. It's only 30G, and the software's using way less than 10% of it. Fast, too...

--
Glenn English
ghe(a)slsware.com




--
Archive: http://lists.debian.org/BFA09D1F-244A-4897-B172-5B5E31B0B69E(a)slsware.com
From: Stan Hoeppner on
Glenn English put forth on 4/16/2010 9:00 AM:
>
> On Apr 16, 2010, at 1:30 AM, Stan and Jochen and Camaleón wrote:
>
>> ~ "Don't worry about it"
>
> Thank you all very much. I'll just turn off smart; badblocks says it's OK too.
>
> FWIW, it seems to be a nice little 'drive' for a GUI-less router. It's only 30G, and the software's using way less than 10% of it. Fast, too...

That's a perfect application for a small SSD. You probably could have even
got by with an 8 Gig'er and saved some cash.

--
Stan


--
Archive: http://lists.debian.org/4BC8FBCD.5070001(a)hardwarefreak.com