Hard-drive diagnostic program

6 Sep 2012

On Wed, 5 Sep 2012, Keith Monahan wrote:
...
  On 9/5/2012 2:51 AM, Tothwolf wrote:
  On Tue, 4 Sep 2012, Hollandia at ccountry.net wrote:

  Will someone name a program that will do "checkup" on a hard drive,
  that could warn of an impending failure?

 Thanks,

 Kurt  
 The first person who comes up with a way to reliably predict drive
 failure would become an overnight billionaire. 
 I've just recently had SMART catch an impending failure. 
...
 Drives have what are called "pre-failure" attributes, and if the values of 
 those attributes exceed the threshold, then the drive is considered to be 
 failing. The drive manufacturer will (generally) honor the warranty if THEIR 
 SMART utilities confirm the failure.

 A 1TB Seagate (7200.11) failed with a Reallocated Sector Count of 4153 (!!). 
 It was also indicating some Offline Uncorrectables. Seagate's utility offered 
 up a defect verification code (or whatever it is called) and off to Seagate 
 it went. They replaced it, although their RMA process was SLOW SLOW SLOW.

 I managed to copy all the data off successfully, but it started making some 
 physical noises during the copy, further confirming (to me, anyway) that it 
 was on its way out.

 SMART isn't perfect, and is definitely not a replacement for good backups, 
 but it's better than nothing. 
SMART is better than nothing, although it too isn't all that reliable a 
metric. I've had drives up and die without so much as triggering a SMART 
warning, and I have drives with a great many reallocated sectors still 
plugging away, some now with well over 100,000 power-on hours (with the 
data they contain backed up, of course).

On the other hand, far too many people think SMART is a _reliable_ way to 
predict drive failure and try to depend on it to "warn" them /before/ 
their drive fails. This /eventually/ leads to disastrous results since 
they usually don't bother to back up their files. Of course these tend to 
be the same types of people who actually think RAID is a valid alternative 
to keeping backups... To wit: multiple drive failure.

One of the key things I've discovered with higher quality drives is that 
the lower the number of head retractions (and spin down cycles) the longer 
the drive seems to last. I initially discovered this purely by accident, 
and this is something you /never/ want to see:

=== START OF READ SMART DATA SECTION ===
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

[snip]

  9 Power_On_Hours          0x0012   053   053   000    Old_age   Always       -       20827
193 Load_Cycle_Count        0x0012   001   001   000    Old_age   Always       -       2131639

~2.1 million head load/unload cycles...

2131639 load/unload cycles / 20827 power on hours = 102.35 cycles per hour
102.35 cycles per hour / 60 minutes = 1.71 cycles per minute
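
For anyone who wants to run the same arithmetic on their own drive, here is a rough Python sketch. It assumes the attribute rows have the ten-column layout shown above (raw value in the last column) and that the text has been saved from the SMART attribute listing. The function names are made up for illustration, and the sample rows are just the two from this message.

```python
# Rough sketch: compute head load/unload cycles per hour from a SMART
# attribute table. Assumes each data row has the ten columns shown in
# the listing above, with the raw value in the tenth column.

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0012   053   053   000    Old_age   Always       -       20827
193 Load_Cycle_Count        0x0012   001   001   000    Old_age   Always       -       2131639
"""


def raw_values(text):
    """Return {attribute_name: raw_value} for rows that start with a numeric ID."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything without a full set of columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            values[fields[1]] = int(fields[9])
        except ValueError:
            # Some drives report raw values in other formats; ignore those rows.
            continue
    return values


def load_cycle_rate(text):
    """Return (cycles per power-on hour, cycles per minute)."""
    values = raw_values(text)
    per_hour = values["Load_Cycle_Count"] / values["Power_On_Hours"]
    return per_hour, per_hour / 60


per_hour, per_minute = load_cycle_rate(SAMPLE)
print(f"{per_hour:.2f} cycles per hour, {per_minute:.2f} cycles per minute")
```

With the two rows above this gives the same 102.35 per hour and 1.71 per minute. Drives that report power-on time in some unit other than hours would need the divisor adjusted.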

More background: http://tothwolf.livejournal.com/35252.html

Ultimately, Linux itself wasn't the cause; the hard drive just defaulted 
to a very, very dumb power management mode. The default power management 
mode might not have been as bad with a FAT32 or VFAT filesystem, but 
filesystems such as ext2/ext3 constantly want to update atime, so with my 
drive it turned out the heads would retract/reload roughly 1.71 times per 
minute.
