I worked in the disk industry for several years (Fujitsu, Maxtor, and on
the silicon side at Cirrus Logic), and Google's findings are not
unexpected.
One of the major disk drive manufacturers (Quantum?) discovered a problem
with their drives on Unix systems. Since Unix doesn't tend to make
spurious reads and writes while sitting idle (Windows does it a LOT), the
heads would sit in the same place for long periods of time, eventually
pushing the lubricant aside, resulting in a head crash. The manufacturer
countered this problem by running a butterfly seek every few minutes to
level out the lubricant.
In any event, spinup/spindown events are hard on the head/disk interface,
and temperature cycling causes expansion/contraction which stresses
marginal solder joints.
I would expect very long lifetimes from hard drives.
Most modern drives use fluid bearings for the spindle so, aside from
degradation of the oil, they will last forever. Major IC manufacturers
have a good handle on electromigration issues, so the ICs are unlikely
to fail.
Drive firmware has improved as well. Most drives will detect a block going
bad (high correctable error rate) and move the data to a spare block.
Unless you issue special commands to the drive, you won't even notice
that the error and relocation occurred. It is a good idea to read
everything on the drive occasionally so that bad spots can be fixed
before they corrupt the data.
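For what it's worth, here is a minimal sketch of what that occasional
full read might look like from the host side, assuming a Linux-style raw
block device and root access (the /dev/sdb path in the usage comment is
just a placeholder). dd or a SMART long self-test does the same job; the
point is only that every sector gets read so the drive's own error
monitoring has a chance to kick in.

#!/usr/bin/env python3
# Sketch: sequentially read an entire block device so the drive firmware
# gets a chance to notice marginal sectors and remap them.
# Assumes a Linux-style device node and root privileges; the device path
# is whatever you pass on the command line (placeholder, not a
# recommendation for any particular system).

import sys

CHUNK = 1024 * 1024  # read 1 MiB at a time

def scrub(device_path):
    errors = 0
    offset = 0
    with open(device_path, "rb", buffering=0) as dev:
        while True:
            try:
                data = dev.read(CHUNK)
                if not data:          # end of device
                    break
                offset += len(data)
            except OSError as exc:
                # Uncorrectable read error: log it and skip past the bad chunk.
                errors += 1
                print(f"read error near byte {offset}: {exc}", file=sys.stderr)
                offset += CHUNK
                dev.seek(offset)
    print(f"scanned {offset} bytes, {errors} unreadable chunk(s)")

if __name__ == "__main__":
    # e.g.  python3 scrub.py /dev/sdb   (placeholder device name)
    scrub(sys.argv[1])

The host never sees the relocation itself; that happens inside the
drive. The scan just gives the firmware a reason to look at every
sector before the data there is needed.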
Enough rambling,
Clint
On Wed, 21 Feb 2007, Tim Shoppa wrote:
Google has a vast herd of machines with a
large number of hard drives. It is very
fruitful that they analyze the failures and
publish the results on the web!
While the drives they are studying are definitely
not classic (all dating from 2001 or later),
those of us who host large quantities of classic
material may find the results of interest:
http://labs.google.com/papers/disk_failures.pdf
Side note: at one point I found it unbelievable
that Google was using consumer-grade hardware
to host their stuff. Since then, I've developed a
lot of respect for this approach!
Tim.