RAID? Was: PATA hard disks, anyone?
Richard Pope
mechanic_2 at charter.net
Wed Mar 28 20:57:58 CDT 2018
Fred,
I appreciate the explanation. So without 1,000, 10,000, or even
100,000 drives, there is no way to know how long the drives in my RAID
will last. All I know for sure is that I can lose any one drive and the
RAID can be rebuilt.
GOD Bless and Thanks,
rich!
On 3/28/2018 4:43 PM, Fred Cisin via cctalk wrote:
> On Wed, 28 Mar 2018, Richard Pope via cctalk wrote:
>> I have been kind of following this thread. I have a question about
>> MTBF. I have four HGST UltraStar Enterprise 2TB drives setup in a
>> Hardware RAID 10 configuration. If the MTBF is 100,000 Hrs for
>> each drive does this mean that the total MTBF is 25,000 Hrs?
>
> <pedantic sadistics>
> Probably NOT.
> It depends extremely heavily on the shape of the curve of failure times.
> MEAN Time Before Failure, of course, means that for a large enough
> sample, half the drives fail before 100,000 hours, and half after.
> Thus, at 100,000 hours, half are dead.
>
> But, how evenly distributed are the failures?
> Besides the MTBF, it would help to know the variance or standard
> deviation.
> It is unlikely that the failures follow a "normal distribution" (or
> "Laplace-Gauss") bell curve. And, other distributions are certainly
> not ABnormal :-)
>
> If the curve is symmetrical, then the mean, median, and mode will all
> be the same. If it is not symmetrical, then they won't be. Hence the
> use of MEDIAN - at that point half are dead, half are still alive.
> In toxicology, there is a concept of an LD-50 dosage - the dosage that
> will kill half, since for example, antibiotic resistant bacteria might
> require an incredibly large dosage to get that last one, but LD-50
> provides a convenient way to get a single number.
> 100,000 hours is the LD-50 of those drives.
>
>
> If it turns out that the drives last 100,000 hours, plus or minus 10%,
> then you have a curve with a very steep slope. It is still half dead
> at 100,000, but maybe hardly any dead until 90,000, hardly any left
> alive at 110,000.
>
> OTOH, if the failures were evenly distributed throughout a life of 0
> to 200,000 hours, with the same number going every day, then that also
> would have a MTBF of 100,000. In THAT case, then yes, the MTBF of
> first failure may well be 25,000.
>
>
> They rarely work that way. Often our devices will have what is
> sometimes called a "bathtub curve". There are a few failures
> IMMEDIATELY ("infant mortality") falling off rapidly, and then few
> failures for quite a while, and then, as random parts start to wear
> out, the failures rise. In fact, with the same MTBF of 100,000, it
> could be that, once the early failures are discarded, the MTBF
> of the REMAINDER might be 200,000.
>
> IFF you are willing to deal with the DOA and infant mortality cases,
> then by discarding or ignoring those outlying numbers, you might get a
> more realistic evaluation of what to expect.
> </pedantic sadistics>
>
> --
> Grumpy Ol' Fred cisin at xenosoft.com
>
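For anyone who wants to poke at the 25,000-hour question without a
warehouse full of drives, here is a rough simulation sketch. It is not
from Fred's post. It assumes four independent drives and compares three
made-up lifetime models that all have a mean of 100,000 hours: the
"plus or minus 10%" case, the "evenly distributed from 0 to 200,000"
case, and a constant-failure-rate (exponential) case, which is my own
addition. The names mean_first_failure and MODELS are hypothetical, and
none of these models is claimed to describe real HGST drives.

```python
import random

MTBF_HOURS = 100_000   # per-drive mean lifetime from the question
DRIVES = 4             # four drives in the array

def mean_first_failure(draw_lifetime, drives=DRIVES, trials=20_000, seed=1):
    """Average time until the FIRST of `drives` independent drives fails."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0.0
    for _ in range(trials):
        # The first failure in one simulated array is the shortest lifetime.
        total += min(draw_lifetime(rng) for _ in range(drives))
    return total / trials

# Three assumed lifetime models, each with a mean of MTBF_HOURS.
MODELS = {
    "narrow (90,000 to 110,000 h)":
        lambda rng: rng.uniform(0.9 * MTBF_HOURS, 1.1 * MTBF_HOURS),
    "even (0 to 200,000 h)":
        lambda rng: rng.uniform(0, 2 * MTBF_HOURS),
    "constant failure rate":
        lambda rng: rng.expovariate(1 / MTBF_HOURS),
}

if __name__ == "__main__":
    for name, draw in MODELS.items():
        print(f"{name:30s} {mean_first_failure(draw):10,.0f} h")
```

Under these assumptions the averages should land near 94,000 hours for
the narrow case, near 40,000 for the even case, and near 25,000 for the
constant-failure-rate case. So "divide the MTBF by the number of drives"
only falls out of the last model, which is the point about the shape of
the curve. Note that this is the time to the first drive failure, not to
loss of the array, since the RAID can be rebuilt after one drive dies.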