RAID? Was: PATA hard disks, anyone?

Paul Koning paulkoning at comcast.net
Wed Mar 28 19:11:13 CDT 2018


It's not quite that bad.  The answer is that the MTBF of four drives is probably not simply the MTBF of one drive divided by four.  If you have a good description of the probability of failure as a function of drive age (i.e., a picture of its particular "bathtub curve") you can then work out the corresponding curve for multiple drives.  I like to leave the details of how to do this to appropriate mathematicians.

If all you have is a data sheet that says "MTBF is 1M hours" then you don't have enough information.  You can assume some distribution and figure accordingly, but if the actual distribution is sufficiently different from the guess then the answers you calculated may be significantly off.
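
As a rough sketch of the "assume some distribution and figure accordingly" approach, here is what the arithmetic looks like if you guess that drive lifetimes follow a Weibull distribution. The Weibull choice, the shape values, and the function name are my assumptions for illustration, not anything a data sheet gives you. The drives are assumed identical and independent, and the result is the mean time until the first of the drives fails, which is not the same as losing the array.

```python
import math

def mean_time_to_first_failure(single_mtbf_hours, n_drives, shape):
    """Mean time until the first of n drives fails.

    Assumes independent, identical drives with Weibull-distributed
    lifetimes (an assumption, not a data-sheet fact). shape < 1 models
    mostly early failures, shape == 1 is a constant failure rate
    (exponential), shape > 1 models mostly wear-out failures.
    """
    # Pick the Weibull scale so that one drive has the stated mean life.
    gamma_term = math.gamma(1.0 + 1.0 / shape)
    scale = single_mtbf_hours / gamma_term
    # The minimum of n independent Weibull(shape, scale) lifetimes is
    # again Weibull with the same shape and scale * n**(-1/shape).
    scale_of_minimum = scale * n_drives ** (-1.0 / shape)
    return scale_of_minimum * gamma_term

if __name__ == "__main__":
    # Same 1M hour data-sheet figure, three different guessed shapes.
    for shape in (0.5, 1.0, 2.0):
        hours = mean_time_to_first_failure(1_000_000, 4, shape)
        print(f"shape={shape}: {hours:,.0f} hours to first failure")
```

Only the constant-failure-rate case (shape 1) gives the "divide by four" answer. The other shapes give quite different numbers from the same data-sheet MTBF, which is why the single number is not enough.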

BTW, specified MTBF for modern drives is a whole lot higher than 100k hours.  Real MTBF may differ from specified, and derating the manufacturer's number according to your preferred level of pessimism is probably a good idea. 

	paul


> On Mar 28, 2018, at 9:57 PM, Richard Pope via cctalk <cctalk at classiccmp.org> wrote:
> 
> Fred,
>    I appreciate the explanation. So without 1,000, 10,000, or even 100,000 drives there is no way to know how long the drives in my RAID will last. All I know for sure is that I can lose any one drive and the RAID can be rebuilt.
> GOD Bless and Thanks,
> rich!
> 
> On 3/28/2018 4:43 PM, Fred Cisin via cctalk wrote:
>> On Wed, 28 Mar 2018, Richard Pope via cctalk wrote:
>>>   I have been kind of following this thread. I have a question about MTBF. I have four HGST UltraStar Enterprise 2TB drives set up in a hardware RAID 10 configuration. If the MTBF is 100,000 Hrs for each drive, does this mean that the total MTBF is 25,000 Hrs?
>> 
>> <pedantic sadistics>
>> Probably NOT.
>> It depends extremely heavily on the shape of the curve of failure times.
>> MEAN Time Before Failure, of course, means that for a large enough sample, half the drives fail before 100,000 hours, and half after.  Thus, at 100,000 hours, half are dead.
>> 
>> But, how evenly distributed are the failures? ...


