On Wed, Nov 18, 2020 at 6:24 PM Bill Gunshannon via cctalk <
cctalk at classiccmp.org> wrote:
On 11/18/20 8:16 PM, Warner Losh via cctalk wrote:
On Wed, Nov 18, 2020 at 6:00 PM Paul Koning via cctalk <
cctalk at classiccmp.org> wrote:
>
>
>> On Nov 18, 2020, at 5:56 PM, Chuck Guzis via cctalk <cctalk at classiccmp.org> wrote:
>>
>> Tangential to this, I've long wondered about some things relating to
>> SSDs. Are there any solid figures on their retention period after years
>> of being unpowered?
>>
>> The reason I ask is that I've long been in the habit of simply shelving
>> an old hard drive when I upgrade or replace a system. I've got hard
>> drives that still work that hail back to the days of OS/2 1.1; some
>> larger ones go back to the 1970s.
>
> You should be able to find the answer in the drive specs.
>
> As I understand it, there are two rather different ranges of answer
> depending on whether you're looking at an enterprise class drive, which is
> optimized for high speed and large total amount of data written, vs. a
> consumer drive. The power-off retention spec is much shorter for the
> enterprise drives. I forgot the numbers; I vaguely remember it being less
> than a year.
For SSD devices based on NAND flash, the specs for retention are 90 days
for enterprise drives and 1 year for consumer drives, both at 20C. The
difference allows enterprise drives to trade retention for increased write
rate.
> If the drive has power it will do something analogous to DRAM refresh to
> keep the bits in good shape. But it seems that the HDD rule that you can
> just set a drive on the shelf for a decade (ditto with other magnetic
> media) does not necessarily carry over to SSD.
Yes. NAND is just a bunch of small capacitors that decay over time. The bit
error rate increases over time at a rate that follows the Arrhenius law.
The ECC that goes along with NAND is sized so that NAND that's almost worn
out can still retain data for {3 months/1 year}, given its expected bit
error rate right after programming at that level of wear, coupled with the
expected decay during the specified retention time.
Also note I said "at 20C." The acceleration effect can be quite pronounced
should the data center suffer some catastrophic event that leaves it
without power in a super hot environment for weeks or months. At ~70C the
acceleration factor can be as high as 30-90x, which can render enterprise
drives unreliable after a few days baking at high temperatures.
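To put rough numbers on that, here is a minimal Python sketch of the usual
Arrhenius acceleration-factor formula, AF = exp((Ea/k) * (1/T_use -
1/T_stress)), with temperatures in kelvin. The activation energies (0.6 and
0.78 eV) are assumptions picked only because they roughly reproduce the
30-90x range above; they are not taken from any particular datasheet, and
the function names are made up for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between two temperatures (Celsius).

    ea_ev is the activation energy in eV; the values used below are
    assumptions for illustration, not vendor figures.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def retention_days(spec_days, ea_ev, t_use_c, t_stress_c):
    """Scale a retention spec given at t_use_c down to t_stress_c."""
    return spec_days / acceleration_factor(ea_ev, t_use_c, t_stress_c)


if __name__ == "__main__":
    for ea in (0.6, 0.78):  # assumed activation energies, in eV
        af = acceleration_factor(ea, 20.0, 70.0)
        days = retention_days(90.0, ea, 20.0, 70.0)
        print(f"Ea={ea:.2f} eV: AF ~{af:.0f}x, 90-day spec -> ~{days:.1f} days")
```

With those assumed values, the 90-day enterprise figure shrinks to
somewhere between about one and three days at 70C, which is in line with
"a few days" above.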
Brand new NAND, on the other hand, typically has retention capabilities
measured in years or tens of years. It's the wear and tear of use that
makes it less reliable, often much less reliable. And the multi level per
cell technologies are much worse than the single level per cell. It's one
reason that the smaller number of bits per cell NAND tends to last longer
than larger bits per cell, all other things being equal. The smaller
process sizes also were less reliable since they could store fewer
electrons (sometimes as few as a dozen or two per state). 3D NAND was so
much better because it could grow vertically, allowing NAND manufacturers
to return to larger process sizes and still increase density, also giving
better endurance for a time...
An interesting write-up. Brings up a question on a slightly related
item. Do Compact Flash and SD have the same short life when not
powered? What about things like Flash Memory used to hold firmware on
other kinds of chips?
Yes. They do. It's all the same NAND. However, as I said, freshly made NAND
tends to have very long retention times, so for those use cases, the
application is fine. Unless you are doing a lot of writing to the drive
the firmware is on, you'll see good results. And a lot usually means
on the order of rewriting the drive every day. Almost all CF (larger than
around 16MB) and SD cards (larger than about 32MB) have wear leveling as
well, which periodically moves the cold OS data around to even out the P/E
cycles of the erase blocks across the device. Also, most of the time, the
firmware is in devices that are powered on, so the controller will move the
data should it decay too much, even when there's not a lot of traffic to
the drive (in fact, that's when the FTL loves to do its housekeeping).
'Writing often' for NAND in this context is usually measured in 'several
drive rewrites per day', though with QLC drives, this can be as little as
0.3 or 0.1. The drive writes per day (DWPD) is a spec sheet item these days,
and different levels of drive have differing values clustering around 0.3,
1, 3 and higher...
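As a rough illustration of what a DWPD rating means in practice, the sketch
below converts it into total data written over a rating period and into how
often the whole drive may be rewritten. The 1 TB capacity and the 5-year
period are assumptions for the example only (check the actual spec sheet),
and the function names are hypothetical.

```python
def total_writes_tb(capacity_tb, dwpd, years):
    """Total data (TB) a DWPD rating allows over the rating period."""
    return capacity_tb * dwpd * 365 * years


def days_per_full_rewrite(dwpd):
    """How many days the rating allows between full-drive rewrites."""
    return 1.0 / dwpd


if __name__ == "__main__":
    capacity_tb = 1.0  # assumed drive size
    years = 5          # assumed rating period
    for dwpd in (0.1, 0.3, 1.0, 3.0):
        tb = total_writes_tb(capacity_tb, dwpd, years)
        gap = days_per_full_rewrite(dwpd)
        print(f"DWPD {dwpd}: {tb:.1f} TB total, one full rewrite per {gap:.1f} days")
```

So a 0.3 DWPD drive is rated for roughly one full rewrite every three days,
well short of the "rewrite the drive every day" workload mentioned above.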
Warner