The incredible shrinking NAND: I'm MEELLLLTING
Flash array bubbles burst
Blocks and Files NAND is heading to the graveyard, getting closer and closer with every geometry shrink and every added cell bit. Any replacement NV-RAM technology will require controller software rip-and-replace, which could kill one-trick-pony flash array startups.
NAND flash is non-volatile but expensive to make, and both ways of making it more affordable are leading into technology dead-ends. The first way is to shrink the process geometry and so get more flash dies from a wafer, lowering the cost per die. But process shrinks cut the flash's working life, measured in program/erase (PE) cycles: the number of times a flash cell can be written to.
It's generally thought that a 59-50nm (5Xnm) process size gives us 10,000 PE cycles for 2-bit multi-level cell (MLC) flash. A 3Xnm (39-30nm) process gives us 5,000 and a 2Xnm cell gives us 3,000. A 1Xnm cell will provide a derisory endurance rating unless the NAND product is greatly over-provisioned with extra cells ready to use when the first ones wear out.
NAND controller software can work to overcome this trend, and has already done so – using techniques to reduce the number of writes, get better at reading data from cells that are wearing out, and over-provisioning. The problem is that this usually only takes place as the problems are mounting up.
And they are mounting up, because the second affordability approach is to add extra bits to the basic 1-bit flash cell – single level cell (SLC) flash. MLC is 2 bits per cell. TLC is 3 bits per cell.
With a 3Xnm process: SLC NAND can do 10,000 PE cycles; MLC can do 5,000 cycles; and TLC can do 1,250. It's estimated 2Xnm TLC can do 750 cycles. Imagine the limited and shorter endurance of 1Xnm TLC NAND; are we looking at sub-500 cycles?
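To put those cycle counts in perspective, here is a rough back-of-envelope sketch of how PE cycles translate into drive lifetime. The PE figures are the ones quoted above; the drive capacity, daily write volume and write amplification factor are hypothetical values chosen purely for illustration, and the formula assumes perfect wear-levelling, which real controllers only approximate.

```python
# PE-cycle figures as quoted in the article (the 2Xnm TLC value is an estimate).
PE_CYCLES = {
    ("3Xnm", "SLC"): 10_000,
    ("3Xnm", "MLC"): 5_000,
    ("3Xnm", "TLC"): 1_250,
    ("2Xnm", "TLC"): 750,
}


def lifetime_days(capacity_gb, pe_cycles, host_writes_gb_per_day,
                  write_amplification=1.0):
    """Days until the rated PE cycles are used up.

    Assumes every cell wears evenly (ideal wear-levelling) and that each
    GB written by the host costs `write_amplification` GB of NAND writes.
    """
    total_nand_writes_gb = capacity_gb * pe_cycles
    total_host_writes_gb = total_nand_writes_gb / write_amplification
    return total_host_writes_gb / host_writes_gb_per_day


if __name__ == "__main__":
    # Hypothetical workload: 200 GB drive, 2,000 GB written per day,
    # write amplification of 2.0.
    for (process, cell), cycles in PE_CYCLES.items():
        days = lifetime_days(200, cycles, 2_000, write_amplification=2.0)
        print(f"{process} {cell}: {days:.1f} days")
```

Under these assumed numbers the gap between cell types is stark: the lifetime scales linearly with the PE rating, so halving the cycle count halves the life of the drive for the same workload.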
We can say with certainty that we will not see 4-bits per cell NAND and that we might not see – in fact probably won't – NAND process geometry going below 10nm, or even below 15nm. It's a dead-end game.
This is known and generally understood, and the timescale is such that 2Xnm TLC will enter enterprise storage use this year and take us through to, say, 2014. After that, 1Xnm TLC may be feasible and sub-1Xnm TLC most likely won't happen. So what will happen then? The need for higher-capacity, longer-lived and more affordable non-volatile memory won't go away.
There are several post-NAND technologies jostling for prominence, such as phase-change memory, resistive RAM, memristors and IBM's Racetrack memory. All promise greater capacity, higher speed and longer endurance than flash. It's not clear which one of them will become the non-volatile memory follow-on from NAND, but, whichever it is, the controller software crafted to cope with NAND inadequacies won't be needed.
MLC NAND wear-levelling and write amplification reduction technology won't be needed. The NAND signal processing may be irrelevant. Garbage collection could be completely different. Entire code stacks will need to be re-written. All the flash array and hybrid flash/disk startups will find their software IP devalued and their business models at risk from post-NAND startups' IP, with products offering longer life and faster performance.
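For readers who haven't met wear-levelling, the idea can be sketched in a few lines. This is a deliberately toy model, not how any shipping controller works: the class name, the block count and the PE limit are all hypothetical, and real controllers also have to handle static data, bad blocks and garbage collection.

```python
class WearLeveller:
    """Toy wear-leveller: always erase and rewrite the least-worn block."""

    def __init__(self, num_blocks, pe_limit):
        self.erase_counts = [0] * num_blocks  # PE cycles used per block
        self.pe_limit = pe_limit              # rated PE cycles per block

    def pick_block(self):
        """Return the index of the least-worn usable block, or None."""
        usable = [i for i, count in enumerate(self.erase_counts)
                  if count < self.pe_limit]
        if not usable:
            return None
        return min(usable, key=lambda i: self.erase_counts[i])

    def write(self):
        """Consume one PE cycle on the least-worn block and return its index."""
        block = self.pick_block()
        if block is None:
            raise RuntimeError("all blocks have reached their PE limit")
        self.erase_counts[block] += 1
        return block


if __name__ == "__main__":
    leveller = WearLeveller(num_blocks=4, pe_limit=3)
    for _ in range(8):
        leveller.write()
    print(leveller.erase_counts)
```

The point of spreading erases this way is that no single block hits its limit early. A memory technology with effectively unlimited write endurance would make this bookkeeping pointless, which is the argument being made here.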
In the worst case, NAND storage start-ups will find their competitive advantage is no longer sustainable and they will fail. The flash SSD controller companies and controller software-owning companies will need to write fresh code stacks if they enter the post-NAND NV-RAM (non-volatile RAM) product space. Suddenly everyone in the NAND controller software business is plonked back down at the starting tape, all starting afresh.
A venture capitalist or a long-term investor looking at this picture, and agreeing with it, would say the flash and hybrid flash/disk storage startups have no long-term future with their technology and are stuck on fast-forward into a dead-end. Unless their firms are acquired, the investors behind them will probably not get the return they want and could lose their cash. The smarter ones behind the flash and hybrid flash/disk startups already know this. Acquisition is the only realistic exit strategy.
Do the potential acquirers know this too? Do they appreciate that sky-high valuations of flash array and hybrid flash/disk array start-ups are short-term and unsustainable? Billion-dollar plus asking prices for buying flash array start-ups could simply not get paid. Expectations have got to come down to earth. We could be in a flash array bubble and it's about to burst because NAND's limitations are becoming ever clearer.
Enterprise flash, in the longer scheme of things, could be over in a flash. ®
Flash DOES have minimum size limits
Just like previous non-volatile memory systems, such as ferrite core and plated wire, there is a minimum size limit for the cells in flash memory. (In this case, because smaller cells have an unacceptably low write endurance.)
The competitor designs for NVRAM (eg phase change) do not reach their minimum size point until much smaller than the minimum size for a usable flash memory. When their production cost ($/GB) reduces below the cost of flash memory then the industry will move to the newer technology.
Users will still see SSDs with the same external interface (SATA 2) so at the user level the change will be invisible except for longer lifetimes from the newer SSDs.
As the newer technologies do not need wear leveling or write amplification minimisation, the complex flash controllers such as Sandforce will no longer be needed and much simpler controllers can be used.
It is firms like Sandforce and Indilinx that will suffer a revenue hit with the new technology; most of the flash ecosystem will be unaffected.
doomed, we are all doomed
A couple of things spring to mind:
Technologies only die if there is no longer a use for them or when they are replaced by something better that is available at the same price. The article mentions the "promise" of a handful of new memory technologies but gives no real insight into when they will be available and what their specs will be (capacity, $/GB, speed, lifetime, ...). Consequently, predictions about the imminent death of all NAND NV memory products (and the subsequent extinction of the companies that make them) seem somewhat premature.
Will "whole software stacks" need to be rewritten? Really? The idea behind a software stack is that functionality is partitioned with the result that changes in one layer (say garbage collection or wear levelling) do not result in the need for a complete rewrite. Even if drivers, file systems and the like do need to be rewritten for the new technologies, why does the author seem to think that this will be impossibly costly or onerous? If it is hard to do, that fact would seem to me to prolong the life of the incumbent NAND NV control software rather than bring about its demise.
If the author has figured all this out, why does he assume that investors and potential acquirers will not be able to do the same? It all sounds like basic due diligence to me.
Larger flash cells are easier to make, can be fabbed on more lines and have more competition (not to mention that they have far higher durability and are faster than small flash cells).
This will push development into a few areas.
- alternatives to NAND for higher density
- chip stacking (already being done anyway)
- "cheap as chips" lower-capacity devices.
The main disadvantage of using stacked chips is heat/power consumption. Outside of laptop and HPC environments that might be an acceptable tradeoff.
I certainly wouldn't write off the startups. Most of them don't fab their own silicon, so if the technology changes they'll simply change the way they build things and keep going.
I also wouldn't write off NAND. It's been available commercially for ~30 years and there's a lot more development which can be done yet. (One thing which springs to mind is a low-power DRAM SSD with supercaps and a flush-to-NAND routine when the power goes off, in order to mitigate the durability issues. It doesn't matter how the technology works as long as it looks like NV storage to the operating system.)
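A rough sketch of that DRAM-plus-flush idea, to show why it would ease the endurance problem. Everything here is hypothetical: the class and method names are invented for illustration, the dictionaries stand in for real DRAM and NAND, and the power-loss hook is simply called by hand rather than triggered by hardware.

```python
class DramFrontedNand:
    """Toy model: writes land in DRAM, NAND is only touched on power loss."""

    def __init__(self):
        self.dram = {}             # volatile working copy: page -> data
        self.nand = {}             # non-volatile backing store: page -> data
        self.nand_page_writes = 0  # how many NAND page programs we have done

    def write(self, page, data):
        # Overwrites stay in DRAM and cost no NAND PE cycles.
        self.dram[page] = data

    def read(self, page):
        if page in self.dram:
            return self.dram[page]
        return self.nand.get(page)

    def on_power_loss(self):
        # Assumed to run on supercap power: flush each dirty page once.
        for page, data in self.dram.items():
            self.nand[page] = data
            self.nand_page_writes += 1
        self.dram.clear()


if __name__ == "__main__":
    ssd = DramFrontedNand()
    for i in range(1000):
        ssd.write(7, f"version {i}")
    ssd.on_power_loss()
    print(ssd.nand_page_writes, ssd.read(7))
```

In this model a thousand overwrites of the same page cost a single NAND write, because only the final state is flushed. How well that holds up in practice would depend on the DRAM size, the supercap budget and how often power actually fails.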