Supercomputer flash kings: TLC needs, er, TLC
Triple-level cell NAND, the iron bicycle of memory
ISC12 Here at Hamburg's supercomputer fest, three merchants of flash were plying their wares. What did they think about the chances of triple-level cell (TLC) NAND, the stuff that's cheaper than MLC but slower and with a drastically shorter working life? Cue shaking of heads and talk of whole-stack engagement.
The problem is well known. MLC flash, with its 2 bits per cell, has a working life of around 5,000 program/erase (PE) cycles, which clever controllers can triple, quadruple or even better. TLC flash has a PE rating of 800-1,000 cycles. This is so low as to make it useless for enterprise use unless controllers can lift it above the raw MLC rating. We'd be looking at a minimum of, say, 8,000 PE cycles and hopefully much better.
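To put numbers on that gap, here is a minimal Python sketch of the lift a controller would have to deliver. The cycle figures are the ones quoted above; the function name `required_multiplier` is ours, purely for illustration.

```python
def required_multiplier(raw_cycles: int, target_cycles: int) -> float:
    """Factor by which a controller must extend raw PE endurance."""
    if raw_cycles <= 0:
        raise ValueError("raw_cycles must be positive")
    return target_cycles / raw_cycles


# Figures quoted in the article: raw TLC rating and the suggested enterprise floor.
TLC_RAW_CYCLES = (800, 1_000)
ENTERPRISE_TARGET = 8_000

for raw in TLC_RAW_CYCLES:
    factor = required_multiplier(raw, ENTERPRISE_TARGET)
    print(f"{raw} raw cycles -> {factor:.1f}x needed to reach {ENTERPRISE_TARGET}")
```

On those figures a TLC controller needs an 8x to 10x lift just to reach the 8,000-cycle floor, against the tripling or quadrupling cited for MLC controllers.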
Startup DensBits is developing a TLC flash controller that can achieve 10,000 cycles. Only OCZ is making positive noises about TLC products for enterprise use, but there is no commercially available controller for enterprise TLC use. Or is there? GreenBytes has a TLC flash layer in its Solidarity product but isn't saying where it comes from.
If no one is making it, where did GreenBytes get theirs?
El Reg's storage desk spoke to Micron, Samsung and STEC at ISC in Hamburg to see if they might be involved in the GreenBytes product.
A Micron spokesperson said there was a 64GB P84 TLC chip ready for production, P84 being the internal name, but it was for USB sticks and MicroSD cards and "will not be used in SSDs [as] it's a consumer product." It's rated at around 2,500 to 3,000 PE cycles. No joy there then.
Gaetano Pastore, STEC's technical sales manager for EMEA, said it was very difficult for TLC to be used in the enterprise environment because its latency is extremely high and its endurance "barely 1,000 cycles". STEC's CellCare technology boosts MLC endurance from 5,000 to 60,000 cycles. Surely if that 12X endurance improvement were applied to TLC, we'd see some results?
It's not that easy, he responded: "CellCare couldn't necessarily achieve a 12X endurance increase with TLC and data retention could be an issue."
TLC flash doesn't reliably keep its contents readable for three months after the write life is exhausted. Another colleague said there was a read-disturb effect with TLC, where bits in the cells could be moved by read accesses leading to more writes being necessary.
There was no sense of STEC being enthusiastic about TLC. Instead the firm is focused on making consumer-grade MLC flash as long-lived and reliable as enterprise-grade MLC with its CellCare technology. Still, thinking intuitively, CellCare could deliver a boost to TLC endurance. The question is, how much? Could it do a 10X improvement and reach the DensBits level of 10,000 cycles?
Only so much a TLC controller can do...
What about Samsung? Duc Nguyen, associate director at Samsung Semiconductor Europe, also downplayed TLC's chances in the enterprise, saying that by the time the writes reached a TLC SSD it might already be too late.
What he meant was that there was only so much a TLC controller could do to boost TLC's PE cycles and reduce the number of writes, and that might not be sufficient. Nguyen said: "The whole stack has to be involved in minimising TLC writes," meaning the host operating system, the application, the file system, then the controllers and finally the TLC NAND.
He offered this thought, though: "SSD in the future will be very application-dependent. It won't be generic."
An email exchange about TLC with John Scaramuzzo, president at SMART Storage Systems, was more positive, with TLC seen as useful in entry-level servers.
El Reg: Can SMART comment on the general issues around bringing TLC SSDs to market?
John Scaramuzzo: "Clearly, the biggest challenge for TLC will be achieving endurance levels on par with cMLC, as this will require a 10x jump in native endurance. This is where SMART Storage Systems has a distinct advantage ... Beyond that, the additional complexity of managing the 8 signal levels of TLC vs the 4 signal levels of MLC will have an impact on performance, and require a significant amount of testing to ensure the quality required in the enterprise environment."
El Reg: With SMART Storage Systems able to get 25 full drive writes per day from its Optimus Ultra MLC solid state drives, it seems to me that it could get, possibly, 10 full writes a day for five years from a TLC (3 bits per cell) SSD. What does SMART think about that?
John Scaramuzzo: "The entry server 'Read Mostly' segment of the market currently uses consumer MLC rated for 3-5k PE cycles. This translates to 0.4 - 0.7 drive writes per day (DWPD). TLC has much lower native endurance, likely around 500 PE cycles. With SMART Storage Systems' Guardian Technology Platform, we can increase the endurance of TLC 7 - 10x higher, essentially creating parity with cMLC at 3-5,000 PE cycles, or 0.4 - 0.7 DWPD.
"This will allow SMART Storage Systems to deliver entry server SSDs with TLC that have the same endurance as today's cMLC-based devices. To achieve greater endurance than this with TLC would require a significant amount of over-provisioning. Since TLC currently only offers a 15 to 25 per cent cost savings over cMLC, we don't think it will be practical to push TLC beyond 1 DWPD as it would eat into the cost benefits.
"Therefore, we believe TLC will take over from cMLC at the entry server level, while cMLC will extend into every other tier of the data centre, replacing eMLC and SLC to drive down costs for enterprise organisations."
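Scaramuzzo's conversion from PE cycles to drive writes per day can be roughly reproduced with a back-of-the-envelope formula: DWPD = PE cycles / (write amplification x days in service). The Python sketch below assumes a five-year service life, the period mentioned in our question, and a write amplification factor of 4. That factor is our assumption, picked because it lands on the quoted 0.4 - 0.7 range; SMART did not say what figure it used. The function names are ours too.

```python
DAYS_PER_YEAR = 365


def dwpd(pe_cycles: float, years: float = 5.0, write_amplification: float = 4.0) -> float:
    """Drive writes per day sustainable over the service life.

    Each full host write of the drive costs `write_amplification`
    PE cycles of NAND wear (assumed value, not from SMART).
    """
    days = years * DAYS_PER_YEAR
    return pe_cycles / (write_amplification * days)


def cycles_for_dwpd(target_dwpd: float, years: float = 5.0,
                    write_amplification: float = 4.0) -> float:
    """Inverse of dwpd(): PE cycles needed to sustain a given DWPD."""
    return target_dwpd * write_amplification * years * DAYS_PER_YEAR


# cMLC range quoted by Scaramuzzo: 3,000 to 5,000 PE cycles.
for cycles in (3_000, 5_000):
    print(f"{cycles} PE cycles -> {dwpd(cycles):.2f} DWPD")

# PE cycles implied by the 1 DWPD ceiling he mentions, under the same assumptions.
print(f"1 DWPD needs about {cycles_for_dwpd(1.0):.0f} PE cycles")
```

Under these assumptions 3,000 and 5,000 cycles work out to roughly 0.41 and 0.68 DWPD, and 1 DWPD would need about 7,300 effective cycles. A different write amplification or service life shifts all of these numbers proportionally.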
Clearly TLC is on the threshold of entering enterprise storage use in arrays and in servers. Someone has an enterprise TLC SSD controller, because GreenBytes is using it. Who that is, is a mystery.
El Reg reckons TLC enterprise-class product will appear more and more over the next 12 months. All it needs is tender loving care from the software stack using it, and controllers that can wrest 10,000 PE cycles or more out of the raw TLC NAND. ®