Keep up with the fast-moving world of all-flash array storage

How to pick the right kind

Intel and Micron's 20nm, 128Gb NAND chip

An all-flash networked array can do wonders in speeding up data accesses by applications running in connected servers.

It is obviously best if the all-flash array is seamlessly integrated with your existing networked storage infrastructure. Suppliers of new integrated flash arrays say that getting the best use out of flash requires new system architecture outside of legacy storage arrays.

They say that the performance of such new-design flash arrays makes their purchase worthwhile, and over time data management facilities will mature.

Are they right?

Let’s list four types of array to lay down the groundwork:

  • All-disk array
  • Hybrid disk and flash array
  • All-flash array inside disk array infrastructure
  • All-flash array outside disk array infrastructure

An existing networked disk array, SAN or filer or both, needs to be deployed, operated and managed and to have its data protected. There will be service routines in place for array management and data protection.

Processes such as backup, snapshot, replication, archiving and disaster recovery, often involving separate hardware and software products, will form an operating environment for the array.

There will be a networking infrastructure to connect it to servers, such as Fibre Channel, Ethernet (iSCSI and files) or InfiniBand. Admin staff will have been trained and will be familiar with the array’s operation, as well as with setting up application accesses to the array’s data.

If a subsequent array of the same type is obtained, then the same staff, routines, connectivity methods and ancillary data management service products can be used, easing the burden of deploying and operating it.

No more delays

Existing disk-based arrays suffer from latency, or delays in accessing data. This doesn’t matter too much when large files with streaming access are involved, but it does matter with random access to small files, and has begun to matter more as servers become more powerful through virtualisation and multi-core CPUs.

Virtualising servers and threading application software makes the servers much better at keeping their cores busy processing application code than the old inefficient multi-tasking operating systems such as Windows NT and Unix. The applications run in virtual machines managed by a hypervisor.

The impact of array latency on applications running in virtual machines has been exacerbated by servers getting multiple sockets so they can have more than one CPU, and by CPUs getting more cores so they can run more applications or application threads at the same time.

Effectively, a two-socket server with 12-core CPUs has 24 cores running in parallel, each one nominally equivalent to a previous single-socket, single-core server, and imposing a storage array access burden that is roughly 24 times greater overall.

What is the effect of disk latency on CPUs waiting for data to come in from an array? CPUs access data from main memory in nanoseconds. It takes microseconds to get data onto or off NAND flash, and milliseconds to get data onto or off a disk array.

A nanosecond is a billionth of a second. A microsecond is a millionth of a second and a millisecond is a thousandth of a second. In the time it takes for a disk array access, say 7 milliseconds, a waiting CPU core could execute 7 million instructions. If we characterise a CPU memory lookup as taking a minute, then a disk array access could take eight years or more.
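The scaling in that analogy can be checked with a few lines of arithmetic. This is a rough sketch, not a benchmark: only the 7-millisecond disk access comes from the text above, while the 1-nanosecond memory lookup, the 100-microsecond flash access and the rate of one instruction per nanosecond are illustrative assumptions.

```python
# Illustrative latencies in seconds. Only the 7 ms disk figure comes from
# the article; the memory and flash values are assumptions for this sketch.
MEMORY_LOOKUP_S = 1e-9     # assumed: 1 nanosecond
FLASH_ACCESS_S = 100e-6    # assumed: 100 microseconds
DISK_ACCESS_S = 7e-3       # from the text: 7 milliseconds

MINUTES_PER_DAY = 60 * 24
MINUTES_PER_YEAR = MINUTES_PER_DAY * 365


def scaled_minutes(latency_s, baseline_s=MEMORY_LOOKUP_S):
    """Scale a latency so that one baseline memory lookup equals one minute."""
    return latency_s / baseline_s


def instructions_while_waiting(latency_s, instructions_per_second=1e9):
    """Instructions a core could execute while stalled (assumed 1 per ns)."""
    return latency_s * instructions_per_second


disk_years = scaled_minutes(DISK_ACCESS_S) / MINUTES_PER_YEAR
flash_days = scaled_minutes(FLASH_ACCESS_S) / MINUTES_PER_DAY
stalled = instructions_while_waiting(DISK_ACCESS_S)

print(f"disk access:  about {disk_years:.1f} scaled years")
print(f"flash access: about {flash_days:.1f} scaled days")
print(f"instructions lost per disk access: {stalled:,.0f}")
```

Under these assumptions a disk access stretches to roughly 13 scaled years, which sits comfortably inside "eight years or more"; a faster disk access or a slower memory lookup would shrink the figure.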

Supporters of flash arrays say that server CPU cores will get data a hundred or more times faster from their flash chips than from a disk array’s platters, and that means your servers can do more: support more virtual machines, process applications faster, respond to online queries faster and so on.

But startups' arrays operate outside the existing storage array environment, so they cannot be operated and managed in the same way. Nor can they have their data protected in the same way. Both points imply they are more burdensome to manage. They may also have network connectivity limitations, such as no InfiniBand support.

Flash as an extra

Why not simply put SSDs in the disk-drive slots of disk-based arrays and so inherit all the operational and data management procedures, processes and products?

All mainstream disk array suppliers have done this – Dell, EMC, Fujitsu, HDS, HP, IBM and NetApp.

Putting SSDs in some of the array’s drive slots provides a hybrid disk and flash store. The most active data can be kept on flash, with less active older data stored on disk. Automated routines can move ageing data off the flash and make room for newer data.
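The tiering idea can be sketched as a toy model. This is a minimal illustration, assuming a simple least-recently-used policy; the `HybridTier` class and its methods are hypothetical names invented for this example, and real arrays use vendor-specific heuristics that are not described here.

```python
from collections import OrderedDict


class HybridTier:
    """Toy model of a hybrid array: a small flash tier in front of disk.

    Hypothetical illustration only; the least-recently-used policy is an
    assumption, not any supplier's actual tiering algorithm.
    """

    def __init__(self, flash_capacity):
        self.flash_capacity = flash_capacity
        self.flash = OrderedDict()  # block id -> data, oldest access first
        self.disk = {}              # block id -> data

    def write(self, block_id, data):
        """New or updated data lands on flash as the most active data."""
        self.disk.pop(block_id, None)
        self.flash[block_id] = data
        self.flash.move_to_end(block_id)
        self._demote_ageing()

    def read(self, block_id):
        """Serve from flash if present, otherwise promote the block from disk."""
        if block_id in self.flash:
            self.flash.move_to_end(block_id)
            return self.flash[block_id]
        data = self.disk.pop(block_id)  # raises KeyError for unknown blocks
        self.flash[block_id] = data
        self._demote_ageing()
        return data

    def _demote_ageing(self):
        """Move the least recently used blocks to disk until flash fits."""
        while len(self.flash) > self.flash_capacity:
            old_id, old_data = self.flash.popitem(last=False)
            self.disk[old_id] = old_data


tier = HybridTier(flash_capacity=2)
for month in ("jan", "feb", "mar"):
    tier.write(month, f"records-{month}")
tier.read("jan")  # "jan" had aged out to disk; reading it promotes it again
print("flash:", list(tier.flash), "disk:", list(tier.disk))
```

In this run the flash tier holds only two blocks, so writing a third pushes the oldest one to disk, and reading that block later brings it back while the next-oldest block is demoted.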

Another mainstream approach is to build an all-flash array separate from the disk arrays but existing in the same environment.

For example, Fujitsu’s DX200F is an Eternus system like disk-based Eternus arrays and it can be managed together with all the other Eternus DX systems through a central management console.

It comes with thin provisioning like other Eternus arrays and its contents can be backed up to a near-line disk-based Eternus using a remote copy feature. It also costs less than flash arrays from suppliers such as Pure Storage.

HP’s StoreServ 7450 is another example of such an in-architecture flash array.

Flash arrays inside an existing architecture also inherit that architecture’s reliability and resilience, for example with dual controllers in case one controller fails, and redundant power and cooling equipment to guard against component failure.

Newer arrays, such as NetApp’s FlashRay, may have only one controller in their first release, and thus may not be well suited to enterprise mission-critical workloads.

A longer life

Some array suppliers have gone further and extended their array hardware to provide, as it were, a logically all-flash array within their existing array architecture.

Examples include HDS’s Accelerated Flash modules for its VSP array. We can view this as a hybrid array, which the manufacturer says provides various advantages over the more basic alternative of having SSDs in disk-drive bays.

Startups such as Pure Storage, SolidFire and Violin Memory raise two objections. First, they say that flash fitted into a disk array architecture, with its limited endurance, is not used efficiently and will wear out more quickly than in their own systems, which work to limit the number of life-shortening writes to the flash.

Second, their brand new operating systems are not based on disk access processes, so they route data access requests to and from the flash in their arrays faster than a legacy disk-drive array would.

The flash wear-out argument is weakened when companies such as Pure use SSDs as well. Commodity SSDs have controllers that minimise writes and manage the unit’s working life.

Typically, they have a warranted life of five years. Also the array software and hardware may well use technologies that reduce the amount of data written.

Some new-build flash arrays come from mainstream suppliers. Cisco acquired Whiptail and now supplies Invicta all-flash arrays using Whiptail technology. However, these do not feature in the VCE converged Vblock systems, which use Cisco UCS servers and Nexus switches, VMware’s hypervisor and EMC storage.

There is an all-flash Vblock but it uses integrated XtremIO flash arrays – technology that EMC acquired by buying the XtremIO startup business.

EMC customers can include XtremIO arrays with EMC's more traditional VMAX and VNX arrays under the ViPR storage software abstraction layer, but they are not integrated as such into either the VNX or VMAX environments.

NetApp has built its own all-flash array, called FlashRay, and says general availability is likely in the next few months. It exists outside the NetApp FAS ONTAP environment, but it interoperates with it and the level of integration is set to increase.

Violin Memory’s all-flash arrays don’t use SSDs. Violin says the arrays' hardware and operating software have been designed and built together to provide higher performance.

Flash array types

We might generally rank array performance in these terms, from slowest to fastest:

  • All-disk array, basic level
  • Hybrid disk and flash array, faster than disk only
  • All-flash array inside disk array infrastructure, faster still with shared data management services
  • New-build all-flash array outside disk array infrastructure, fastest of all but lacking shared data management services

Roughly speaking, the cost/GB of storage increases as you move down this list towards the faster arrays.

There is no single answer to all shared array needs but we can categorise four groups of use cases associated with these performance and cost levels.

1. Bulk data vaults sized in the petabyte or above class, with access from applications that are not latency sensitive; typically this is for data that is not mission critical, near-line data storage and primary data. A mainstream traditional array is generally fine for this.

2. Primary data in petabyte and petabyte-plus amounts with subsets requiring fast access: for example, a month’s worth of customer records or event data may be needed for an analysis run. A hybrid array can effectively cache this data in flash and serve it quickly from there while other data accesses go to and from disks.

3. Primary data in the tens to hundreds of terabytes area with a need for consistently fast access: VDI might be an application needing this kind of storage. Such a data pool can be managed and operated inside the overall storage array data management environment.

4. Primary data in the tens to hundreds of terabytes area with a consistent need for lightning fast access: applications such as financial trading and mass online sales rely on having the fastest possible access to largish data sets and the cost saving or revenue-raising effects are good enough to justify the cost.
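The four groups above can be read as a rough decision rule. The sketch below is one hedged interpretation, not a sizing tool: the `Latency` enum, the `suggest_array` function and the 1,000TB boundary between "hundreds of terabytes" and "petabyte" scale are all assumptions made for illustration, and pairing the fourth group with a new-build array follows the performance ranking rather than an explicit statement.

```python
from enum import Enum


class Latency(Enum):
    """Hypothetical labels for the four access patterns in the list above."""
    TOLERANT = "not latency sensitive"
    FAST_SUBSET = "subsets need fast access"
    CONSISTENTLY_FAST = "consistently fast access"
    FASTEST_POSSIBLE = "lightning fast access"


# Assumed boundary between "hundreds of terabytes" and "petabyte" scale.
PETABYTE_TB = 1000


def suggest_array(capacity_tb, latency):
    """Map a workload onto one of the four array types (illustrative only)."""
    if latency is Latency.TOLERANT:
        return "all-disk array"
    if latency is Latency.FAST_SUBSET or capacity_tb >= PETABYTE_TB:
        # All-flash is assumed too costly at petabyte scale, so the hot
        # subset is served from flash while the rest stays on disk.
        return "hybrid disk and flash array"
    if latency is Latency.CONSISTENTLY_FAST:
        return "all-flash array inside disk array infrastructure"
    return "all-flash array outside disk array infrastructure"


print(suggest_array(2000, Latency.TOLERANT))
print(suggest_array(200, Latency.CONSISTENTLY_FAST))
print(suggest_array(50, Latency.FASTEST_POSSIBLE))
```

As flash gets denser and cheaper, the assumed `PETABYTE_TB` boundary is the value that would move up the capacity scale.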

The density of flash is on a steady upward trend and its cost per-GB is falling. Today we think in terms of flash being affordable for applications needing fast access to hundreds of terabytes of data but not tens of petabytes or more. In a year's time, with denser and less expensive flash, the boundary will have moved up the capacity scale.

Flash array technology is changing networked array systems for the good and things can only get better. ®

Biting the hand that feeds IT © 1998–2019