Exploring our way to the source of EMC's mighty VNX Nile
Mr Mellor, I presume?
Blocks and Files EMC, we have a problem. Your Project Nile has exabyte-plus capacity and is built from EMC's ViPR control/data plane software and VNX arrays. Yet, the biggest VNX is the 8000 at 4.5PB capacity, about 223 times too small. What gives?
EMC COO and president David Goulden said in Milan that Project Nile will use ViPR and VNX to provide file, block and object storage at web scale. Here's a transcript:
Nile includes technologies from ViPR. It includes technologies from the VNX family and we've packaged this together to give you a truly capacity-optimised elastic cloud storage system.
ViPR provides the file, block and object data interface to connected servers, with the underlying VNX being told what to do by ViPR to deliver the storage capacity needed. The Nile prototype at Milan showed three identical racks with no obvious separate ESXi server for running ViPR as a virtual machine.
Jeremy Burton, EMC's chief marketeer, expanded on the web scale idea, saying Nile's capacity would be exabyte-plus.
EMC's Jeremy Burton with the Project Nile 3-rack prototype in Milan
The largest VNX at present is the 8000, which EMC says will soon support 1,500 x 3TB disk drives, a total of 4.5 petabytes. You would need 223 such VNX8000s to reach 1 exabyte of capacity (223 x 4.5PB = 1,003.5PB). Say we moved to 4TB drives; you would still need 167 systems (167 x (1,500 x 4TB) = 1,002PB). If a 1,500 x 3TB HDD VNX8000 has three racks, you would need (223 x 3) = 669 racks to reach the exabyte capacity level, an impressively large number.
VNX arrays cannot be clustered, and even if they could be, a 223-node cluster built on a v1.0 product would be a hugely risky deal.
Against this background, we need some ideas as to how Project Nile can get 1 exabyte-plus of capacity from VNX arrays.
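For anyone who wants to check the sums, here is a minimal sketch of the array-count arithmetic. It assumes decimal units (1EB = 1,000PB, 1PB = 1,000TB) and the three-racks-per-VNX8000 figure supposed above; the function name is just for illustration.

```python
import math

PB_PER_EB = 1000   # decimal units, an assumption
TB_PER_PB = 1000

def arrays_needed(target_pb, drives_per_array, tb_per_drive):
    """Whole arrays needed to reach target_pb of raw capacity."""
    array_pb = drives_per_array * tb_per_drive / TB_PER_PB
    return math.ceil(target_pb / array_pb)

arrays_3tb = arrays_needed(PB_PER_EB, 1500, 3)   # 223
arrays_4tb = arrays_needed(PB_PER_EB, 1500, 4)   # 167
racks_3tb = arrays_3tb * 3                       # 669, at three racks per array

print(arrays_3tb, arrays_4tb, racks_3tb)
```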
Effective VNX capacity increase ideas
One idea is that the VNX is basically a head and gets Atmos-style dense disk enclosures behind it. The G3-Dense-480 has 480 x 3TB disks in a 40U rack. Three such racks get us to 1,440 drives, 60 fewer than a 3-rack, 1,500-drive VNX8000. So we would need even denser drive enclosures.
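The density comparison can be sketched the same way. This is a rough calculation only, assuming decimal units and raw capacity, and taking the 480-drive rack and 3-rack VNX8000 figures above at face value.

```python
import math

TARGET_TB = 1_000_000          # 1 exabyte in decimal TB, an assumption
DRIVE_TB = 3
DRIVES_PER_DENSE_RACK = 480    # G3-Dense-480 figure quoted above
VNX8000_DRIVES = 1500
VNX8000_RACKS = 3              # assumed rack count for a full VNX8000

dense_three_rack_drives = 3 * DRIVES_PER_DENSE_RACK
shortfall = VNX8000_DRIVES - dense_three_rack_drives
# Drives per rack needed just to match a 3-rack VNX8000
drives_per_rack_to_match = math.ceil(VNX8000_DRIVES / VNX8000_RACKS)
# Dense racks needed to reach an exabyte of raw capacity
dense_racks_for_exabyte = math.ceil(TARGET_TB / (DRIVES_PER_DENSE_RACK * DRIVE_TB))

print(dense_three_rack_drives, shortfall, drives_per_rack_to_match, dense_racks_for_exabyte)
```

On these assumptions the dense racks do not even match the VNX8000's three-rack drive count, let alone shrink the rack total.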
A second idea is that the 1 exabyte-plus capacity is the effective capacity after deduplication. One problem here is the your-mileage-may-vary issue: dedupe ratios are not guaranteed. Another is that EMC's execs would be highly disingenuous if they were bragging about exabyte-plus capacity levels after deduplication. Everyone in their Milan audience understood they were talking about raw capacity and would surely feel deceived if that turned out not to be the case.
A third idea is that Nile could use ScaleIO, EMC's acquired technology to turn hundreds or thousands of servers' direct-attached storage into one massive virtual SAN. But Goulden said Nile would use ViPR and VNX, not ViPR and ScaleIO. Anyway ScaleIO is block storage whereas VNX is unified file and block storage, while Nile is file, object and block storage. This means ViPR only has to provide the object data service abstraction and translate that to VNX-speak. I think we can rule ScaleIO out of the equation.
Another idea is that ViPR provides 1 exabyte-plus of capacity by actually having 223 VNX8000s, each with 1,500 x 3TB drives, behind it. They aren't clustered in a VNX software environment though. Instead ViPR aggregates them itself, and carves out storage from them for an enterprise's private cloud users. It's certainly scale-out: just add another VNX8000 if you need 4.5PB more capacity. But it is not scale-out in the Isilon sense of a single Isilon environment.
If this is the way it's done then ViPR would be acting as a federating super array controller.
A source thinks this is likely and will impose limits on the storage entity sizes supplied to Nile users:
[VNX] file system (FS) maximum size is ridiculous with 16TB limit per FS. Users must consider ... software glue to unify FS. [The] Nile goal is 1 exabyte but EMC doesn't say if it will be 1 Volume or 1 FS. So I imagine it's [going to be] external unification ...
For web scale, the Nile goal, users don't really aggregate storage as each object is addressed independently of the others. But if unified access is needed, especially file and object, a unification layer is needed ...
Users don't need massive block device (1EB block volume); unification is better.
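To make the federating-controller idea concrete, here is a toy sketch of a layer that pools independent arrays and places each volume wholly on one of them. Everything in it is hypothetical: the class names, the placement rule and the 4,500TB array size are illustrative assumptions, and none of it reflects ViPR's actual interfaces or behaviour.

```python
from dataclasses import dataclass

@dataclass
class Array:
    """One independent, unclustered array (hypothetical model)."""
    name: str
    capacity_tb: int
    used_tb: int = 0

    @property
    def free_tb(self):
        return self.capacity_tb - self.used_tb

class Federation:
    """Toy federating controller: aggregates arrays, carves out volumes."""
    def __init__(self, arrays):
        self.arrays = list(arrays)

    def add_array(self, array):
        # Scale-out step: bolt on another array for more capacity
        self.arrays.append(array)

    def total_free_tb(self):
        return sum(a.free_tb for a in self.arrays)

    def provision(self, size_tb):
        # Assumed rule: a volume lives on the single array with most free space
        best = max(self.arrays, key=lambda a: a.free_tb)
        if best.free_tb < size_tb:
            raise ValueError("no single array can hold this volume")
        best.used_tb += size_tb
        return best.name

fed = Federation(Array(f"vnx-{i}", 4500) for i in range(2))
first = fed.provision(1000)
second = fed.provision(1000)
print(first, second, fed.total_free_tb())
```

In this toy model the pool's total capacity grows with every array added, but no single volume can be bigger than one array, which is the kind of entity-size limit the source describes.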
It seems clear that the idea is for ViPR, a v1.0 product, to logically cluster hundreds of VNX arrays together. This seems almost fanciful and is hugely ambitious; hats off to EMC's big cojones. How else is EMC's Project Nile going to achieve a 1 exabyte-plus capacity using 4.5 petabyte VNX building blocks? ®