DRAM, bam, thank you ma'am: How XtremIO gets its speed
Memory turbo-charges data access
EMC’s XtremIO array has been launched into general availability today. Its details are as we reported earlier, and the storing of metadata in memory is key to understanding its speed.
Data is stored in 4K blocks with each block having a unique hash address. There are two levels of addressing in the metadata. First an incoming data access contains a logical block address (LBA). A C-module in the controller converts this to a hash. A D-module then converts the hash to a physical location in the SSDs attached to the controller.
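The two-level lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EMC's implementation: the class name `TwoLevelMap`, the use of SHA-1 as the content hash, and the Python list standing in for the SSDs are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as described in the article


class TwoLevelMap:
    """Toy model of LBA -> hash -> physical location (names are hypothetical)."""

    def __init__(self):
        self.lba_to_hash = {}    # first level: the C-module's job in the article
        self.hash_to_phys = {}   # second level: the D-module's job in the article
        self.flash = []          # stand-in for the SSDs; index = physical location

    def write(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one 4K block")
        # SHA-1 is an assumption; the article does not name the hash function.
        digest = hashlib.sha1(data).hexdigest()
        if digest not in self.hash_to_phys:
            self.flash.append(data)
            self.hash_to_phys[digest] = len(self.flash) - 1
        self.lba_to_hash[lba] = digest

    def read(self, lba):
        digest = self.lba_to_hash[lba]       # level 1: LBA -> hash
        location = self.hash_to_phys[digest]  # level 2: hash -> physical location
        return self.flash[location]


array = TwoLevelMap()
array.write(7, b"a" * BLOCK_SIZE)
block = array.read(7)
```

Both lookups here are plain in-memory dictionary accesses, which is the point the article makes about keeping the metadata in DRAM.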
Because this metadata is held in shared main memory rather than written to the SSDs, the overall write burden on the component flash is reduced, which helps to extend its life. We understand that metadata is journaled across the X-Bricks.
The metadata-in-memory design helps add speed as well, since memory accesses are far faster than NAND flash accesses:
- Virtualised server virtual machine (VM) cloning is virtually instantaneous - no actual data is moved, as a clone is just a set of pointers.
- Cloning databases for test and dev is very fast - Delphix database virtualisation is not needed for test and dev database copy production.
- Snapshots are fast.
- Deduplication is done before any data is written and that makes dedupe fast and extends the component SSD’s endurance by reducing data writes.
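To illustrate why pointer-based cloning and inline deduplication avoid flash writes, here is a minimal Python sketch. It is a toy model under our own assumptions rather than a description of XtremIO internals: the class `ToyArray`, the `flash_writes` counter and the choice of SHA-1 are hypothetical.

```python
import hashlib


class ToyArray:
    """Toy content-addressed store with per-volume pointer tables (hypothetical)."""

    def __init__(self):
        self.volumes = {}      # volume name -> {LBA: hash}, i.e. the pointers
        self.store = {}        # hash -> block data, stand-in for the SSDs
        self.flash_writes = 0  # counts blocks that actually reach "flash"

    def write(self, volume, lba, data):
        # Hash first, so a duplicate is detected before anything is written.
        digest = hashlib.sha1(data).hexdigest()  # hash choice is an assumption
        if digest not in self.store:
            self.store[digest] = data
            self.flash_writes += 1
        self.volumes.setdefault(volume, {})[lba] = digest

    def clone(self, source, target):
        # A clone copies only the pointer table; no block data is touched.
        self.volumes[target] = dict(self.volumes[source])

    def read(self, volume, lba):
        return self.store[self.volumes[volume][lba]]


array = ToyArray()
array.write("vm1", 0, b"os image block")
array.write("vm1", 1, b"os image block")   # duplicate: no new flash write
array.clone("vm1", "vm2")                   # pointer copy: no flash write
array.write("vm2", 0, b"changed block")     # only the changed block is written
```

In this model the clone and the duplicate write cost no flash writes at all; only the two distinct blocks are ever stored, and `vm1` still reads its original data after `vm2` diverges.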
Another contributor to XtremIO speed is that the system controllers do not do garbage collection, the reclamation of deleted cells. This is done inside each SSD by its ASIC controller.
Ex-GreenBytes CEO Steve O’Donnell, who is also on the advisory board at Violin Memory, said: "EMC is being disingenuous about the garbage collection. The idea is that data is managed in 4k blocks but NAND pages are 64k. One can only write to those bits of pages that have not been written to before. Garbage collection is the process of tidying up blocks that need to be rewritten.
"EMC have just offloaded the garbage collection to the SSD (where they have no control of when the NAND locks up), rather than at the system controller level where they can manage it properly. Another dumb design."
As an example of XtremIO speed, EMC says that 1,000 linked VDI clones can start up in 15 minutes.
XtremIO systems have no quality of service (QoS) features or levels for a simple reason: everything is consistently fast.
Because of the memory metadata store, the X-Bricks have to have uninterruptible power supplies (UPS) in case a controller goes down. We understand there is an internal SSD in an X-Brick which is used for a metadata dump during shutdown.
O'Donnell said that controller (server) motherboards are notoriously unreliable and storing metadata in memory is a dumb idea. But it does make the XtremIO array fast. ®