Filer upstart Qumulo rubs out mirroring, slots in erasure coding
Scale-out NAS-er gets usable capacity gain. Neat
Scale-out filer startup Qumulo has replaced mirroring with erasure coding in v2.0 of its Core OS to deliver a 33 per cent gain in usable capacity.
It's also added real-time analytics, adopted HGST helium-filled disk drive technology and has three new appliances using HGST drives.
Qumulo delivers its technology as software-only on third-party hardware, meaning HPE, and as combined hardware/software appliances.
The Core OS provides storage through QSFS (Qumulo Scalable File System) which is built on top of an object store block layer, with files split into chunks of data for storage and placed on multiple drives. The system can scale out to store billions of files, both small and large. Protection (erasure coding) is carried out on chunks and not at file-level.
Reed-Solomon erasure coding offers +4 and +5 protection with +2 overhead. It entails less overhead than mirroring and provides greater headroom as capacity scales out. A QC24 appliance with 240 TB of raw capacity has 114 TB usable after mirroring. With erasure coding used instead, the usable capacity jumps to 144 TB.
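As a rough check on the capacity claim, the sketch below compares the usable fraction of a data-plus-parity stripe with two-way mirroring. It assumes the "+4 and +5 ... with +2" wording means stripes of 4 data + 2 parity and 5 data + 2 parity units; that reading, and the function names, are illustrative assumptions rather than anything Qumulo has published.

```python
from fractions import Fraction


def usable_fraction(data_units: int, parity_units: int) -> Fraction:
    """Fraction of raw capacity left for data in a data+parity stripe."""
    return Fraction(data_units, data_units + parity_units)


def gain_over_mirroring(data_units: int, parity_units: int) -> Fraction:
    """Relative usable-capacity gain versus two-way mirroring (1/2 usable)."""
    mirrored = Fraction(1, 2)
    return usable_fraction(data_units, parity_units) / mirrored - 1


if __name__ == "__main__":
    # Assumed stripe layouts: 4 data + 2 parity and 5 data + 2 parity.
    for data_units, parity_units in [(4, 2), (5, 2)]:
        usable = usable_fraction(data_units, parity_units)
        gain = gain_over_mirroring(data_units, parity_units)
        print(f"{data_units}+{parity_units}: usable {float(usable):.1%}, "
              f"gain over mirroring {float(gain):.1%}")
```

Under that reading a 4+2 stripe leaves two-thirds of raw capacity usable, a one-third gain over mirroring, which lines up with the 33 per cent headline figure. The idealised ratio ignores any reserved spare or metadata space, so it is not expected to reproduce the quoted QC24 numbers exactly.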
Core OS can withstand two concurrent drive failures or a complete node collapse, with no data loss.
If a drive fails then its contents are rebuilt by nodes across the cluster using a logical spare concept. We get parallel rebuilding across the cluster nodes, and a failed 10TB drive can have its contents rebuilt in under an hour, which is remarkable by RAID rebuild standards.
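To put the sub-hour figure in context, here is some back-of-the-envelope arithmetic on rebuild time versus throughput. The 200 MB/s single-drive rate is a hypothetical round number chosen for illustration, not a Qumulo or HGST specification, and capacities are treated as decimal terabytes.

```python
def rebuild_hours(drive_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to rewrite drive_tb terabytes at a sustained MB/s rate."""
    total_bytes = drive_tb * 1e12
    seconds = total_bytes / (throughput_mb_s * 1e6)
    return seconds / 3600


def required_throughput_mb_s(drive_tb: float, hours: float) -> float:
    """Aggregate MB/s needed to rewrite drive_tb terabytes within hours."""
    return drive_tb * 1e12 / (hours * 3600) / 1e6


if __name__ == "__main__":
    # Hypothetical: one replacement drive absorbing writes at 200 MB/s.
    print(f"Single target drive: {rebuild_hours(10, 200):.1f} hours")
    # Aggregate rate the cluster must sustain to finish inside one hour.
    print(f"One-hour rebuild needs {required_throughput_mb_s(10, 1):.0f} MB/s")
```

A one-hour rebuild of 10TB needs an aggregate rate of roughly 2.8 GB/s, well beyond what a single disk can absorb, which is why spreading the rebuild across many drives and nodes matters.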
In Qumulo's view, 10TB drives obsolete RAID because their rebuild times are so long.
Qumulo ships a tool that provides predictive analytics on the Qumulo nodes' performance, such as capacity trends. We're told it's an open source tool and will be built into the user interface in a forthcoming release.
The real-time analytics are built into the file system and can tell admins where data is stored, which users or applications are accessing what files, what should be archived, backed up or deleted, and why data grows. In a little more detail it provides:
- Immediate answers to how storage capacity usage has changed in the past 72 hours, 30 days, and 52 weeks
- Users can drill down to understand which files have been added and deleted by path
- Users can answer questions such as “Where did all of my storage capacity go?” and “When will I run out of storage space and why?”
Here's a sample screenshot:
The screenshot above shows a capacity usage history over 30 days with the green bars showing per-day changes and the white bar being a highlighted per-day change. The horizontal drill-down bar chart below shows the individual changes that day with the access paths identified so admin staff can see which accessing hosts are responsible for array capacity usage.
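The "When will I run out of storage space?" question can be approximated with a simple trend line over daily usage samples. The sketch below is a generic least-squares extrapolation with made-up sample data; it is not a description of how Qumulo's analytics work, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def days_until_full(daily_used_tb: Sequence[float],
                    capacity_tb: float) -> Optional[float]:
    """Estimate days until capacity is reached from one-sample-per-day usage.

    Fits a least-squares line through the samples and extrapolates from the
    latest reading. Returns None if usage is flat or shrinking.
    """
    n = len(daily_used_tb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_tb) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in enumerate(daily_used_tb))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope_tb_per_day = covariance / variance
    if slope_tb_per_day <= 0:
        return None
    return (capacity_tb - daily_used_tb[-1]) / slope_tb_per_day


if __name__ == "__main__":
    # Made-up samples: usage growing by 2TB a day on a 144TB usable system.
    print(days_until_full([100, 102, 104, 106], 144))
```

A straight-line fit is the crudest possible forecast; bursty workloads would need a longer history and a more careful model.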
There is also a remote monitoring system which uses lighter weight telemetry. Qumulo runs analytics against that data in the cloud.
Qumulo has a cluster design with a hybrid disk/flash appliance range: the 1U QC24 (1.6TB flash/24TB disk) and the 4U QC208 (2.6TB flash/208TB disk).
Three new appliances have been added:
- 1U QC40 with 10TB helium drives and 40TB raw storage/node
- 4U QC104 - a mid-range capacity platform with 10TB capacity/node
- 4U QC260 with 10TB helium drives and 260TB raw capacity per node
A four-node QC260 cluster has 1.04 PB of raw capacity. The company says it is the first storage array supplier to deliver product using HGST's 10 TB helium drives, and uses the 6TB, 8TB and the 10TB models.
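The cluster arithmetic is simple multiplication of per-node raw capacity by node count. The sketch below uses only the per-node figures stated in this article; the QC104 is left out because its per-node figure is not clear from the text, and the table and function names are illustrative.

```python
# Raw disk capacity per node in TB, as quoted in the article.
RAW_TB_PER_NODE = {"QC24": 24, "QC40": 40, "QC208": 208, "QC260": 260}


def cluster_raw_tb(model: str, nodes: int) -> int:
    """Raw capacity in TB of a cluster built from identical nodes."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return RAW_TB_PER_NODE[model] * nodes


if __name__ == "__main__":
    for model in RAW_TB_PER_NODE:
        raw_tb = cluster_raw_tb(model, 4)
        print(f"Four-node {model}: {raw_tb}TB raw ({raw_tb / 1000:.2f}PB)")
```

Four QC260 nodes come to 1,040TB, or 1.04PB, and four QC24 nodes to 96TB. These are raw figures before any protection overhead.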
It doesn't have religion about media, saying that, today, its flash/HDD hybrid design is the most cost-effective and best performing design choice. Were that to change then it could provide all-flash designs in the future, where performance is critical and cost is not too high.
Its engineers are keeping a watch on NVMe technologies.
Customers buy a subscription licence, which includes service and support. Most choose the three-year option, with some going for one year and some for the five-year licence. Customers can change their hardware under the terms of the licence. The price is based on a $/TB capacity per year scheme.
Qumulo has more than 50 paying customers and more than 40 PB deployed, with one customer past the 4PB level. It's selling into markets such as earth sciences, cable/telco/satellite and, in a surprise to its execs, connected cars. The oil and gas market is not panning out, as the oil glut induced by Saudi Arabia has caused lower oil prices and a pattern of reduced purchases by businesses operating in the sector.
Later this year Qumulo will expand its third-party hardware compatibility list. Currently it supports its software running in virtual machines for testing. In the future it will support their production use, both on-premises and in the cloud.
Appliance-wise there are now five SKUs, with starting capacities spanning from 96TB to over 1PB of raw storage. Qumulo Core 2.0 software is generally available now. The QC40, QC104, and QC260 appliances will be generally available in Q2. Pricing for an entry level four-node QC24 cluster running Qumulo Core 2.0 starts at $50,000. ®