IBM parks parallel file system on Big Data's lawn

Mirror, mirror on the wall, who is the fattest of them all?

A massive collision is taking place in the IT universe as the worlds of high-performance computing, big data and warehousing intermingle. IBM is pushing its General Parallel File System (GPFS) further to broaden its footprint in this space, with the 3.5 release adding big data and async replication features as well as customer-defined metadata and more performance.

GPFS is a large-scale file system that runs on Network Shared Disk (NSD) server nodes, with file data spread across a variety of storage devices and users getting parallel access to it. We got the GPFS 3.5 news from Crispin Keable, IBM's HPC architect based in Basingstoke.
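
For the uninitiated, the idea is that a file's blocks are striped across those NSD servers so that clients can pull different pieces in parallel. A minimal Python sketch of that striping idea follows – the server names, round-robin placement and 4MB block size are our own illustrative assumptions, not GPFS internals:

# Toy illustration of striped, parallel file access in the GPFS style.
# The NSD server names and the 4MB block size are made up for the example.
from concurrent.futures import ThreadPoolExecutor

NSD_SERVERS = ["nsd01", "nsd02", "nsd03", "nsd04"]
BLOCK_SIZE = 4 * 1024 * 1024  # bytes

def server_for_block(block_index: int) -> str:
    """Round-robin placement: block i lives on server i mod N."""
    return NSD_SERVERS[block_index % len(NSD_SERVERS)]

def read_block(block_index: int) -> bytes:
    # A real client would issue a network read to the owning NSD server;
    # here we just report where the request would go.
    print(f"block {block_index} -> {server_for_block(block_index)}")
    return b"\0" * BLOCK_SIZE

def parallel_read(file_size: int) -> bytes:
    blocks = range((file_size + BLOCK_SIZE - 1) // BLOCK_SIZE)
    with ThreadPoolExecutor(max_workers=len(NSD_SERVERS)) as pool:
        return b"".join(pool.map(read_block, blocks))

parallel_read(10 * BLOCK_SIZE)  # ten blocks fan out across four servers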

The new release has Active File Management, an asynchronous version of the existing GPFS multi-cluster synchronous replication feature, which enables a central GPFS site to be mirrored at other remote sites, where users then get access to the mirrored files at local rather than wide area network speed. The link is duplex, so updates on either side of it are propagated across.

If the link goes down, the remote site can continue operating on the effectively cached GPFS data. Any updates are cached too, and to prevent old data overwriting more recent data, the updates pushed to the central site when an offline remote site comes back online can be restricted to data newer than a pre-set date and time.
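
To make that catch-up behaviour concrete, here's a toy Python model of a remote site re-syncing with the centre after an outage. The field names and the simple newer-than-cutoff rule are our own illustration of the behaviour described above, not AFM's actual protocol:

# Toy model of an async-mirrored site catching up with the central site
# after an outage. Only updates newer than a pre-set timestamp are
# pushed back, so stale cached data cannot overwrite newer copies.
from dataclasses import dataclass, field

@dataclass
class Update:
    path: str
    data: bytes
    mtime: float  # seconds since the epoch

@dataclass
class RemoteSite:
    pending: list[Update] = field(default_factory=list)

    def write(self, update: Update) -> None:
        """Writes land locally at LAN speed and are queued for the centre."""
        self.pending.append(update)

    def resync(self, central: dict[str, Update], cutoff: float) -> None:
        """Push queued updates, skipping anything older than the cutoff."""
        for u in self.pending:
            if u.mtime >= cutoff:
                central[u.path] = u
        self.pending.clear()

central: dict[str, Update] = {}
site = RemoteSite()
site.write(Update("/gpfs/results/run42.dat", b"new", mtime=1_700_000_500))
site.write(Update("/gpfs/results/old.dat", b"stale", mtime=1_600_000_000))
site.resync(central, cutoff=1_700_000_000)   # only run42.dat propagates
print(sorted(central))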

One thing to bear in mind is that there is no built-in deduplication in GPFS. If you wanted to reduce the data flowing across such a mirrored link you'd need something like a pair of Diligent dedupe boxes on either side of it, or else some other WAN optimisation or data reduction technique.

RAID and big data

In petabyte-scale GPFS deployments there can be a thousand or more disks – and disks fail often enough for a RAID rebuild to be going on somewhere in the deployment practically all the time. This limits GPFS performance to that of the device on which the rebuild is taking place.
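
Some rough, assumed numbers show why. Plugging a made-up two per cent annual failure rate and a day-and-a-half rebuild time into a quick calculation suggests a farm of a few thousand drives spends a hefty slice of its life rebuilding:

# Rough arithmetic, with assumed figures, on how often a rebuild is
# running somewhere in a large disk farm.
ANNUAL_FAILURE_RATE = 0.02   # assumed 2% AFR, purely for illustration
REBUILD_DAYS = 1.5           # assumed rebuild time for a large drive

for disks in (1000, 2000, 5000):
    failures_per_year = disks * ANNUAL_FAILURE_RATE
    days_between_failures = 365 / failures_per_year
    busy = min(1.0, REBUILD_DAYS / days_between_failures)
    print(f"{disks} disks: a failure every {days_between_failures:.1f} days, "
          f"rebuilding ~{busy:.0%} of the time")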

Keable says that, with de-clustered RAID, the NSD servers that farm out GPFS data to clients have spare CPU capacity, which they can use to run software RAID. GPFS deployments can have data blocks scattered at random across JBOD disks, and this provides a stronger RAID scheme than RAID 6, says Keable. The big plus here is that it spreads the RAID rebuild work across the entire disk farm, which helps GPFS performance rise. Keable says this feature, which is a block-level algorithm and so able to cope with ever-larger disk capacities, was released on POWER7.
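
A short sketch shows why spreading the stripes around helps: with each stripe's strips scattered across randomly chosen disks, rebuilding one dead drive pulls a sliver of data from nearly every survivor instead of hammering a single RAID group. The disk count and stripe width below are illustrative, not GPFS's actual geometry:

# Sketch of the de-clustered idea: scatter each stripe's strips across
# randomly chosen disks, then count the rebuild read load per survivor
# when one disk dies.
import random
from collections import Counter

NUM_DISKS = 60
STRIPE_WIDTH = 10          # e.g. 8 data + 2 parity strips per stripe
NUM_STRIPES = 5000

random.seed(42)
placement = [random.sample(range(NUM_DISKS), STRIPE_WIDTH)
             for _ in range(NUM_STRIPES)]

failed = 17
# Rebuilding the failed disk means re-reading the surviving strips of
# every stripe that had a strip on it.
load = Counter()
for stripe in placement:
    if failed in stripe:
        for disk in stripe:
            if disk != failed:
                load[disk] += 1

print(f"{len(load)} of {NUM_DISKS - 1} surviving disks share the rebuild reads")
print(f"per-disk read load ranges from {min(load.values())} "
      f"to {max(load.values())} strips")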

He said IBM expected GPFS customers to use flash storage with de-clustered RAID "to hold its specific metadata – the V-disk as it's called."

GPFS is pretty much independent of what goes on below it at the physical storage level.

GPFS 3.5 can also be run in a shared-nothing, Hadoop-style cluster and is POSIX-compliant, unlike Hadoop's HDFS. Keable says GPFS 3.5 is big-data capable and can deliver "big insights" from what he termed a "big insight cluster". This release of GPFS does not, however, have any HDFS import facility.
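
POSIX compliance matters because ordinary file I/O just works against a GPFS mount, with no special client library – whereas an HDFS file is reached through Hadoop's own client API. A trivial sketch, with an assumed mount point standing in for a real GPFS file system:

# Plain POSIX file I/O of the sort that works unchanged on a GPFS mount.
# GPFS_MOUNT is an assumption for the demo; point it at a real GPFS
# file system and the same calls apply.
import os

GPFS_MOUNT = os.environ.get("GPFS_MOUNT", "/tmp/gpfs-demo")
path = os.path.join(GPFS_MOUNT, "jobs", "wordcount", "part-00000")

os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "w") as f:        # ordinary open/write/close
    f.write("the quick brown fox\n")

with open(path) as f:
    print(f.read())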

Fileset features and metadata matters

Prior to GPFS 3.5, a sysadmin could take part of a GPFS file system tree – a fileset – and put it on a specific set of disks to provide a particular quality of service, such as faster responses from a set of fast Fibre Channel drives. Filesets can be moved dynamically without taking the file system down, and the sysadmin can shift data across disk tiers daily or on some other schedule.
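
The tiering decision itself is simple enough to sketch. GPFS drives it with its own policy rules; the toy Python below merely models the move-by-age logic, with invented pool names and a 30-day threshold:

# Toy model of tiering a fileset's files between storage pools by age.
# Pool names, the 30-day threshold and the Python representation are all
# invented for illustration.
import time
from dataclasses import dataclass

@dataclass
class FileEntry:
    path: str
    pool: str
    atime: float   # last access, seconds since the epoch

FAST_POOL, SLOW_POOL = "fc_fast", "sata_capacity"
AGE_LIMIT = 30 * 24 * 3600   # demote files untouched for 30 days

def tier(files: list[FileEntry], now: float | None = None) -> None:
    now = time.time() if now is None else now
    for f in files:
        if f.pool == FAST_POOL and now - f.atime > AGE_LIMIT:
            f.pool = SLOW_POOL            # a real system would move blocks
        elif f.pool == SLOW_POOL and now - f.atime < AGE_LIMIT:
            f.pool = FAST_POOL

fileset = [FileEntry("/gpfs/proj/hot.dat", FAST_POOL, time.time()),
           FileEntry("/gpfs/proj/cold.dat", FAST_POOL, time.time() - 90 * 86400)]
tier(fileset)
print([(f.path, f.pool) for f in fileset])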

The fileset has an "i-node" associated with it – an i-node being a tag plus a block of data – which points to the actual file data and contains metadata such as origination date, time of first access and so on. GPFS used to store all the fileset metadata together in one place. With 3.5, the fileset metadata is no longer mixed in but separated out, and this has enabled fileset-level backup, snapshot, quota and group quota policies. Previously backup policies were applied at the file system level, but now, Keable says: "We can apply separate backup policies at the fileset level. It makes the GPFS sysadmin's job easier and more flexible."

Because of this change GPFS has gained POSIX.0 compliance, which means the i-node can contain small files along with their metadata. So you don't have to do two accesses to get at such small files – one for the i-node pointer and then another for the actual data – as the i-node metadata and the small file's data are co-located.
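
A quick sketch of the data-in-i-node idea, with an assumed 512-byte limit and invented field names: small files come back in a single access, while larger ones still have to chase block pointers.

# Sketch of data-in-i-node: if a file is small enough, its bytes live
# inside the i-node next to the metadata, so one read returns both.
from dataclasses import dataclass, field

INLINE_LIMIT = 512   # assumed maximum size for in-i-node data

@dataclass
class Inode:
    size: int
    ctime: float
    inline_data: bytes | None = None      # small file payload lives here
    block_pointers: list[int] = field(default_factory=list)

def read_block(block_no: int) -> bytes:    # stand-in for a disk read
    return b"\0" * 4096

def read_file(inode: Inode) -> bytes:
    if inode.inline_data is not None:
        return inode.inline_data           # one access: metadata + data
    # Larger files need a second step: follow the pointers to data blocks.
    return b"".join(read_block(b) for b in inode.block_pointers)

small = Inode(size=17, ctime=1_700_000_000.0, inline_data=b"tiny config file\n")
print(read_file(small))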

It gets better. A customer's own metadata can be added to the i-node as well. Keable says you could put a file's latitude and longitude in the i-node and enable location-based activities for such files, such as might be needed in a follow-the-sun scheme. You could do this before, but the process was slow as the necessary metadata wasn't in the i-node.
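
On a Linux client, one way to hang that kind of custom metadata off a file is through extended attributes. The attribute names below are invented, and treating xattrs as the route to GPFS's custom i-node metadata is our assumption for the sake of the example:

# Tag a file with latitude/longitude as user extended attributes.
# The attribute names are invented; in real life the file would sit on
# a GPFS mount rather than in the current directory.
import os

path = "sensor-feed.dat"
open(path, "a").close()      # make sure the demo file exists

os.setxattr(path, "user.latitude", b"51.266")
os.setxattr(path, "user.longitude", b"-1.088")

lat = float(os.getxattr(path, "user.latitude"))
lon = float(os.getxattr(path, "user.longitude"))
print(f"file tagged for {lat:.3f}N, {abs(lon):.3f}W")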

GPFS object storage and supercomputing

A UK GPFS customer said that this opened the way for GPFS to be used for object storage, as the customer-inserted metadata could be a hash based on the file's contents. Such files could then be located and addressed via their hashes, effectively layering an object storage scheme on top of GPFS.
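
A minimal sketch of that layering: hash the contents, keep the hash as the object's address, and look files up by it afterwards. The directory layout and the choice of SHA-256 are our own for illustration, not a shipping GPFS feature:

# Layer object-style addressing over a file system by using a content
# hash as the custom metadata. The store path is a stand-in for a
# GPFS fileset.
import hashlib
import os

STORE = "/tmp/gpfs-object-demo"
os.makedirs(STORE, exist_ok=True)
index: dict[str, str] = {}        # content hash -> file path

def put(path: str, data: bytes) -> str:
    with open(path, "wb") as f:
        f.write(data)
    digest = hashlib.sha256(data).hexdigest()
    index[digest] = path          # the hash becomes the object ID
    return digest

def get(digest: str) -> bytes:
    with open(index[digest], "rb") as f:   # address the file by content
        return f.read()

oid = put(os.path.join(STORE, "report.pdf"), b"%PDF-1.4 ...")
assert get(oid) == b"%PDF-1.4 ..."
print("object id:", oid[:16], "...")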

We also hear GPFS is involved with the Daresbury supercomputer initiatives. There are broadly three systems at Daresbury: a big SMP machine, a conventional x86 cluster and a Blue Gene – with some 7PB of disk drive data between them. GPFS underpins this and fronts a massive TS3500 tape library with 15PB of capacity.

GPFS is a mature and highly capable parallel file system, and it is being extended and tuned to cope with the growing scale of big data systems. The worlds of scale-out file systems, massive unstructured data stores, high-performance computing storage, data warehousing, business analytics and object storage are colliding and mingling, and the result is an intense and competitive development effort.

IBM is pushing GPFS development hard so that the product more than holds its position in this collision – in fact it extends it. ®
