Original URL: http://www.theregister.co.uk/2010/05/10/parascale_das/
ParaScale brings direct access to the cloud
Apps run on cloud storage nodes
Cloud storage software provider ParaScale supports applications running directly on storage nodes - not accessing their data across the cloud. Isn't this contradictory?
The idea is that a scale-out cloud file storage architecture is great for storing data, but not good if the applications that need the data are remote, which they will be because this is cloud storage, and need lots of it. ParaScale says applications should be brought to the data, not the other way around.
ParaScale’s Cloud Storage (PCS) software aggregates multiple standard Linux servers to present one highly scalable virtual file-storage appliance, accessible via standard file access protocols like CIFS, NFS, HTTP, FTP and WebDAV. The main attribute is the ability to scale capacity rather than performance; scaling performance is the focus of parallel access competitor Panasas.
Last month ParaScale ditched its CEO, Sajai Krishnan, in favour of Ken Fehrstrom. Krishnan used to run NetApp's StoreVault business, the separate product line that was folded back into NetApp's mainstream storage business, and joined ParaScale two years ago. Unlike Krishnan, Fehrstrom has been a CEO before and has networking and storage networking experience. There has been no public announcement of this CEO change.
V2.5 of ParaScale software adds improved data security and better self-healing, but the main event is the return, as it were, of direct-attached storage (DAS) to the cloud.
Fehrstrom said in a prepared release: "This next generation of cloud storage includes a major paradigm shift – bringing the processing to the data."
The software is claimed to identify local data and permit processing of it by applications running directly on the storage node. We can see this being useful in a public cloud but less so in a private cloud, ie an enterprise data centre where applications have been accessing DAS data for decades.
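ParaScale has not said how it identifies local data, so the following is only a rough sketch of the general "bring the processing to the data" idea, not the company's implementation. The names StorageNode, Cluster, locate and process are all hypothetical: a cluster finds which node holds a file and runs the application function on that node, so only the result crosses the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class StorageNode:
    """Hypothetical storage node holding files on its own local disks."""
    name: str
    files: Dict[str, bytes] = field(default_factory=dict)

    def run_local(self, path: str, func: Callable[[bytes], object]) -> object:
        # The application code runs here, next to the data it reads.
        return func(self.files[path])


class Cluster:
    """Hypothetical cluster that routes work to the node holding the file."""

    def __init__(self, nodes: List[StorageNode]):
        self.nodes = nodes

    def locate(self, path: str) -> StorageNode:
        # Assumption: a simple scan; a real system would consult metadata.
        for node in self.nodes:
            if path in node.files:
                return node
        raise FileNotFoundError(path)

    def process(self, path: str, func: Callable[[bytes], object]) -> Tuple[str, object]:
        node = self.locate(path)
        # Only the (small) result leaves the node, not the file contents.
        return node.name, node.run_local(path, func)


# Made-up example data: one log file on node-a, one image on node-b.
cluster = Cluster([
    StorageNode("node-a", {"/logs/a.log": b"boot ok\nlink up\n"}),
    StorageNode("node-b", {"/images/sat.raw": b"\x00" * 1024}),
])

node_name, line_count = cluster.process("/logs/a.log", lambda data: data.count(b"\n"))
print(node_name, line_count)
```

In this toy version the log file is counted on node-a and only the node name and a line count are returned to the caller.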
Part of ParaScale's pitch is that machine-generated data, such as telemetry, satellite images, backups and log files, is growing much faster than user-generated files such as documents and CAD/CAM images. It says its new software architecture automates the management and analysis of this machine-generated data.
Why ParaScale is singling out machine-generated data from human-generated content is not clear, nor is the algorithm used to differentiate between the two types of data. We might imagine floods of small data uploads from telemetry devices that could be analysed on the spot, but satellite images or medical picture files can be huge, so it's not simply a case of looking at file size.
ParaScale has also improved data integrity by calculating identifying signatures for chunks of data when they are written and comparing them against freshly computed signatures when the data is read. A difference flags data corruption and the data is restored from a copy. The signatures are stored locally and not in a remote repository. Data can also be encrypted with ParaScale-generated keys and with no need for third-party key management.
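The write-time signature and read-time check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not ParaScale's code: the article does not name the signature algorithm, so SHA-256 is assumed, and the ChunkStore class with its primary and replica dictionaries is hypothetical.

```python
import hashlib
from typing import Dict


def signature(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the unnamed signature algorithm.
    return hashlib.sha256(data).hexdigest()


class ChunkStore:
    """Hypothetical node-local store with one extra copy of each chunk."""

    def __init__(self) -> None:
        self.primary: Dict[str, bytes] = {}
        self.replica: Dict[str, bytes] = {}
        self.signatures: Dict[str, str] = {}  # kept locally, as the article says

    def write(self, chunk_id: str, data: bytes) -> None:
        self.primary[chunk_id] = data
        self.replica[chunk_id] = data
        # Signature is calculated when the chunk is written.
        self.signatures[chunk_id] = signature(data)

    def read(self, chunk_id: str) -> bytes:
        data = self.primary[chunk_id]
        expected = self.signatures[chunk_id]
        # A freshly computed signature is compared with the stored one.
        if signature(data) != expected:
            copy = self.replica[chunk_id]
            if signature(copy) != expected:
                raise IOError(f"chunk {chunk_id}: all copies are corrupt")
            # Self-heal: restore the damaged chunk from the good copy.
            self.primary[chunk_id] = copy
            data = copy
        return data


store = ChunkStore()
store.write("c1", b"telemetry sample")
store.primary["c1"] = b"telemetry sampl3"  # simulate silent corruption on disk
restored = store.read("c1")
print(restored)
```

Here the corrupted primary chunk fails the signature comparison on read, so the store returns the replica's contents and overwrites the bad chunk with them.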
ParaScale has added multi-tenant FTP support, with a Virtual File System (VFS). Each virtual file system acts as a unique FTP server with user authentication against an external directory to protect shared file access. These VFSs can also be encrypted, so if a disk or node is stolen the data on it cannot be recovered.
What is going on here? The security and self-healing stuff is a feature of many cloud storage offerings, but the ability to run apps directly on storage nodes is not. ParaScale says it "optimises performance through parallel, multi-protocol access" but you'd think running apps on storage nodes would compromise storage performance. If the storage node CPU cycles are used for app processing then they are not being used for storage I/O.
Perhaps ParaScale sees a potential future as a cloud data centre utility, offering both cloud data storage and application processing of that data in its cloud. We'll have to see where Fehrstrom takes the company and whether customers, managed service providers we guess, like the idea. ®