Another piece of analytics puzzle snaps home
IBM Buys Platform
For the last couple of years, I’ve been yammering about how enterprise analytics (or Big Data, or Predictive Analytics) is going to be the next big thing in business and thus enterprise computing. The major vendors, including IBM, Oracle, HP, and Microsoft, are on board along with pioneers like SAS and Teradata. Everyone is busy building out their respective stories, often aided by purchasing specialized ISVs.
I think IBM’s acquisition of Platform Computing will turn out to be one of their more savvy moves and will have a considerable impact on IBM’s enterprise analytics offerings. Platform Computing is the leader in the market for HPC cluster management and optimization software.
Right now, their offerings are primarily used by traditional scientific computing customers like research labs, life sciences, energy exploration, and the like. They’re the dominant ISV in what is a bit of a niche market (relative to enterprise software). Platform is successful, but its upside is limited by the markets it serves.
The enterprise analytics trend, coupled with the IBM purchase, changes everything for Platform and puts them in position to sell their wares to a much larger set of customers. Some of Platform’s tastiest secret sauce is in the way their offerings allow customers to manage workloads on massive clusters.
Platform was one of the main players in the grid computing boomlet in the early 2000s. For those who weren’t around then, or don’t remember, grid computing allows a large computing job (or jobs) to be parceled out to many heterogeneous nodes.
The grid’s head node manages the job, tracks progress, and assembles the results for presentation to the user. It was a hot technology back in the day but was overshadowed by virtualization and relegated to HPC, where it proved quite useful.
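To make the head-node role concrete, here is a minimal sketch of the parcel-out, track, and assemble pattern described above. It is not Platform's software or API: a local thread pool stands in for the grid's nodes, and `split_job`, `node_task`, and `run_on_grid` are hypothetical names invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_job(data, n_chunks):
    """Parcel a list into roughly equal chunks, one per worker node."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def node_task(chunk):
    """Hypothetical unit of work a single node performs: sum of squares."""
    return sum(x * x for x in chunk)


def run_on_grid(data, n_nodes=4):
    """Head-node role: dispatch chunks, track progress, assemble the result.

    Assumption: threads in one process stand in for remote grid nodes.
    """
    chunks = split_job(data, n_nodes)
    partials = []
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        futures = [pool.submit(node_task, chunk) for chunk in chunks]
        # as_completed yields each future as its node finishes,
        # which is where a real head node would record progress.
        for finished, future in enumerate(as_completed(futures), start=1):
            partials.append(future.result())
            print(f"{finished}/{len(futures)} chunks complete")
    # Assemble the partial results into the answer shown to the user.
    return sum(partials)


if __name__ == "__main__":
    print(run_on_grid(list(range(10))))
```

A real grid adds what this sketch leaves out: heterogeneous nodes, failure handling, and data movement between machines.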
I think that this grid technology is going to be critical in enterprise analytics. The major vendors are primarily pitching analytic hardware/software/service bundles which have great performance – and serious price tags. While these products are a good fit for some customers, typically the largest, it’s not how most customers are going to adopt enterprise analytics.
Most customers are going to dip their toes into the enterprise analytic waters – they aren’t going to dive in. They aren’t going to heavily invest in anything that doesn’t have an airtight business case – and as powerful as enterprise analytics can be, it’s damned hard to justify millions or tens of millions of dollars for a single-purpose system. Customers want to believe that the beans are magic, but they need proof.
To prove out the benefits, they’ll buy the software and a small set of systems, or they’ll run it on gear they already have. This is where Platform and their cluster/grid management software come in. They give customers the ability to deploy analytic tasks opportunistically on large numbers of discrete heterogeneous systems. Platform’s software deploys tasks to sub nodes, monitors them to make sure they’re chugging along, and ensures that the quality of service requirement for the particular job is met.
These capabilities are somewhat akin to what you might find in a full-featured virtualization management suite, but the virtualization providers come up a bit short when it comes to granular workload management on a large number of systems. In simple terms, Platform gives customers the ability to spread a computing task to as many different systems as necessary to complete the task in the timeframe allowed – while not disturbing other work that’s happening on those systems.
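The "as many systems as necessary, in the timeframe allowed, without disturbing other work" idea can be sketched as a simple placement calculation. This is an illustrative assumption about how such a scheduler might reason, not a description of Platform's actual algorithm; `Node`, `pick_nodes`, and the work-units-per-hour model are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: float  # work units per hour the node can deliver in total
    in_use: float    # work units per hour already consumed by existing work

    @property
    def spare(self):
        """Capacity left over once the node's existing work is protected."""
        return max(0.0, self.capacity - self.in_use)


def pick_nodes(nodes, work_units, deadline_hours):
    """Choose nodes whose spare capacity can finish the job by the deadline.

    Greedy sketch: take the nodes with the most spare capacity first and
    stop as soon as the combined spare rate covers the required rate.
    Returns None when even all spare capacity together is not enough.
    """
    needed_rate = work_units / deadline_hours
    chosen, rate = [], 0.0
    for node in sorted(nodes, key=lambda n: n.spare, reverse=True):
        if rate >= needed_rate:
            break
        if node.spare > 0:
            chosen.append(node)
            rate += node.spare
    return chosen if rate >= needed_rate else None


if __name__ == "__main__":
    cluster = [Node("a", 10, 8), Node("b", 10, 2), Node("c", 10, 5)]
    picked = pick_nodes(cluster, work_units=100, deadline_hours=10)
    print([n.name for n in picked] if picked else "deadline cannot be met")
```

Only spare capacity is ever counted, which is the sketch's stand-in for not disturbing the work already running on those systems.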
Platform gives IBM the ability to craft highly integrated technical computing or enterprise analytics bundles that are optimized for performance – which is a good thing. But Platform also gives IBM the foundation of a data center or enterprise-wide analytics infrastructure that will give customers the ability to run analytic workloads alongside their typical day-to-day processing.
This will deliver number-crunching goodness at a much lower price tag than the ‘Big Data in a box’ offerings and is a good, low-risk way for newbie customers to get into enterprise analytics.
COMMENTS
Microsoft is a major vendor?
"Microsoft is a major vendor"?
Isn't that a bit of a quaint statement now?
The problem is...
...this article sounds as though it's written entirely from information gleaned from somewhere like Gartner. Having once worked for a company which rates very highly in several Gartner categories, despite the actual quality of the offering, I can assure you that trends and predictions based on that sort of information are utterly useless.
About the best you can do is have experienced teams working on your projects. They will work around whatever tools corporate have imposed on them and make it work. If you have inexperienced teams, they won't be able to.
CPU yes, but what about storage
The author asserts that running on existing hardware will lower the cost of big-data analytics, but a key point of "Big Data" is not CPU load; it is that "you have lots of low value data to work with". Platform doesn't address that story; they may have better scheduling than Apache Hadoop's out-of-the-box schedulers, but their storage story is the same: run HDFS for location-aware storage.
No doubt IBM's story will become that of IBM's grid story: use GPFS, but that increases the cost of storage in exchange for location-independence, which limits the amount of data you can retain.
