Original URL: http://www.theregister.co.uk/2009/07/28/ibm_bao_boxes/

IBM iron predicts the future

BAO boxes combine warehousing, analytics

By Timothy Prickett Morgan

Posted in Servers, 28th July 2009 22:01 GMT

Back in May, at its annual day to preach to IT and Wall Street analysts, IBM laid out its vision for providing real-time, predictive analytic systems that will allow managers to take longer lunches and take credit for ideas that are not their own.

Well, something like that.

IBM wants to take control over the business analytics and optimization (BAO) segment of the IT space, which Big Blue reckons is growing at a significantly faster pace than the traditional back-end systems from which it earns its bread and butter. At its May confab, it said it would deliver systems that were highly optimized for specific industries to not only do data warehousing, but also provide the aforementioned analytics.

Today, IBM trotted out the first of these specialized systems, which it calls the Smart Analytics System. It also provided a technology preview of an add-on appliance for its System z mainframes, called the Smart Analytics Optimizer, that Big Blue said will substantially speed up queries on those mainframes without severely impacting the performance of the applications and batch jobs that run on them.

That the Smart Analytics machinery was previewed today, at the same time that IBM announced its $1.2bn acquisition of predictive analytics software maker SPSS, was a coincidence. But IBM needed the predictive analytics software created by SPSS to round out its BAO boxes, which had real-time information culled from data warehouses but which will now be able to predict business - presumably about as well as we now predict the weather or deals like IBM taking over SPSS, which was on no one's radar.

But forget those cynical thoughts for now.

IBM is excited about the BAO opportunity and market that it is largely defining by putting ERP, CRM, SCM, and other transaction-processing systems on one side of the data center and putting data marts, data warehousing, and analytics on the other.

The business automation space, explained Steve Mills, general manager of IBM's Software Group, accounts for about $566bn in sales - 2008 data, repeated from the analyst day event - but the growth here is about 3 per cent compounded over a few years (from 2007 through 2012, actually).

While the BAO opportunity was only about $105bn globally last year, its revenue growth is projected to be 8 per cent over the same term. What Mills did not say today, but what Frank Kern, general manager of IBM's Global Business Services group, said back in May is that within the next five years, Big Blue expects that it will get as much revenue from BAO systems as it drives with ERP systems today.
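To put the two growth rates side by side, here is a rough back-of-the-envelope sketch in Python. It assumes, for illustration only, that the quoted rates are compounded annually on the 2008 figures for the four years to 2012; the function name `project` is made up for this example, and the outputs are not IBM's own projections.

```python
def project(base_bn: float, annual_growth: float, years: int) -> float:
    """Compound a base-year figure (in $bn) forward at a fixed annual rate."""
    return base_bn * (1 + annual_growth) ** years


# Figures quoted in the article: 2008 market sizes in $bn and compound growth rates.
# Assumption: the rates apply to the 2008 base for each year through 2012.
YEARS = 2012 - 2008

automation_2012 = project(566, 0.03, YEARS)  # business automation at 3 per cent
bao_2012 = project(105, 0.08, YEARS)         # BAO at 8 per cent

print(f"Business automation in 2012: ${automation_2012:.0f}bn")
print(f"BAO in 2012:                 ${bao_2012:.0f}bn")
print(f"BAO relative to automation:  {105 / 566:.1%} -> {bao_2012 / automation_2012:.1%}")
```

Under those assumptions, BAO grows faster in percentage terms but still adds fewer absolute dollars than the much larger business automation market over the same period.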

This BAO opportunity is why IBM shelled out $5bn in November 2007 to buy Cognos, why it is spending $1.2bn today to buy SPSS, and why it has spent over $10bn since 2005 buying software companies or investing in software development for database, data warehousing, and related information-management technologies.

Rather than sell general-purpose servers with a stack of operating system, database, and clustering software with layers of Cognos and SPSS software running atop that, as IBM and other system makers have been doing for years to build data warehouse and analytic systems, the Smart Analytics System will be a tightly integrated, highly optimized setup.

As Mills puts it, such tight integration and optimization is necessary to bring ease of sale and support to IBM and ease of consumption and use for end users. The only way to get orders-of-magnitude gains in performance, according to IBM's smarties, is to create hybrid server and storage setups that bring various hardware and software technologies together and sell them as a single unit.

This is increasingly how supercomputers are built and sold, and this is probably how plenty of systems will be sold. Looking ahead, Mills - who is not a systems guy - said that it would not be unreasonable to see hybrid and tuned systems like the BAO boxes account for a double-digit share of ongoing server sales. That's as good as blade servers.

The problem with giving customers highly tuned and optimized systems is that they have so many different workloads that it's hard to generalize. But there are places where IBM can generalize, and do custom integration and programming if necessary for special cases, as it does now.

Mills says that optimized systems are necessary to drive performance up and costs down, but that each industry has its own algorithms, its own data volumes and types, and its own latency requirements. "What you need to do fraud analysis for real-time banking transactions is not the same setup you need for customer loyalty applications at retailers," he warns.

IBM has committed to offer hybrid and tuned machinery for six different areas, and where appropriate, to deliver them both as complete systems - with a single product number, preconfigured and ready to go - and as cloud infrastructure that customers can rent from the IBM Cloud.

These areas include analytics, collaboration, application development and testing, virtual desktop, virtual infrastructure, and business services. Because of data security issues and the sheer volume of data in warehouses, there will not be an IBM Cloud version of this BAO box. But IBM can, and probably will, offer to run and manage a local copy of the BAO box in your own data center, but with CloudBurst cloud infrastructure.

The smarts behind "Smart Systems"

The idea of the "Smart Systems," says Scott Handy, vice president of marketing and strategy for the Power Systems division that creates and sells Power-based servers, is to take the 8,000 engagements that IBM has done relating to workload optimization and push performance up an order of magnitude compared to general-purpose setups.

At the same time, "Smart Systems" will get these optimized systems into the data center faster because IBM will use its deep hardware and software skills and broad industry knowledge to put it all together in a way that requires fewer IT people to install and maintain.

The other idea is to keep the architecture flexible enough that it can adapt and evolve, not like a sealed server-appliance box that's so tightly integrated that it can't be changed without gutting it.

No one wants to do a forklift upgrade to move from Power6 to Power7 iron after they have this running.

The Smart Analytics System is based on IBM's Power 550 midrange servers and includes its AIX operating system, DB2 database, and various Cognos data-warehousing tools. It will come with a variety of add-on modules and, as its name suggests, is designed to be a data warehouse with analytics built in.

Using the Cognos Mixed Marketing test, the BAO box was able to deliver three times the performance of a standard n-tier setup of Cognos software and took 50 per cent less floor space. When the SPSS deal is done, IBM plans to weave predictive analytics into the real-time analytics that were already part of the Cognos tools.
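Taken together, the two benchmark claims imply a gain in performance per unit of floor space. A minimal Python sketch of that arithmetic, with the n-tier setup normalised to 1.0 as an illustrative baseline (the variable names are not IBM's):

```python
# Relative figures from the benchmark claim, normalised to the n-tier baseline.
baseline_perf, baseline_floor = 1.0, 1.0

bao_perf = 3.0 * baseline_perf             # "three times the performance"
bao_floor = (1 - 0.50) * baseline_floor    # "50 per cent less floor space"

# Performance per unit of floor space, relative to the baseline.
density_gain = (bao_perf / bao_floor) / (baseline_perf / baseline_floor)

print(f"Performance per unit of floor space: {density_gain:.1f}x the baseline")
```

This only combines the two quoted ratios; it says nothing about price, power draw, or how the benchmark was configured.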

The BAO box is expected to be announced formally in the middle of September and will ship by the end of September. And it is fairly safe to assume that IBM will be charging a premium to cover some of the integration and optimization of the system components even as it does some discounting because the products are bundled.

IBM gave the impression that the Smart Analytics System will be sold using the tried-and-true "lower total cost of ownership" argument that has been used to push IBM mainframes and minicomputers over the decades.

But it's far from clear that IBM can charge a huge premium over general-purpose data warehouse and data analytics setups, even if it does offer performance and other benefits such as integrated support for the entire stack of hardware and software and a twice-a-year checkup from IBM to make sure the BAO box has all the right optimizations and has not drifted.

IBM did not provide pricing in its Tuesday announcement, and it will be surprising to see prices on the BAO box even in September. What it did say is that the cost of supporting the BAO box will be about 50 per cent lower than a general-purpose data warehouse after the installation is complete, mainly because it takes fewer database administrators, system administrators, and programmers to keep it all humming.

IBM said that it has four customers testing the BAO boxes, and one of them is a "large government agency" that's using IBM and Northrop Grumman as contractors to build a data warehouse that can process 20,000 complex queries a day, has hundreds of terabytes of active data, and 20 petabytes of information crammed into its data warehouse.

The other item IBM talked about on Tuesday, the Smart Analytics Optimizer, is aimed specifically at boosting the database-query performance of the servers that it's attached to. In the fourth quarter, this appliance - and it really is an appliance - will attach to IBM's System z mainframes through an Ethernet link. The DB2 database running on the mainframe will be able to move chunks of data over to the appliance's main memory and offload complex queries to this appliance instead of trying to run them on the mainframe itself.

The Smart Analytics Optimizer does not run on the System z Integrated Information Processor (zIIP), a so-called "specialty engine" on IBM mainframes that can accelerate certain DB2 functions and costs about a quarter of the price of a real mainframe engine for supporting z/OS and DB2.

IBM is being vague about exactly what this Smart Analytics Optimizer appliance is, but says the important thing is that it allows mainframe shops to do the kind of complex queries they want to do on mainframes but can't because when their transaction and batch applications are running, queries take hours to run.

This appliance can apparently speed up certain complex queries by as much as 50 times, thanks to vector coprocessors and in-memory processing inside the box. This means that many companies will not even have to set up a data warehouse and copy data to it to do complex queries.

This is key because you can't do real-time analytics on data that is stale because it has been copied out of production systems, and predictive analytics has to be based on the most current data available in those production apps.

IBM said that the appliance will cut the cost of the complex queries that mainframe shops want to run by two orders of magnitude. With no official prices for mainframes, it's hard to reckon what the optimizer appliance will cost. But it would be hard to be more expensive than running a relational database on a mainframe, and it has to be less expensive than a zIIP engine as well, or there's no point in doing the launch.

IBM has not said if these DB2 acceleration appliances can be daisy-chained off mainframes or be used in some parallel fashion. But this would be useful. IBM would not confirm that it will support other platforms with the DB2 query appliance, but Mills hinted that it would not be difficult to support other DB2 variants running on other IBM platforms. ®