Oracle and IBM fight for the heavy workload
Netezza in the net
IBM acquired Netezza in September 2010 for $1.7bn to bolster its data warehousing business, and for good reason. Netezza is an upstart data warehouse appliance maker which has heavily customised the open source PostgreSQL database and created a field-programmable gate array (FPGA) co-processor that does SQL pre-processing, much like Oracle does with the Exadata storage servers inside its Exadata clusters.
Not only that, the company built its TwinFin appliances on IBM's BladeCenter blade servers and looked to be getting just a bit too cosy with Japanese server maker NEC. IBM could not afford to let Netezza fall into the hands of competitors. It stands to reason that over the long haul, IBM will be able to take some of the techniques used in the Netezza appliances and apply them to its various parallel database systems.
The original Netezza appliances were based on Power architecture (and did not come from IBM, but one of its OEMs). The TwinFins, based on IBM's blades, came out in August 2009. They pair an HS22 two-socket Xeon 5500 blade with a co-processor blade which has eight FPGAs on it – one for each x86 core.
This combination is called an S-Blade; the FPGA speeds up the filtering of data moving off storage before being passed on to the PostgreSQL database, as well as doing complex sorting and joins of database tables and managing compression.
In the wake of the acquisition, IBM rebranded this parallel database machine the Netezza 1000. A single rack of the Netezza 1000 offers 12 of these S-Blades, which have a total of 96 x86 cores (the same as the Exadata X2-2). The machine has 32TB of usable data space uncompressed, and offers load rates of 3TB per hour and backup rates of 4TB per hour.
The HS22 blades run Red Hat Enterprise Linux 5.3, and IBM has tools to port databases from DB2, Informix, SQL Server, MySQL, Oracle, Teradata, Sybase and Red Brick databases to the Netezza variant of PostgreSQL.
The Netezza 1000 appliance can scale up to ten racks in a single image, two more racks than the Exadata appliances.
In June IBM fattened up the Netezza appliances with a C1000 machine, which has four S-Blades and a dozen disk enclosures in a rack with 144TB of uncompressed database space. (There are obviously a lot fewer processors to chew through this data per rack.)
The machine scales up to eight racks, offering 32 S-Blades with 256 cores and FPGAs and 1.15PB of user space. Netezza gets a little less than a 4:1 ratio with data compression, boosting that user capacity.
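The rack arithmetic for the two Netezza appliances can be checked with a few lines of Python. This is only a sketch built from the figures quoted above; the `Appliance` class, the `totals` helper and the field names are illustrative and are not anything IBM ships.

```python
from dataclasses import dataclass

# One FPGA per x86 core on each S-Blade (two-socket Xeon 5500 blade), as quoted above.
CORES_PER_S_BLADE = 8


@dataclass(frozen=True)
class Appliance:
    """Per-rack figures as quoted in the article (illustrative container)."""
    name: str
    s_blades_per_rack: int
    tb_per_rack: int   # uncompressed user space per rack, in TB
    max_racks: int     # largest single-image configuration


NETEZZA_1000 = Appliance("Netezza 1000", s_blades_per_rack=12, tb_per_rack=32, max_racks=10)
C1000 = Appliance("C1000", s_blades_per_rack=4, tb_per_rack=144, max_racks=8)


def totals(appliance: Appliance, racks: int) -> dict:
    """Return S-Blade, core and uncompressed-TB totals for a given rack count."""
    if not 1 <= racks <= appliance.max_racks:
        raise ValueError(f"{appliance.name} scales from 1 to {appliance.max_racks} racks")
    blades = racks * appliance.s_blades_per_rack
    return {
        "s_blades": blades,
        "cores": blades * CORES_PER_S_BLADE,
        "tb": racks * appliance.tb_per_rack,
    }
```

Calling `totals(C1000, 8)` gives 32 S-Blades, 256 cores and 1,152TB, which is the 1.15PB figure above, while a single Netezza 1000 rack comes out at 96 cores.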
IBM charges about $2,500 per terabyte for the high-capacity Netezza appliances and about $10,000 per terabyte for the regular appliances. Disks are cheaper than CPUs and FPGAs.
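To get a rough feel for what those per-terabyte prices mean per rack, here is a back-of-the-envelope sketch. It assumes, and this is an assumption rather than anything IBM has said, that the per-terabyte price applies to the uncompressed user space quoted for each rack; the function and tier names are made up for illustration.

```python
# Approximate list prices per terabyte in US dollars, as quoted above.
PRICE_PER_TB_USD = {
    "high_capacity": 2_500,   # C1000-style appliances
    "regular": 10_000,        # Netezza 1000-style appliances
}


def rack_price_estimate(uncompressed_tb: int, tier: str) -> int:
    """Estimate a rack price, ASSUMING pricing is per uncompressed terabyte."""
    return uncompressed_tb * PRICE_PER_TB_USD[tier]


# Per-rack capacities quoted earlier: 144TB (C1000) and 32TB (Netezza 1000).
c1000_rack = rack_price_estimate(144, "high_capacity")
netezza_1000_rack = rack_price_estimate(32, "regular")
```

Under that assumption the two racks land in the same ballpark, at $360,000 and $320,000 respectively: the high-capacity rack simply spends the money on disks rather than on S-Blades.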
The other IBM machines
While Oracle was busy eating Sun Microsystems and before IBM had acquired Netezza, IBM rolled out a line of parallel data warehousing and analytics machines called the Smart Analytics System.
The original machines were based on clusters of mid-range Power 550 servers configured with dual-core Power6 processors, 32GB of memory and Gigabit Ethernet switches from Juniper Networks linking the nodes together. Fibre Channel adapters linked out to shared DS5300 disk arrays, cross-coupled to four server nodes.
The server nodes in the original Smart Analytics System ran AIX 6.1 and IBM's General Parallel File System, as well as Tivoli System Automation to manage each node. One node in the cluster is equipped with Cognos 8 modules, including BI Server, Go Dashboard and BI Samples.
The other three machines are carved up into a dozen logical partitions that run IBM's InfoSphere Warehouse variant of the DB2 V9.5 database and offer 12TB of user space. The setup could expand to 53 database nodes, supporting about 5,000 named users and offering 200TB of space in 19 racks.
In April 2010, IBM widened the Smart Analytics System fleet to include a Power7-based system cluster. It also added variants based on System x x86 servers and System z mainframes, while updating the underlying database to the DB2 V9.7 release and offering flash storage as an option to boost performance on I/O-heavy SQL processing.
The Smart Analytics System 5600 is based on IBM's System x3650 M3 server, a 2U rack server that has two Xeon 5600 series processors from Intel. IBM is plunking the six-core Xeon X5670 running at 3.33GHz into the server nodes, with 8GB, 32GB or 64GB memory options.
The machine can have up to 288GB of main memory and has room for 16 disks or SSDs. These server nodes are paired with IBM's DS3500 disk arrays. They have 24 x 2.5in media bays, can support up to 192 devices using add-on enclosures, and can attach to the servers through Fibre Channel or SAS adapters.
The server nodes run SUSE Linux Enterprise Server 11. The software stack includes the same InfoSphere Warehouse extensions to DB2 and Cognos 8 analytical tools as the original Smart Analytics System, and Fusion-io SSDs are offered as an alternative to disks for boosting IOPS.
IBM is also offering an x86 variant of the Smart Analytics System called the 5710, which is based on the System x3630 M3 server. This is a 2U rack server with 14 x 3.5in or 28 x 2.5in disks. The machine is based on 3.06GHz Xeon X5667 processors and maxes out at 192GB of main memory. So you can put more disks in this 5710 node but less memory than in the 5600 node.
The Smart Analytics System 7700, announced in October 2010, is an upgrade of the original cluster, and is based on IBM's entry Power 740 server announced in August 2010. The Power 740 is a two-socket server that can be equipped with Power7 processors with four, six or eight cores running at between 3.3GHz and 3.7GHz. The machine tops out at 256GB, but should be able to do 512GB once IBM supports 16GB DDR3 memory sticks in the box.
Depending on the model, you can have six or eight 2.5in 600GB SAS drives in the unit. This 7700 parallel system uses the same DS3500 external disk arrays and runs the InfoSphere Warehouse and Cognos 8 stack on AIX – interestingly, on AIX 6.1 still, not on the AIX 7.1 that was announced last year and tuned for Power7 iron.
IBM's Smart Analytics System 7700 DW/BI cluster
Finally, because IBM has to show love to its mainframe base, it also rolled out the Smart Analytics System 9600, which uses z/OS partitions on a System z mainframe as the database server and z/Linux partitions to run the Cognos analytics.
In a base configuration, the setup is based on the System z10 BC midrange mainframe. (It has not yet been updated to the System zEnterprise 114 announced in July with faster mainframe engines.)
The mainframe version of the Smartie box has two partitions on an entry BC-class mainframe, one running z/OS and DB2 and the other running Linux and the Cognos tools. The machine can run databases up to 100TB in size, and you can expand capacity with Parallel Sysplex clustering as needed. IBM tosses in some DS8700 external storage arrays to hold the data, and says that the 9600 box can support up to 10,000 users.
Pricing information was not announced for the Smart Analytics System 9600, but a rack of the 5600-class machines runs to $2.14m, while a 7700-class rack costs $4.49m. If you want to flesh out the 5600-class rack with Fusion-io SSDs, then you are in for $3.68m.
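The flash premium on the 5600-class rack is easy to work out from those numbers. The snippet below is a small sketch using only the rack prices quoted above; the dictionary keys and the `ssd_premium` helper are illustrative names.

```python
# Rack prices in US dollars, as quoted above.
RACK_PRICE_USD = {
    "5600": 2_140_000,
    "5600_ssd": 3_680_000,   # 5600-class rack fleshed out with Fusion-io SSDs
    "7700": 4_490_000,
}


def ssd_premium(prices: dict) -> tuple:
    """Return the SSD uplift on a 5600-class rack as (dollars, fraction of base price)."""
    base = prices["5600"]
    with_ssd = prices["5600_ssd"]
    extra = with_ssd - base
    return extra, extra / base


extra_usd, extra_fraction = ssd_premium(RACK_PRICE_USD)
```

That works out to an extra $1.54m, or roughly 72 per cent on top of the disk-only rack, which still leaves the SSD-equipped 5600 cheaper than a 7700-class rack.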
IBM is not yet offering SSDs on the Smartie 7700s, but probably will when the Power7+ processors come out.