Original URL: http://www.theregister.co.uk/2010/10/25/teradata_appliance_refresh/

Teradata pumps data warehouses with six-core Xeons

Flashy BI box cures performance anxiety

By Timothy Prickett Morgan

Posted in Servers, 25th October 2010 16:21 GMT

Teradata might be the pioneer of data warehousing on cheap x64 server clusters and the use of appliance packaging to tune machines and their software to attack specific workloads, but Oracle and IBM want to eat Teradata's lunch. And its breakfast and dinner, too. That means Teradata has to keep upgrading its hardware and database software and partnering to bring more functionality onto its data warehouse and analytics appliances, making them more useful to the customers who shell out big bucks for them.

At the Partners user group conference in San Diego today, Teradata launched a completely refreshed lineup of entry, midrange, and high-performance data warehousing and analytics appliances, and also tossed in a new flash-heavy extreme performance machine to take on Oracle's new Exadata X2-8 appliances and IBM's Smart Analytics System (SAS) appliances as well as the Netezza appliances that will soon become part of the Big Blue product catalog unless someone swoops in and tries to steal away Netezza. (This means you, NEC or Dell.) EMC is also putting the heat on Teradata with its own Greenplum Data Computing Appliances, and has much deeper pockets than a free-standing Greenplum could ever hope to have.

In a way, the fact that Oracle, IBM, and EMC are stepping up their game into BI and analytics is good for Teradata because these companies have much bigger marketing budgets than Teradata and any expansion of awareness is good for all players. With Teradata boasting a market capitalization of $6.4bn and an acquisition cost of at least $10bn, Teradata is too expensive for Oracle, IBM, or EMC to acquire without doing a merger or taking out some big loans. And that means Teradata, which had $1.82bn in sales and $288m in net income in the trailing four quarters, is going to have to compete against bigger firms that may not share its deep expertise but which have plenty of data center customers who trust their iron and software just the same.

First and foremost, Teradata has to stay in the performance game, and that means getting current on new Xeon-based server iron a bit faster than it has done thus far as well as giving up and playing the TPC-H benchmarketing game to prove the bang and the bang for the buck that its x64 clusters running the eponymous database software can deliver for customers.

Today, Teradata is delivering new server nodes for its clusters based on the "Westmere-EP" Xeon 5600 processors, which are the six-shooter x64 processors that Intel debuted way back in March of this year. The new six-core chips offer roughly 40 to 50 per cent more raw performance than the quad-core Xeon 5500 processors used across the Teradata appliance lineup. And Teradata says that the new Teradata 13.10 clustered database, also announced today alongside the new Westmere-EP iron, has been tuned not only for the Xeon 5600s but also to take full advantage of the HyperThreading simultaneous multithreading (SMT) in the chips, so that each two-socket node in the Teradata clusters makes full use of its 24 threads.
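For those who like to check the sums, here is a minimal Python sketch of where the 24 threads per node come from. The function and constant names are our own, invented for illustration; the two-threads-per-core figure is how HyperThreading works on these Xeons.

```python
# Hypothetical helper for illustration: hardware threads in one server node.
SOCKETS_PER_NODE = 2   # two-socket nodes, per the article
CORES_PER_SOCKET = 6   # six-core Westmere-EP Xeon 5600
THREADS_PER_CORE = 2   # HyperThreading exposes two hardware threads per core


def threads_per_node(sockets=SOCKETS_PER_NODE,
                     cores=CORES_PER_SOCKET,
                     smt=THREADS_PER_CORE):
    """Return the number of hardware threads a node presents to the OS."""
    return sockets * cores * smt


print(threads_per_node())         # six-core Xeon 5600 node
print(threads_per_node(cores=4))  # quad-core Xeon 5500 node, for comparison
```

The second call shows the hardware thread count of a two-socket, quad-core Xeon 5500 node with HyperThreading switched on. Whether the older Teradata software actually exploited those threads is not something the company has said.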

The performance gains depend on what Teradata appliance customers buy, and in some cases, the addition of flash disks as well as new data compression techniques, combined with the CPU performance bump, are yielding big performance increases for Teradata machines.

As El Reg previously divulged, Teradata resells tweaked versions of Dell PowerEdge servers as well as LSI and EMC storage in its various appliances.

At the bottom of the refreshed Teradata lineup is the Data Mart Appliance 560, a single-rack appliance based on two-socket server nodes using Intel's six-core 2.93 GHz Xeon 5670 processors and supporting up to 48 GB of DDR3 main memory. (The spec sheet says DDR2, but that is not possible.) The DMA 560 has four hot-swappable 10K RPM SAS disks with 600 GB of capacity, and has various network adapters to link it to the outside world as well as ESCON and FICON adapters for linking the appliance and its database to IBM mainframes.

Up to three 24-drive storage trays can be added to the box, using either 300 GB or 600 GB disks in 2.5-inch form factors, providing either a 5.8 TB or 11.7 TB user data capacity for customers. (Teradata recommends RAID 1 mirroring, but does not require it; RAID 0 striping and RAID 5 data protection are also supported in the disk controllers embedded in the servers.) The cluster runs Novell's SUSE Linux Enterprise Server 10, the Teradata 13.10 database, and has a management console that runs on Microsoft's Windows Server 2003. This DMA 560 machine is being positioned not only as an entry data mart box, but also as a BI application test and development machine.

The flagship product in the Teradata lineup is the Active Enterprise Data Warehouse 5650, which is a multi-rack solution that scales up to 86 PB of user data capacity in the warehouses. Teradata says that by upgrading to the new processor nodes and the Teradata 13.10 database, customers using the current 5600H nodes will see around a 43 per cent performance boost per node. Teradata is putting 300 GB and 450 GB Fibre Channel disks spinning at 15K RPM in the Storage 6844 arrays it peddles for the data warehouse racks.

The EDW 5650 warehouse comes in two flavors. If customers want to mix and match the new nodes with prior nodes, they have to buy the 5650C server nodes, which have only one six-core processor and 48 GB of memory per node, plus the BYNET V4 proprietary point-to-point, fault-tolerant interconnect that Teradata uses in many of its warehouses. The 5650C nodes support up to 11.2 TB of capacity per node without compression (13.8 TB with 30 per cent compression) with 100 drives per node. The 5650H nodes have two Xeon X5670 processors and up to 96 GB of memory and 188 disk drives per node.

That works out to 21 TB of user capacity per node with 450 GB drives and 26 TB with typical data compression rates. In a standard configuration, the EDW 5650 data warehouses can scale to 1,024 nodes, but if you need more crunching, Teradata can push it up to 4,096 nodes. That 86 PB maximum is for 4,096 nodes without any compression.
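The scaling arithmetic is easy enough to reproduce. The Python sketch below multiplies the per-node user capacity by the node count; the function name is hypothetical, and we are assuming decimal units (1 PB = 1,000 TB), which is what makes the quoted 86 PB figure come out.

```python
# Hypothetical helper for illustration; assumes decimal units (1 PB = 1,000 TB).
TB_PER_PB = 1000


def warehouse_capacity_pb(nodes, tb_per_node):
    """Total user capacity in PB for a cluster of identical nodes."""
    return nodes * tb_per_node / TB_PER_PB


# Figures from the article: 21 TB per 5650H node uncompressed,
# 26 TB per node with typical compression.
for nodes in (1024, 4096):
    raw = warehouse_capacity_pb(nodes, 21)
    packed = warehouse_capacity_pb(nodes, 26)
    print(f"{nodes:>5} nodes: {raw:.1f} PB uncompressed, {packed:.1f} PB compressed")
```

At 4,096 nodes the uncompressed figure lands at roughly 86 PB, matching the maximum Teradata quotes. The compressed figures are simply our extrapolation from the 26 TB per node number, not something Teradata has published.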

The EDW 5650 can support SUSE Linux Enterprise Server 10 or Windows Server 2003; both have to be at 64-bit version levels. No word on when SLES 11 and Windows Server 2008 R2 will be supported.

Extreme data

The Extreme Data Appliance lineup, which is aimed at deep-dive analytics and analytical archiving jobs, is updated today with the EDA 1650, which sports the six-core Xeon X5670s in the server nodes and now 2 TB drives, boosting the capacity of the appliance to 187 PB. (The EDA 1650 only supports SLES 10 and runs the BYNET software stack atop a redundant Ethernet backbone, akin to InfiniBand over Ethernet or Fibre Channel over Ethernet.)

The Data Warehouse Appliance 2650 is also out today, again with the Intel six-shooters (both sockets and 96 GB of memory) and sporting 2.5-inch drives in 300 GB or 600 GB capacities or 2.5-inch drives with 2 TB capacities, yielding 16.4 TB of user space for 300 GB disks, 32.2 TB using 600 GB disks, and 54.9 TB using 2 TB drives. The DWA 2650 can cram nine server nodes into a single cabinet, plus its disks, and using the new data compression algorithms in the Teradata 13.10 database and the extra CPU power, customers are seeing up to a 3.3X performance improvement, rack to rack, over the prior generations of machines.

This machine runs SLES 10 underneath the Teradata 13.10 database, and in fact, any Teradata 12.0 or higher release will run on the box if you don't want to upgrade your software just yet. Like the EDA 1650, the DWA 2650 runs the BYNET software stack atop Ethernet switches. The DWA 2650 is aimed at being a departmental warehouse that scales to six racks or 343 TB (not PB) of capacity.

The flashy EPA 4600 appliance from Teradata

Which leaves the brand new box that is based on flash storage instead of disk storage, the Extreme Performance Appliance 4600, which Teradata says has "blazing speed for hyper-analytics." The company says that this is the first data warehouse appliance to rely solely on flash-based storage, and that the EPA 4600 can deliver up to an 18X improvement in decision support query rates compared to the Active EDW data warehouses using spinning disks, with average query times almost four times faster.

The Teradata 13.10 database release has been tweaked to know how to make best use of solid state disks and makes full use of the Teradata Active System Management workload and performance management tools for the appliance family.

The EPA 4600 has the redundant 10 Gigabit Ethernet backbone linking the server nodes and running the BYNET fault tolerant software stack. The server nodes used in this appliance are based on the quad-core Xeon 5500 series processors, not the six-shooter 5600 series, for reasons that Teradata did not explain. The server nodes can have up to 96 GB of memory, four 450 GB disks (if you want to go there), and link out to SSD disk drives over 6 Gb/sec SAS links. Each server has four SAS links and can attach to a solid state disk tray that has eight 300 GB SSDs. Teradata has chosen Pliant Technology's Lightning flash drives (which come in 150 GB and 300 GB sizes and deliver 160,000 I/Os per second) and LSI's SAS controllers to attach them to the server nodes.

After formatting the 300 GB SSDs down considerably to extend their productive life, Teradata puts 24 drives in two trays, and carves them up into three segments to feed three different nodes. The user capacity per node is only 1 TB each with 40 per cent data compression activated, and 713 GB with no compression. That is giving back an awful lot of the capacity of the SSD drives, but customers want to get a lot of years out of their data warehouses.

A fully scaled EPA 4600 can give companies 17 TB of data space to chew on, but it can do it very fast thanks to the IOPs of the SSDs. The machine scales up to 24 nodes. The EPA 4600 only supports SLES 10, by the way.
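The flash capacity numbers hang together if you run them. The Python sketch below is our own back-of-envelope check, with invented names, and it makes two assumptions: decimal units (1 TB = 1,000 GB), and that "40 per cent data compression" means the box holds 40 per cent more user data than its uncompressed capacity.

```python
# Hypothetical helpers for illustration; decimal units assumed (1 TB = 1,000 GB).
GB_PER_TB = 1000
MAX_NODES = 24            # maximum EPA 4600 node count, per the article
USER_GB_PER_NODE = 713    # uncompressed user capacity per node, per the article
COMPRESSION_GAIN = 0.40   # assumption: 40 per cent more data fits when compressed


def total_user_tb(nodes, gb_per_node):
    """Uncompressed user capacity of the whole appliance, in TB."""
    return nodes * gb_per_node / GB_PER_TB


def effective_gb(gb_per_node, gain):
    """Effective per-node capacity in GB once compression is switched on."""
    return gb_per_node * (1 + gain)


print(f"Full system: {total_user_tb(MAX_NODES, USER_GB_PER_NODE):.1f} TB")
print(f"Per node, compressed: {effective_gb(USER_GB_PER_NODE, COMPRESSION_GAIN):.0f} GB")
```

Under those assumptions, 24 nodes at 713 GB apiece come to a shade over 17 TB, and a compressed node comes in just under 1 TB, both in line with the figures Teradata gave.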

The interesting thing that Teradata did not announce this morning is that it was putting flash drives into all of its various appliances to boost performance across the line. It is clear from the short-stroking on capacity that Teradata is using on SSDs (to borrow a phrase from the spinning disk industry) that the company does not yet fully trust the reliability of SSDs for long-term and heavy use, and rightly so since this technology has not been proven in the field under the kinds of conditions that data warehouses operate in.

It would not be surprising to see flash drives eventually embedded in all of Teradata's appliances. Especially if all of Teradata's rivals do it. Oracle is flash happy at the moment in its Exadata appliances, but is using flash to front-end disks and provide burst data rates to the server nodes that do the SQL crunching in the X2-8 warehouse. IBM is sprinkling SSDs from Fusion-io into its Smart Analytics System 5600 setups, also x64-based rack servers and the most similar boxes to what Teradata is peddling in the Blue appliance fleet.

The updated appliances will start shipping today. Teradata does not provide pricing on its data warehousing and analytics appliances, but Randy Lea, vice president of product marketing at Teradata, said that all of the appliances will have better bang for the buck and the company sliced its prices a bit on the base hardware, so this is not just a more-oomph-for-the-same-dollars play. The biggest price cuts were on the DWA 2650 machines, according to Lea. If you want to see a rationalization of Teradata's TCO pricing methodology without any actual prices, check this out. ®