Teradata adds hardware compression to data warehouses
Compress this, Larry
Teradata invented the data warehousing market and is not about to let Larry Ellison take it over without a fight.
At this week's Partners Conference in San Diego, where Teradata Labs develops some of the hardware and software that goes into its various data warehousing appliances, the company is previewing its next-generation Data Warehouse Appliance 2690 machines. The devices sport compression co-processors that will do two things: squeeze more data onto the appliance's disk drives and boost the performance of the systems, which now move compressed rather than uncompressed data around.
In case you can't keep track of Teradata's various warehouse systems and appliances, here's a quick graphic:
Teradata's data warehousing systems lineup
All of the machines are based on x86 servers that are OEMed from Dell and include Teradata's own eponymous massively parallel database.
Scott Gnau, president of the Teradata Labs division, tells El Reg that the forthcoming 2690 appliances will be able to process queries about twice as fast and store around three times the data in the same footprint as the 2650 appliances that they will replace. The appliance server nodes have been equipped with a co-processor specifically to compress and uncompress database files, and that co-processor is explicitly tuned to work with the Teradata 14 parallel database announced two weeks ago.
Teradata 14 is significant because it allows for columnar data structures to be stored in the database, which are more amenable to certain kinds of data compression than row-based data. With columnar data, the increase in performance and compression will be higher than on regular row-based data, Gnau says.
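To see why column-oriented layouts tend to squeeze down further, here is a minimal Python sketch. It is not Teradata's method: the company has not disclosed its algorithms, so zlib stands in as a generic compressor, and the table contents, column names, and row count are all invented for illustration. The same records are serialized once row by row and once column by column, then compressed.

```python
import random
import zlib


def build_rows(n, seed=0):
    """Build n made-up records: a random 8-digit amount plus two low-cardinality fields."""
    rng = random.Random(seed)
    regions = ["EMEA", "APAC", "AMER"]        # hypothetical values
    statuses = ["OPEN", "CLOSED", "PENDING"]  # hypothetical values
    return [
        (f"{rng.randrange(10**8):08d}", regions[i * 3 // n], statuses[i % 3])
        for i in range(n)
    ]


def row_layout(rows):
    """Serialize record by record, so the fields of each row sit next to each other."""
    return "\n".join(",".join(r) for r in rows).encode()


def column_layout(rows):
    """Serialize column by column, so all values of one field sit next to each other."""
    columns = zip(*rows)
    return "\n".join(",".join(col) for col in columns).encode()


def compressed_size(data):
    """Size in bytes after zlib compression (a stand-in for the undisclosed algorithm)."""
    return len(zlib.compress(data, 6))


rows = build_rows(10_000)
row_bytes = compressed_size(row_layout(rows))
col_bytes = compressed_size(column_layout(rows))
print(f"row layout:    {row_bytes} bytes compressed")
print(f"column layout: {col_bytes} bytes compressed")
```

Both layouts hold exactly the same bytes before compression. The columnar version comes out smaller because the repetitive region and status values end up in long runs instead of being interrupted by a random number on every row.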
Teradata is not saying what ASIC it is using to do the zipping and unzipping of data, but calls the chip a "data compression engine" that compresses data at the block level on disks. Gnau said that it is not using the Xeon 5600 processors inside the data warehouse appliances to do the compression, but rather offloading the job to co-processors. Gnau did confirm that this ASIC is welded onto a PCI-Express 2.0 card, and it seems likely that all of the Teradata machines will eventually have it slipped into their slots.
It is not clear how many of these compression engines are put into a single two-socket Teradata server node, but it wouldn't be surprising to see one for each CPU in the system. Teradata isn't saying what compression algorithms it is using, either. But it did say that this block-level compression is turned on by default, that all user data entered into a 2690 machine will be compressed, and that performance has been optimized on the assumption that compression is on. System tables and other system-level data are not compressed since they are used continuously as the 2690 runs.
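For readers wondering what block-level compression means in practice, a rough Python sketch of the general idea follows. It makes no claim about how Teradata's engine works: the 64KB block size, the function names, and the use of zlib are all assumptions made for illustration. Data is cut into fixed-size blocks and each block is compressed on its own, so a reader can fetch one block without unzipping everything around it.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, not a Teradata figure


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and compress each one independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_block(blocks, index):
    """Decompress a single block without touching its neighbours."""
    return zlib.decompress(blocks[index])


def read_all(blocks):
    """Decompress every block and stitch the original data back together."""
    return b"".join(zlib.decompress(b) for b in blocks)


# Made-up sample data standing in for user tables.
data = b"customer_id,region,amount\n" * 20_000
blocks = compress_blocks(data)

stored = sum(len(b) for b in blocks)
print(f"{len(data)} bytes stored as {stored} bytes in {len(blocks)} blocks")
```

The trade-off in any scheme like this is that smaller blocks make random reads cheaper, while larger blocks give the compressor more context and usually a better ratio.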
The Data Warehouse Appliance 2690, which is the fifth generation of hardware from the company, will run Teradata 13.10 or higher. So it doesn't require the new database version that will ship in December. The machines run SUSE Linux Enterprise Server 10 SP3 and have two six-core Xeon X5675 processors, which Intel launched in February as it ramped up the clock speeds on the Xeon 5600 chips. The 2650 appliances, which were announced this time last year, used the slower 2.93GHz processors. The server nodes in both the 2650 and the 2690 are fully populated with DDR3 main memory.
Using 300GB 10K RPM SAS drives, Teradata can give 18.2TB of usable database space in a rack without any compression; it also offers 600GB and 900GB disk options. The machine can scale across six racks using the Teradata Bynet software running on top of Ethernet switches. With those 900GB drives it can present 315.5TB of usable space for data warehouses to frolic within, and it delivers a scan rate of 38GB/sec per cabinet.
The Data Warehouse Appliance 2690 machines went into beta testing in September and will be available in the first quarter of 2012. Pricing has not yet been set.
In addition to previewing the new hardware, Teradata also announced a software tool called Unity, which has been in development for two years. Unity does query routing and database synchronization and updating across a Teradata system. Another tool called Data Mover plugs into Unity, allowing data to be moved more easily between different Teradata systems, as does Multi-System Manager, which provides a single console for managing multiple data warehouses. ®