New standard test of Big Data bang per system buck rolls out

A sim so good you could use it instead of Oracle or SAP?


There's a new big data benchmark in town: TPC-DS.

The Transaction Processing Performance Council still doesn't know how to do its own abbreviation after 24 years of existence, but it does know a thing or two about getting IT hardware and software vendors together and hammering out benchmark tests and pricing metrics to help server, storage, database, and middleware buyers try to figure out what they might want to buy and what kind of value they might expect from what they buy.

The TPC was founded in 1988 following an uproar in the server racket after IBM ran its own RAMP-C COBOL benchmark test on its AS/400 and System/38 minicomputers, pitting them against a bunch of Hewlett-Packard HP 3000s and Digital Equipment VAXes and showing (of course) that the IBM machines won out.

The initial TPC-A debit/credit benchmark that these and other vendors ratified through the TPC consortium was ridiculously simple by modern standards, but then again, so were the things that we were all doing with systems back then.

The TPC-C online transaction processing benchmark, which simulates the data processing associated with running a warehouse (the real kind, with forklifts) and looking up stock items and doing other transactions, is arguably the most successful comparative benchmark in history (certainly among those that provide both performance and pricing), but is getting a bit long in the tooth considering that the whole benchmark test can easily fit in main memory these days and the disk requirements of the TPC-C test are utterly ridiculous.

The TPC-E test, which simulates the data processing of an online brokerage and supports multi-tier configurations, was supposed to be a replacement for TPC-C when it debuted in March 2007. But relatively few TPC-E benchmark test runs have been done in the past five years – El Reg counts 55 machines, and they are all running the Windows stack – so the usefulness of TPC-E can be seriously called into question.

On the decision support/data warehousing front, TPC-D was the original benchmark, but fighting among the vendors in the TPC consortium caused it to be split into TPC-H and TPC-R. Basically, some companies were precompiling routines in the ad hoc query benchmark. That practice was eventually allowed in the TPC-R test, which vendors and users alike eventually shunned as being useless.

Work on a follow-on to TPC-H was started back in September 2004, when the TPC-D, TPC-H, and TPC-R tests were all active, and the idea then was to get back to a single test. The hope back then was to get TPC-DS ratified in late 2005 and into production in 2006. Clearly, this took a lot longer than expected.

And the TPC-DS test is now no longer being pitched as a kicker to the TPC-H ad hoc query test for data warehouses, but rather as a totally different test that simulates the big data processing associated with a modern retail operation. And thus, you might think, perhaps the TPC consortium should have called it TPC-BG.

What customers have asked the TPC to come up with is a test that can measure the performance of a single user (what it calls a power test), the performance of many users (a throughput test), and continuous data integration (extract, transform, and load, or ETL, work as well as trickle updates to the database).

The TPC says that the TPC-DS test has realistic table content and table scaling, has non-uniform data, and includes NULL values, which represent real-world database challenges. Furthermore, it has a large de-normalized schema, a large query set, complex queries (including tool-generated queries and modern SQL constructs such as SQL99 and OLAP constructs), and a mix of ad-hoc and reporting queries.

You can see the TPC-DS standard specification here (PDF).

The TPC-DS test models the decision support processing for a hypothetical retailer that has to manage a large number of products and sells its products through a nationwide chain of stores as well as having catalog sales and online sales.

The mix of sales is 50 per cent in brick and mortar shops, 30 per cent through catalogs, and 20 per cent through the online store. The simulated system tracks product inventories and ships them from simulated warehouses. It also records purchases, modifies prices according to promotions, creates dynamic web pages, and updates customer profiles. (If you are looking to start up a retail operation, the TPC-DS code might be cheaper than paying Oracle or SAP ...)

Here's the block diagram of the TPC-DS system:

TPC-DS benchmark block diagram

The simulated application has sales, inventory, shipping, planning, marketing, fraud analysis, and customer analysis modules, and the application includes both transaction databases and a data warehouse. In essence, it is a mix of TPC-C and TPC-H. The data warehouse is de-normalized with multiple snowflake schemas being used to create what is called a snowstorm schema. This setup can support the generation of batch reports, ad hoc queries, iterative OLAP queries, and data mining.

It is complicated and hairy – just like real-world decision support systems. The databases include 26 tables, including 7 fact tables (which hold 99 per cent of the total data in the system) and 19 dimension tables that glue them all together for the various types of queries. The tables have 30 or more columns and have a variety of character, decimal, and integer data and are, of course, indexed like hell. But complex data structures like materialized views, bitmaps, and join indexes are only allowed on the catalog sales channel.
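To make the fact-table and dimension-table split concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (`sales_fact`, `item_dim`), the columns, and the rows are hypothetical toy data, not the TPC-DS schema. It shows one fact table joined to one dimension table, with a NULL value thrown in of the kind the benchmark's data is said to include.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with one fact and one dimension table.

    All names and rows are illustrative assumptions, not TPC-DS definitions.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE sales_fact (item_id INTEGER, channel TEXT, net_paid REAL);
        INSERT INTO item_dim VALUES (1, 'Books'), (2, 'Music');
        INSERT INTO sales_fact VALUES
            (1, 'store',   10.0),
            (1, 'web',      5.0),
            (2, 'catalog',  7.5),
            (2, 'store',   NULL);  -- missing amount; SUM skips NULLs
        """
    )
    return conn


def revenue_by_category(conn):
    """Join the fact table to its dimension and total revenue per category."""
    return conn.execute(
        """
        SELECT d.category, SUM(f.net_paid)
        FROM sales_fact AS f
        JOIN item_dim AS d ON d.item_id = f.item_id
        GROUP BY d.category
        ORDER BY d.category
        """
    ).fetchall()


if __name__ == "__main__":
    print(revenue_by_category(build_toy_warehouse()))
```

The real schema, per the description above, has far wider tables and many more of them, so queries against it typically join a fact table to several dimensions at once.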

Just like the TPC-H test, the benchmark will scale not just by the size of systems but also by the size of the dataset in the fact tables that is chewed upon - with variants coming in 100GB, 300GB, 1TB, 3TB, 10TB, 30TB, and 100TB sizes. The static tables don't scale, just like in the real world.

The TPC-H test adhered to the SQL92 standard and had 22 queries, which were characterized as simple ad-hoc queries. The TPC-DS test has 99 queries, which are a mix of ad-hoc and reporting queries, and they adhere to the more modern SQL99 standard with some OLAP extensions thrown in that are commonly part of relational databases today. And the queries are, of course, a lot more complex.

The TPC-H test had 8 tables, and only 2 of them were updated, while 22 out of the 26 tables in the TPC-DS test are updated. The TPC-H test had data deleted and inserted randomly to simulate change in the database, but TPC-DS actually does rolling updates of data like a real workload would. The largest TPC-H table had 15 columns, and the largest TPC-DS table has 38 columns.

Here's the algorithm for coming up with the composite TPC-DS benchmark score:

The TPC-DS performance metric

Got that?

Scale factor is the dataset size, and the metric counts the throughput in queries per hour. You load up the databases, which does the updates to the dimensional and fact tables, and you time that. Then you run one user's query stream against it to do the power test. Then you do two runs of the multiple query test, where streams of multiple users add queries. Each user stream executes all 99 queries in a random order and only executes one query at a time.
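The stream setup described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_streams` and the use of Python's `random` module are assumptions, and the specification has its own rules for how each stream's ordering is produced. The sketch shows the one property the text states, namely that every user stream contains all 99 queries, each exactly once, in its own shuffled order.

```python
import random

NUM_QUERIES = 99  # query count given in the article


def build_streams(num_streams, seed=0):
    """Return one shuffled ordering of query numbers 1..99 per user stream.

    Hypothetical helper: the shuffle here stands in for whatever ordering
    rules the benchmark specification actually prescribes.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    streams = []
    for _ in range(num_streams):
        order = list(range(1, NUM_QUERIES + 1))
        rng.shuffle(order)
        streams.append(order)
    return streams


if __name__ == "__main__":
    for number, order in enumerate(build_streams(4), start=1):
        # Each stream would submit these queries one at a time, in this order.
        print(f"stream {number}: first five queries {order[:5]}")
```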

You load up user streams to boost the overall throughput of the system. You do some normalization for user counts and come up with the final metric in queries per hour. You divide by the cost of the system under test (which does not appear to include maintenance) and there's your TPC-DS measured amount of bang per buck.
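The arithmetic in that last step is easy to sketch. The code below is not the official TPC-DS formula, which also folds in the load time and the normalization for user counts. It only shows the final division described here, using made-up elapsed times and a made-up system cost, and the function names are hypothetical.

```python
def queries_per_hour(total_queries, elapsed_seconds):
    """Convert a query count and wall-clock time into queries per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_queries * 3600.0 / elapsed_seconds


def bang_per_buck(qph, system_cost):
    """Throughput per unit of currency spent on the system under test."""
    if system_cost <= 0:
        raise ValueError("system_cost must be positive")
    return qph / system_cost


if __name__ == "__main__":
    # Assumed figures: 4 streams of 99 queries finishing in two hours
    # on a system priced at 500,000 (currency units are arbitrary).
    qph = queries_per_hour(4 * 99, 2 * 3600)
    print(qph, bang_per_buck(qph, 500_000))
```

TPC price/performance figures are usually quoted the other way up, as cost per unit of throughput, which is simply the reciprocal of the value returned by `bang_per_buck`.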

Vendors can start using TPC-DS immediately. It will be interesting to see if they do. ®
