Nvidia stretches Tesla GPU coprocessors from HPC to big data

'Anything a CPU can do, a GPU can do better'

GTC 2013 Graphics chip maker Nvidia has barely begun to put a dent in the traditional high performance computing segment with its Tesla GPU coprocessors and it is already gearing up to take on new markets. The next target is big data, and as with parallel supercomputing, Nvidia is hoping to get the jump on rivals Intel and AMD, which peddle their respective x86 and GPU coprocessors.

As it turns out, a GPU is not just good at doing floating point math in single or double precision; it can also be used to sift through streams of data to index and sort them. It takes a bit of work to move these jobs, which have run on CPUs for the most part, over to a GPU, but Nvidia is seeing more and more companies using its Tesla GPUs to speed up the indexing, sorting, and other chewing on large data sets.

This will be one of the themes that Jen-Hsun Huang, co-founder and CEO at Nvidia, covers during his opening keynote at the GPU Technology Conference in San Jose on Tuesday. Sumit Gupta, general manager of the Tesla Accelerated Computing business unit at Nvidia, gave El Reg some examples of the kind of adjacent big data jobs where GPUs are being deployed.

The first example comes from SaaSy CRM software vendor Salesforce.com. As it turns out, Salesforce.com is one of the six companies in the world that have full access to the Twitter firehose. The number of tweets per day was minuscule back in 2007 through 2009, but in 2010 it jumped to about 50 million tweets per day, busted through 200 million per day in 2011, and broke through 500 million last year.

If you want to chew on all that data to do sentiment analysis for CRM customers, as Salesforce.com most certainly wants to do, then you would need at least ten times the iron you needed three years ago to get the job done.

Just as sticking with regular server processors is not practical in terms of cost, bang for the buck, or performance per watt for a lot of traditional HPC shops, Salesforce.com ran into the same issues with the big data munchers it had for storing and indexing the raw Twitter feed.

For one thing, so-called real-time text searches on its CPU clusters holding the Twitter data were taking up to ten minutes to complete, which is better than running in batch mode, but cannot be considered real time.

So Salesforce.com engineers figured out how to index the raw Twitter feed using GPUs to accelerate the CPUs, and they also use GPU offload to match a search term against that index to create baby Twitter feeds that can in turn be pumped over to customers and meshed with their CRM apps. With the GPU assist, Salesforce.com has been able to cut its Twitter text search time to one second or less, and that is basically real time.

Salesforce.com is not, however, talking about the infrastructure that makes up its Twitter search engine. Marc Benioff would have to kill you if you found out. The odds favor some kind of NoSQL data store running on an x86 cluster, of course.
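
With the real implementation under wraps, the index-then-match idea can only be sketched in general terms. What follows is a minimal, CPU-only Python illustration of an inverted index, not Salesforce.com's method: the function names `build_index` and `search` and the sample tweets are invented for this example, and a GPU version would presumably run the tokenizing and matching steps in parallel.

```python
from collections import defaultdict


def build_index(tweets):
    """Map each lowercased word to the set of tweet IDs that contain it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for word in text.lower().split():
            # Trim common punctuation so "Acme!" and "acme" index together.
            index[word.strip(".,!?#@")].add(tweet_id)
    return index


def search(index, tweets, term):
    """Return the tweets matching a term, in ID order, as a filtered feed."""
    return [tweets[i] for i in sorted(index.get(term.lower(), set()))]


# Hypothetical sample data, purely for illustration.
tweets = {
    1: "Loving the new phone from Acme!",
    2: "Acme support kept me on hold for an hour",
    3: "Great weather in San Jose today",
}

index = build_index(tweets)
baby_feed = search(index, tweets, "acme")
```

The index is built once over the incoming feed, so each search becomes a dictionary lookup rather than a scan of every tweet. That is the general reason an index turns a slow text search into a fast one, whatever hardware does the work.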

The Shazam music identification service has also shifted to a ceepie-geepie architecture to sort through the digital fingerprints that it creates to identify songs from any snippet anywhere in the song. The move to GPUs was precipitated by a trebling of searches and records in the past year.

In 2011, the company supported 100 million user inquiries to identify a song snippet, chewing through 10 million records. Last year, the service added hundreds of the then-shipping "Fermi" Tesla GPU coprocessors to its cluster and was able to process 300 million inquiries and index and sort over 27 million records.

Gupta says that by offloading a lot of the work from CPUs to GPU coprocessors, Shazam can grow its service to cover more music and do searches faster, too, all while keeping the server footprint down. No word on when Shazam will move to the "Kepler" family of Tesla GPU coprocessors, but like other vendors, Shazam has a qualification cycle for adopting new technologies and is working through that process now.

Over at Cortexica Vision Systems, the visual recognition that the company is bringing to the shopping experience would not even have been possible without the kind of cheap computing you get from a coprocessor based on a parallel architecture.

The system that Cortexica has cooked up allows shoppers to take a picture of an item they want to comparison shop. Each photo is uploaded into the Cortexica parallel ceepie-geepie, with over 1,000 different points of identification taken from that photo.

Cortexica has built a database of over 1 million apparel items, and can find matches or near matches for the items in a matter of seconds based just on the photos. Cortexica is not a retailer itself, but rather offers its database and search algorithms as a service that shopping sites can embed in their services.

This photo matching technology has obvious applicability in other kinds of applications, some of them potentially of a dubious or nefarious nature. That has always been true of every technology humans have created.

Another use for GPU coprocessors that Huang will be talking about is the reformatting of live video feeds using ceepie-geepie appliances from Elemental.

Elemental supplied the live encoding of video streams from the Summer Olympics in London last year, and its appliances are also used by the Weather Channel to do the same job for its video streams. The latest Elemental boxes have Nvidia's Tesla K10 GPU coprocessors paired one-for-one to an x86 processor in a two-socket rack server.

The Weather Channel serves up video to 38 million viewers on mobile devices each month, and during Superstorm Sandy last October the site handled 12 million concurrent live video streams over the Internet. Each one of those streams was re-encoded on the fly from the live feed from Weather Channel studios to meet the screen sizes and resolutions of various PCs, smartphones, and tablets. ®
