Nvidia stretches Tesla GPU coprocessors from HPC to big data

'Anything a CPU can do, a GPU can do better'

GTC 2013 Graphics chip maker Nvidia has barely begun to put a dent in the traditional high performance computing segment with its Tesla GPU coprocessors and it is already gearing up to take on new markets. The next target is big data, and as with parallel supercomputing, Nvidia is hoping to get the jump on rivals Intel and AMD, which peddle their respective x86 and GPU coprocessors.

As it turns out, a GPU is not just good at doing floating point math in single precision or double precision, but it can also be used to sift through streams of data to index it and sort it. It takes a bit of work to repurpose these jobs, which have been run on CPUs for the most part, for a GPU, but Nvidia is seeing more and more companies using its Tesla GPUs to augment the indexing, sorting, and general chewing of large data sets.

This will be one of the themes that Jen-Hsun Huang, co-founder and CEO at Nvidia, will cover during his opening keynote at the GPU Technology Conference in San Jose on Tuesday. Sumit Gupta, general manager of the Tesla Accelerated Computing business unit at Nvidia, gave El Reg some examples of the kind of adjacent big data jobs where GPUs are being deployed.

The first example comes from SaaSy CRM software vendor Salesforce.com. As it turns out, Salesforce.com is one of the six companies in the world that has full access to the Twitter firehose. The number of tweets per day was minuscule back in 2007 through 2009, but in 2010 it jumped to about 50 million tweets per day, busted through 200 million per day in 2011, and broke through 500 million last year.

If you want to chew on all that data to do sentiment analysis for CRM customers, as Salesforce.com most certainly wants to do, then you would need at least ten times the iron you needed three years ago to get the job done.
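The ten-fold figure follows from the daily tweet counts quoted above. As a quick check using the article's rounded numbers (the variable names are illustrative, and the sketch assumes the iron needed scales roughly in line with tweet volume):

```python
# Approximate tweets per day, in millions, as quoted in the article.
# "Last year" is taken to be 2012, since the piece was filed from GTC 2013.
tweets_per_day_millions = {2010: 50, 2011: 200, 2012: 500}

# Growth in daily volume from 2010 to 2012.
growth = tweets_per_day_millions[2012] / tweets_per_day_millions[2010]
print(f"{growth:.0f}x more tweets per day than in 2010")
```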

Just like sticking with regular server processors is not practical in terms of cost, bang for the buck, or performance per watt for a lot of traditional HPC shops, Salesforce.com found the same issues in the big data munchers it had to store and index the raw Twitter feed.

For one thing, so-called real-time text searches on its CPU clusters holding the Twitter data were taking up to ten minutes to complete, which is better than running in batch mode, but cannot be considered real time.

So Salesforce.com engineers figured out how to index the raw Twitter feed using GPUs to accelerate the CPUs, and then also use GPU offload to match a search term against that index to create baby Twitter feeds that can in turn be pumped over to customers and meshed with their CRM apps. With the GPU assist, Salesforce.com has been able to reduce its Twitter text search down to one second or less, and that is basically real-time.
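Salesforce.com has not said how its index is built, so the following is only a minimal CPU-side sketch of the general index-then-match idea, not the company's method. The function names and sample tweets are hypothetical, and a GPU version would spread the tokenizing and matching across many threads rather than looping.

```python
from collections import defaultdict


def build_index(tweets):
    """Map each lowercase token to the set of tweet ids that contain it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for token in text.lower().split():
            index[token].add(tweet_id)
    return index


def filtered_feed(index, tweets, term):
    """Return the tweets that contain `term`, ordered by tweet id."""
    matching_ids = index.get(term.lower(), ())
    return [tweets[tweet_id] for tweet_id in sorted(matching_ids)]


# Hypothetical sample data standing in for the raw feed.
tweets = {
    1: "loving the new phone",
    2: "phone battery died again",
    3: "great coffee this morning",
}
index = build_index(tweets)
feed = filtered_feed(index, tweets, "phone")
print(feed)
```

The index is built once and each search becomes a dictionary lookup, which is why a search against the index can be so much faster than scanning every tweet.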

Salesforce.com is not, however, talking about the infrastructure that makes up its Twitter search engine. Marc Benioff would have to kill you if you found out. The odds favor some kind of NoSQL data store running on an x86 cluster, of course.

The Shazam music identification service has also shifted to a ceepie-geepie architecture to sort through the digital fingerprints that it creates to identify songs from any snippet anywhere in the song. The move to GPUs was precipitated by a trebling of searches and records in the past year.

In 2011, the company supported 100 million user inquiries to identify a song snippet, chewing through 10 million records. Last year, the service added hundreds of the then-shipping "Fermi" Tesla GPU coprocessors to its cluster and was able to process 300 million inquiries and index and sort over 27 million records.

Gupta says that by offloading a lot of the work from CPUs to GPU coprocessors Shazam can grow its service to cover more music and do searches faster, too. All while keeping the server footprint down. No word on when Shazam will move to the "Kepler" family of Tesla GPU coprocessors, but like other vendors, Shazam has a qualification cycle for adopting new technologies and is working through that process now.

Over at Cortexica Vision Systems, the visual recognition that the company is bringing to the shopping experience was not even possible without the kind of cheap computing you get from a coprocessor based on a parallel architecture.

The system that Cortexica has cooked up allows shoppers to take a picture of an item they want to comparison shop. Each photo is uploaded into the Cortexica parallel ceepie-geepie, with over 1,000 different points of identification taken from that photo.

Cortexica has built a database of over 1 million apparel items, and can find matches or near matches for the items in a matter of seconds based just on the photos. Cortexica is not a retailer itself, but rather offers its database and search algorithms as a service that shopping sites can embed in their services.
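The article does not describe Cortexica's algorithms, so this is just a toy sketch of the "matches or near matches" idea under a loud assumption: each item is reduced to a single short feature vector and compared by Euclidean distance. The real system extracts over 1,000 points per photo, and the catalogue entries, vectors, and function name here are all hypothetical.

```python
import math


def nearest_items(query, catalogue, k=2):
    """Return the names of the k catalogue items closest to the query vector."""
    ranked = sorted(catalogue, key=lambda name: math.dist(query, catalogue[name]))
    return ranked[:k]


# Hypothetical three-number feature vectors standing in for photo descriptors.
catalogue = {
    "red dress": (0.9, 0.1, 0.2),
    "crimson dress": (0.8, 0.2, 0.2),
    "blue jeans": (0.1, 0.2, 0.9),
}
matches = nearest_items((0.88, 0.1, 0.2), catalogue)
print(matches)
```

Each distance calculation is independent of the others, which is the sort of work that spreads well across a parallel coprocessor when the catalogue runs to a million items.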

This photo matching technology has obvious applicability in other kinds of applications, some of them potentially of a dubious or nefarious nature. That has always been true of every technology humans have created.

Another use for GPU coprocessors that Huang will be talking about is the reformatting of live video feeds using ceepie-geepie appliances from Elemental.

Elemental supplied the live encoding of video streams from the Summer Olympics in London last year, and the Weather Channel uses its appliances to do the same job for its video streams. The latest Elemental boxes have Nvidia's Tesla K10 GPU coprocessors paired one-for-one with the x86 processors in a two-socket rack server.

The Weather Channel serves up video to 38 million viewers on mobile devices each month. During Superstorm Sandy last October, the site handled 12 million concurrent live video streams over the Internet, and each one of those streams was re-encoded on the fly from the live feed from Weather Channel studios to meet the screen sizes and resolutions of various PCs, smartphones, and tablets. ®
