Microsoft has a BLAST with Azure clouds

Petaflops barrier also broken

SC10 Microsoft is a wannabe in the supercomputing arena, but the software giant is making steady progress among quants and life sciences techies with its Windows HPC Server 2008 R2 and now its Azure cloud services.

At the SC10 supercomputing trade show in New Orleans, Microsoft announced that it has a version of BLAST, essentially a search engine for scanning large databases of biological sequences to find matches, running on its Azure cloud.

BLAST is short for Basic Local Alignment Search Tool, and life sciences researchers use it to scan the genetic sequence from one animal (say, a mouse) to see if there is a matching sequence in the human genome.

BLAST is all about speed, as the name suggests, not precision, and biologists use it because it can search large genetic sequences more quickly than the more accurate Smith-Waterman algorithm. BLAST was developed in 1990 at the National Center for Biotechnology Information (NCBI), which maintains genomics databases that researchers can use to see which animals and plants have matching nucleotides or gene sequences.
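The cost gap is easy to see in the algorithm itself: Smith-Waterman fills a full dynamic-programming table for every pair of sequences, guaranteeing the best local alignment but paying O(mn) work per comparison, which is exactly the work BLAST's heuristic shortcuts avoid. Here is a minimal Python sketch of the Smith-Waterman scoring recurrence; the match, mismatch, and gap values are common textbook choices, not anything specific to NCBI's setup.

```python
# Minimal Smith-Waterman local alignment score, for illustration only.
# The scoring scheme here is a textbook choice, not NCBI's.

def smith_waterman(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of an alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are clamped at zero: a local alignment can restart anywhere.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

# Every pairwise comparison fills a full table - O(mn) work per pair,
# which is why exact alignment gets expensive on genome-scale databases.
print(smith_waterman("ACACACTA", "AGCACACA"))
```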

Microsoft loaded the NCBI database onto Azure and demonstrated at SC10 that it can run scans against 100 billion protein sequences. This is the kind of capability that only the largest life sciences labs can afford, and, interestingly, Microsoft is making BLAST on Azure, running against its copy of the database, available to researchers for free.

The Azure cloud runs the same BLAST executables that researchers run locally on their own gear, against the same databases available from NCBI. You set up an Azure account, load and configure the BLAST application, and the worker nodes on the Azure cluster download the appropriate databases from NCBI over FTP, storing them in Azure Blob storage.

Azure then runs the BLAST scans against the data, partitioning it across multiple nodes to speed up the searches, and writes the results out to another Azure Blob file.
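For a sense of what each worker does, here is a rough Python sketch of that fetch-scan-upload loop. It assumes the NCBI BLAST+ command-line tools and the modern azure-storage-blob SDK are installed; the container and database names are hypothetical, and the real NCBI BLAST on Azure service used Microsoft's own worker-role code rather than a script like this.

```python
# Sketch of the worker-side flow the article describes: fetch a BLAST
# database from NCBI over FTP, scan one partition of the queries against
# it, and push the results into Azure Blob storage.
# Assumptions: NCBI BLAST+ (blastp) and azure-storage-blob are installed;
# the database and container names below are illustrative.

import subprocess
from ftplib import FTP

from azure.storage.blob import BlobServiceClient

NCBI_FTP = "ftp.ncbi.nlm.nih.gov"
DB_NAME = "swissprot"  # one of NCBI's preformatted BLAST databases

def fetch_database() -> None:
    """Download and unpack a preformatted BLAST database from NCBI's FTP site."""
    with FTP(NCBI_FTP) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd("blast/db")
        with open(f"{DB_NAME}.tar.gz", "wb") as out:
            ftp.retrbinary(f"RETR {DB_NAME}.tar.gz", out.write)
    subprocess.run(["tar", "xzf", f"{DB_NAME}.tar.gz"], check=True)

def run_partition(query_file: str, result_file: str) -> None:
    """Scan one partition of the query sequences with the stock blastp binary."""
    subprocess.run(
        ["blastp", "-query", query_file, "-db", DB_NAME, "-out", result_file],
        check=True,
    )

def upload_results(conn_str: str, result_file: str) -> None:
    """Drop this partition's results into Blob storage for later collection."""
    service = BlobServiceClient.from_connection_string(conn_str)
    container = service.get_container_client("blast-results")
    with open(result_file, "rb") as data:
        container.upload_blob(name=result_file, data=data, overwrite=True)
```

Splitting the query file across many such workers is what turns one long serial scan into many short parallel ones.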

In one test cited by Microsoft, researchers at Seattle Children's Hospital wanted to use BLAST to check the interactions of proteins against each other, and figured that scanning the 10 million proteins they were interested in would take about six years on a single computer. So the hospital approached Microsoft's Extreme Computing Group and asked if it could get BLAST running on Azure.

Microsoft plunked BLAST onto two Azure data centers in the US and six in Europe, carving up the search job. The Azure cloud chewed through the job in a week - roughly a 300-fold speedup over the projected six years - but not before some server nodes failed and others were taken down for routine maintenance. Microsoft was quite happy to report that the Azure cloud healed around this downtime (as large clusters have to do) and no work was lost.

While BLAST and the NCBI databases are free to use on Azure, the underlying Azure capacity is not - unless you are a qualified researcher approved by Microsoft through its Global Cloud Research Engagement Initiative.

At SC10, Microsoft was also crowing about how the Tsubame 2 supercomputer in Japan (see the most recent Top 500 coverage for more on this machine) not only broke the petaflops barrier, but did so running Windows HPC Server 2008 R2 and Linux.

This hybrid CPU-GPU machine is by far the most powerful Windows-based supercomputer cluster ever assembled, but it is helpful to remember that it has 1,400 nodes, each with three Nvidia GPU co-processors. The work the Windows or Linux side of the machine does matters, but the GPUs are doing most of the calculations.

Microsoft also said that it would be rolling out Service Pack 1 for Windows HPC Server 2008 R2 by the end of the year. The update will include features to allow researchers to link local machines running the HPC variant of Microsoft's server stack to Azure.

The exact nature of that integration is not yet known, but El Reg is looking into it. Data movement between local and cloud server nodes is a big, big issue. ®
