Microsoft has a BLAST with Azure clouds

Petaflops barrier also broken

SC10 Microsoft is a wannabe in the supercomputing arena, but the software giant is making steady progress among quants and life sciences techies with its Windows HPC Server 2008 R2 and now its Azure cloud services.

At the SC10 supercomputing trade show in New Orleans, Microsoft announced that it has a version of BLAST, which is essentially a search engine for scanning large databases of genetic and protein sequences to find matches, running on its Azure cloud.

BLAST is short for Basic Local Alignment Search Tool, and life sciences researchers use it to scan the genetic sequence from one animal (say, a mouse) to see if there is a matching sequence in the human genome.
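For a sense of what a single BLAST query involves, here is a minimal sketch using Biopython's wrapper around NCBI's public BLAST service. Biopython and the example sequence are our illustration, not part of Microsoft's Azure offering:

```python
# Minimal BLAST query via Biopython's NCBI wrapper: an illustrative
# sketch, not Microsoft's Azure service. Requires: pip install biopython
from Bio.Blast import NCBIWWW, NCBIXML

# A short nucleotide fragment to search for (hypothetical example sequence)
query = "AGCTTAGCTAGCTACGGAGCTTACGATCGAT"

# blastn = nucleotide query against NCBI's 'nt' nucleotide database
handle = NCBIWWW.qblast("blastn", "nt", query)
record = NCBIXML.read(handle)

# Report the best-scoring matches; each hit names the matching sequence
for alignment in record.alignments[:5]:
    hsp = alignment.hsps[0]
    print(alignment.title[:60], "E-value:", hsp.expect)
```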

BLAST is all about speed, as the name suggests, not precision, and biologists use it because it can search large genetic sequences more quickly than a more accurate method, the Smith-Waterman algorithm. BLAST was developed in 1990 at the National Center for Biotechnology Information (NCBI), which maintains genomics databases that researchers can use to see which animals and plants have matching nucleotides or gene sequences.
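To make the speed-versus-accuracy trade-off concrete, here is a toy Smith-Waterman scorer in Python. It fills the full dynamic-programming matrix, which costs O(mn) per pair of sequences, exactly the work BLAST's heuristics avoid. The scoring values are illustrative, not NCBI defaults:

```python
# Toy Smith-Waterman local alignment score: O(len(a) * len(b)) per pair,
# which is why exhaustive database scans are slow. Scoring parameters
# are illustrative, not NCBI's defaults.
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # Local alignment: the score is floored at zero
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("ACACACTA", "AGCACACA"))  # best local alignment score
```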

Microsoft loaded the NCBI database onto Azure and demonstrated at SC10 that it can run scans against 100 billion protein sequences. This is the kind of capability that only the largest life sciences labs can afford, and, interestingly, Microsoft is giving researchers free access to BLAST running on Azure against its copy of the database.

The Azure cloud runs the same BLAST executables that researchers run locally on their own gear, against the same databases available from NCBI. You set up an Azure account and load and configure the BLAST application, and the worker nodes on the Azure cluster download the appropriate databases from NCBI over FTP and store them in Azure Blob storage.
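In outline, a worker node's staging step might look something like the sketch below. Note the caveats: ftplib is standard Python, but the azure-storage-blob SDK shown here postdates this article, and the account, container, and file names are all hypothetical:

```python
# Sketch of a worker staging one NCBI BLAST database volume into Azure Blob
# storage. The azure-storage-blob v12 SDK postdates this article; account,
# container, and file names are hypothetical.
import ftplib
from azure.storage.blob import BlobClient

DB_FILE = "nr.00.tar.gz"  # one volume of NCBI's 'nr' protein database

# 1. Pull the database volume from NCBI's public FTP server
with ftplib.FTP("ftp.ncbi.nlm.nih.gov") as ftp, open(DB_FILE, "wb") as out:
    ftp.login()               # anonymous login
    ftp.cwd("blast/db")
    ftp.retrbinary("RETR " + DB_FILE, out.write)

# 2. Push it into Blob storage so every worker node can share one copy
blob = BlobClient(
    account_url="https://myaccount.blob.core.windows.net",  # hypothetical
    container_name="blast-dbs",
    blob_name=DB_FILE,
    credential="<account-key>",
)
with open(DB_FILE, "rb") as data:
    blob.upload_blob(data, overwrite=True)
```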

Azure runs the BLAST scans against the data, partitioning it across multiple nodes to speed up the searches, and spits the results out into another file in Azure Blob storage.
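The pattern is classic scatter-gather: chunk the queries, fan the chunks out to workers, and concatenate the results. A local stand-in, with a process pool in place of Azure nodes and a hypothetical run_blast_chunk helper, might look like:

```python
# Scatter-gather sketch: split the query set into chunks, fan them out to
# workers, and merge the output. A process pool stands in for Azure nodes;
# run_blast_chunk is a hypothetical placeholder for invoking BLAST.
from concurrent.futures import ProcessPoolExecutor

def run_blast_chunk(queries):
    # Hypothetical: run BLAST over this slice and return its hit lines
    return ["hits for " + q for q in queries]

def scatter_gather(queries, n_workers=8):
    size = max(1, len(queries) // n_workers)
    chunks = [queries[i:i + size] for i in range(0, len(queries), size)]
    results = []
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        for chunk_hits in pool.map(run_blast_chunk, chunks):
            results.extend(chunk_hits)  # gather into one combined result
    return results

if __name__ == "__main__":
    print(scatter_gather(["protein_%d" % i for i in range(100)])[:3])
```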

In one test run by Microsoft, researchers at Seattle Children's Hospital wanted to use BLAST to check how proteins interact with one another, and they figured that scanning the 10 million proteins they were interested in would take about six years on a single computer. So the hospital approached Microsoft's Extreme Computing Group and asked if it could get BLAST running on Azure.

Microsoft plunked BLAST onto two Azure data centers in the US and six in Europe, carving up the search job. The Azure cloud chewed through the job in a week - but not before some server nodes failed and others were taken down for routine maintenance. Microsoft was quite happy to report that the Azure cloud healed around this downtime (as large clusters have to do) and no work was lost.
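The back-of-the-envelope arithmetic on those figures is straightforward: six years of single-machine work finished in a week implies an effective speedup of roughly 300x, ignoring node failures and maintenance downtime:

```python
# Back-of-envelope check on the reported figures: ~6 years of single-machine
# work finished in ~1 week implies a ~300x effective speedup.
single_machine_weeks = 6 * 52   # six years, in weeks
cloud_weeks = 1
print(single_machine_weeks / cloud_weeks)  # 312.0, roughly 300x
```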

While the use of BLAST and the NCBI databases is free on Azure, the Azure compute capacity is not free unless you are a qualified researcher approved by Microsoft, working through its Global Cloud Research Engagement Initiative.

At SC10, Microsoft was also crowing about how the Tsubame 2 supercomputer in Japan (see the most recent Top 500 coverage for more on this machine) not only broke the petaflops barrier, but did so running Windows HPC Server 2008 R2 and Linux.

This hybrid CPU-GPU machine is by far the most powerful Windows-based supercomputer cluster ever assembled, but it is helpful to remember that it has 1,400 nodes with three Nvidia GPU co-processors each. The amount of work that the Windows or Linux side of the machine is doing is important, but the GPUs are doing most of the calculations.

Microsoft also said that it would be rolling out Service Pack 1 for Windows HPC Server 2008 R2 by the end of the year. The update will include features to allow researchers to link local machines running the HPC variant of Microsoft's server stack to Azure.

The exact nature of that integration is not yet known, but El Reg is looking into it. Data movement between local and cloud server nodes is a big, big issue. ®
