Microsoft has a BLAST with Azure clouds

Petaflops barrier also broken

SC10 Microsoft is a wannabe in the supercomputing arena, but the software giant is making steady progress among quants and life sciences techies with its Windows HPC Server 2008 R2 and now its Azure cloud services.

At the SC10 supercomputing trade show in New Orleans, Microsoft announced that it has a version of BLAST, essentially a search engine for scanning large databases of genetic and protein sequences to find matches, running on its Azure cloud.

BLAST is short for Basic Local Alignment Search Tool, and life sciences researchers use it to scan the genetic sequence from one animal (say, a mouse) to see if there is a matching sequence in the human genome.

BLAST is all about speed, as the name suggests, not precision, and biologists use it because it can search large genetic sequences more quickly than a more accurate method, the Smith-Waterman algorithm. BLAST was developed in 1990 at the National Center for Biotechnology Information (NCBI), which maintains genomics databases that researchers can use to see which animals and plants have matching nucleotides or gene sequences.
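For a sense of why the exact approach is so much slower, here is a minimal sketch of Smith-Waterman local alignment scoring in Python. The scoring parameters are illustrative placeholders, not NCBI's defaults, and real tools add traceback and affine gap penalties on top of this.

```python
# Minimal Smith-Waterman local alignment scoring (illustrative parameters).
# The dynamic-programming table is O(len(a) * len(b)) in time and space,
# which is why heuristics like BLAST win for genome-scale searches.
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

# Toy example: best local alignment score between two short sequences.
print(smith_waterman_score("ACACACTA", "AGCACACA"))
```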

Microsoft loaded the NCBI database onto Azure and demonstrated at SC10 that it can run scans against 100 billion protein sequences. This is the kind of capability that only the largest life sciences labs can afford, and interestingly, Microsoft is making BLAST on Azure, running against its copy of the database, available to researchers for free.

The Azure cloud runs the same BLAST executables that researchers run locally on their own gear, against the same databases available from NCBI. You set up an Azure account, load and configure the BLAST application, and the worker nodes on the Azure cluster download the appropriate databases from NCBI over FTP and store them in Azure Blob storage.
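As a rough sketch of what that database-staging step could look like, the fragment below pulls a pre-formatted BLAST database from NCBI's FTP site and pushes it into Blob storage. It uses today's azure-storage-blob Python SDK and ftplib; the container name, connection string, and database file are placeholders, and the 2010-era NCBI BLAST on Azure service was actually built on .NET worker roles rather than this SDK.

```python
import ftplib
from azure.storage.blob import BlobServiceClient

NCBI_FTP_HOST = "ftp.ncbi.nlm.nih.gov"                # NCBI's public FTP server
DB_FILE = "blast/db/swissprot.tar.gz"                 # example pre-formatted BLAST database
CONN_STR = "<your Azure storage connection string>"   # placeholder

def stage_blast_db() -> None:
    # Download the database archive from NCBI over anonymous FTP...
    with ftplib.FTP(NCBI_FTP_HOST) as ftp, open("swissprot.tar.gz", "wb") as local:
        ftp.login()
        ftp.retrbinary(f"RETR {DB_FILE}", local.write)

    # ...then upload it to Blob storage so every worker node can pull it down.
    service = BlobServiceClient.from_connection_string(CONN_STR)
    blob = service.get_blob_client(container="blast-databases",
                                   blob="swissprot.tar.gz")
    with open("swissprot.tar.gz", "rb") as data:
        blob.upload_blob(data, overwrite=True)
```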

Azure runs the BLAST scans against the data, partitioning the work across multiple nodes to speed up the searches, and spits the results out into another Azure Blob.
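The partitioning idea itself is simple enough to show in a few lines. The toy sketch below splits a FASTA file of query sequences into one chunk per worker; the file names and chunk count are made up for the example, and this is not Microsoft's actual code.

```python
def split_fasta(path: str, n_chunks: int) -> list[list[str]]:
    # Read the FASTA file into individual ">header\nsequence" records.
    with open(path) as fh:
        records = [">" + rec for rec in fh.read().split(">") if rec.strip()]
    # Deal the records round-robin across the chunks.
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return chunks

# One output file per worker node; each node BLASTs its own slice of the queries.
for worker_id, chunk in enumerate(split_fasta("queries.fasta", n_chunks=8)):
    with open(f"queries_part{worker_id}.fasta", "w") as out:
        out.writelines(chunk)
```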

In one test run by Microsoft, researchers at Seattle Children's Hospital wanted to check the interactions of proteins against each other using BLAST, and they figured that scanning the 10 million proteins they were interested in would take about six years on a single computer. So the hospital approached Microsoft's Extreme Computing Group and asked if Microsoft could get BLAST running on Azure.

Microsoft plunked BLAST onto two Azure data centers in the US and six data centers in Europe, carving up the search job. The Azure cloud chewed through the job in a week - but not before some server nodes failed and others were taken down for routine maintenance. Microsoft was quite happy to report that the Azure cloud healed around this downtime (as large clusters have to do) and work was not lost.

While the use of BLAST and the NCBI databases on Azure is free, the Azure compute capacity is not, unless you are a qualified researcher approved by Microsoft through its Global Cloud Research Engagement Initiative.

At SC10, Microsoft was also crowing about how the Tsubame 2 supercomputer in Japan (see the most recent Top 500 coverage for more on this machine) not only broke the petaflops barrier, but did so running Windows HPC Server 2008 R2 and Linux.

This hybrid CPU-GPU machine is by far the most powerful Windows-based supercomputer cluster ever assembled, but it is helpful to remember that it has 1,400 nodes with three Nvidia GPU co-processors each. Whatever the Windows or Linux side of the machine is contributing, the GPUs are doing most of the calculations.
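A back-of-envelope calculation makes the point. Using the published peak double-precision numbers for the parts Tsubame 2 is generally reported to use (Tesla M2050 GPUs at roughly 515 gigaflops each, Xeon X5670 CPUs at roughly 70 gigaflops per socket) and the article's 1,400-node figure, the GPUs account for the overwhelming share of peak performance:

```python
nodes = 1400
gpu_peak = nodes * 3 * 515e9   # three Tesla M2050s per node: ~2.2 petaflops
cpu_peak = nodes * 2 * 70e9    # two Xeon X5670 sockets per node: ~0.2 petaflops
print(f"GPU share of peak: {gpu_peak / (gpu_peak + cpu_peak):.0%}")  # about 92%
```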

Microsoft also said that it would be rolling out Service Pack 1 for Windows HPC Server 2008 R2 by the end of the year. The update will include features to allow researchers to link local machines running the HPC variant of Microsoft's server stack to Azure.

The exact nature of that integration is not yet known, but El Reg is looking into it. Data movement between local and cloud server nodes is a big, big issue. ®
