GCHQ goes Google

Net spies turn to MapReduce

Britain's digital spies have turned to Google for help making sense of the floods of data now inundating their powerful computing resources.

GCHQ, the Cheltenham-based signals intelligence agency, is recruiting an expert on MapReduce, the patented number-crunching technique previously behind the dominant web search engine.

The agency's new lead researcher on data mining will be responsible for "developing MapReduce analytics on parallel computing clusters", a job advertisement reveals.

MapReduce was developed by Google to index billions of web pages across its cluster of hundreds of thousands of commodity servers. It breaks up complicated tasks into smaller, easier computing problems that cheap hardware is capable of solving quickly.
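To give a flavour of that divide-and-combine idea, here is a minimal single-machine sketch in Python that counts words across a handful of "pages". It is an illustration only, not Google's or GCHQ's code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample pages are invented for this example, and a real cluster would spread the map and reduce steps across many servers instead of running them in one process.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step (hypothetical name): emit a (word, 1) pair for each word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group every emitted value under its key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step (hypothetical name): collapse all the counts for one word into a total."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle and reduce in sequence; a cluster would run each map and reduce in parallel."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Made-up sample input standing in for crawled web pages.
    pages = ["the web is big", "the web keeps growing"]
    print(map_reduce(pages))
```

Each call to `map_phase` needs only its own document, and each call to `reduce_phase` needs only the values for its own key, which is why the work can be farmed out to cheap machines that never have to see the whole data set.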

Google patented the technique earlier this year, but it remains free for other organisations to adopt via Hadoop, an open source project. Originally described in a 2004 research paper, MapReduce has allowed Google's algorithms to index a rapidly expanding web while keeping costs down.

GCHQ faces a similar challenge as it gathers more and more raw data from internet communications, including email, social networks and VoIP.

"Successful data-driven organisations must be able to process, interpret and rapidly respond to indicators derived from unprecedented volumes of data from disparate information sources," its recruitment advertisement says.

The Register understands that GCHQ now has a cluster of more than 250,000 commodity servers under its Cheltenham "doughnut" building. In recent years it has developed this Google-style infrastructure instead of the very expensive, bespoke supercomputers it used to analyse microwave intercepts during the Cold War.

While spies are planning research on MapReduce, Google has already moved on to BigTable, its new distributed database. ®