Sun marries Hadoop to Grid Engine

DIY Googleplex

Sun Microsystems may be in a PR muzzle until sugar daddy Oracle gets permission to buy it from European antitrust regulators, but the coders who maintain Sun's myriad software products are still banging away on their keyboards in an effort not only to look useful and keep their jobs, but to be useful.

They just can't engage the IT press to talk about what they are up to, which is why Sun's blogs come in handy, in this case as a means of letting us know about an update to Sun's Grid Engine grid software.

Grid Engine 6.2 update 5 appears to have been launched last week, judging by the date of a blog posting by Dan Templeton, a staff engineer who works on the grid middleware. Templeton says that with this update, Grid Engine is the first workload manager with support for applications created using the open source Hadoop programming environment hosted over at the Apache Software Foundation. Instead of having to set up a dedicated Hadoop cluster, you can treat Hadoop like any other application and submit jobs to a Grid Engine grid.

Hadoop, which was created by rival Yahoo! and taken open source, is an analog to the distributed programming environment used by Google. Hadoop consists of the Hadoop Distributed File System, which is a distributed and fault-tolerant file system, and the MapReduce application parallelization and execution environment that works in conjunction with HDFS. In March 2009, Cloudera put out a commercialized version of Hadoop with enterprise-grade support, and would no doubt argue about Sun's claims. Still, the ability to submit Hadoop jobs to a Grid Engine grid and have it cope with Hadoop jobtrackers and tasktrackers is pretty cool.
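
For readers who have not met MapReduce, the idea can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not Hadoop's API; the function names are made up for the example.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one line of input.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: collapse the values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    # In a real cluster the map and reduce calls run in parallel on many nodes;
    # here they run one after another in a single process.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `word_count(["grid engine", "Grid jobs"])` counts "grid" twice and the other words once each.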

The Grid Engine software, which is aware of HDFS, is able to route processing jobs to where the data is already located in the nodes, which speeds up execution of those jobs. (This is a whole lot smarter than starting up a job somewhere on the Hadoop cluster and then trying to move the data over to that node.)
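
The idea of sending the job to the data can be sketched roughly as follows. This is not Grid Engine's scheduler code; it is a minimal illustration that assumes the scheduler knows which nodes hold a replica of each block, and the names `pick_node`, `block_locations`, and `free_nodes` are hypothetical.

```python
def pick_node(task_blocks, block_locations, free_nodes):
    """Return the free node holding the most of a task's input blocks.

    task_blocks: block IDs the task will read.
    block_locations: block ID -> set of node names holding a replica.
    free_nodes: node names that currently have an open slot.
    Returns None when no node is free.
    """
    if not free_nodes:
        return None
    local_counts = {node: 0 for node in free_nodes}
    for block in task_blocks:
        for node in block_locations.get(block, ()):
            if node in local_counts:
                local_counts[node] += 1
    # Most local blocks wins; ties go to the alphabetically first node
    # so the choice is deterministic (an assumption made for this sketch).
    return min(local_counts, key=lambda node: (-local_counts[node], node))
```

With blocks b1 on node1 and node2, b2 on node2 and node3, and b3 on node2 only, a task reading all three blocks lands on node2 whenever node2 has a free slot, so nothing has to be copied across the network.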

With Grid Engine 6.2 update 5, the job scheduler has also been tweaked so it can allocate jobs to specific types of processors and server configurations if grid applications need certain features - high clock speeds, multiple cores, big caches, lots of main memory, and so forth - to run properly. Templeton says that, for instance, some cache-hungry applications will run in half the time if a job is plunked on four cores spread across four server sockets instead of four cores sharing a single socket.

Now Grid Engine administrators can use a feature called core binding to specify the kind of hardware resources they need, and Grid Engine can do its best to allocate a job to them when they are available in the pool.
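
To make Templeton's cache example concrete, here is a rough sketch of a placement that spreads a job's cores across sockets rather than packing them onto one. It is an illustration only and does not reflect how Grid Engine implements core binding; the function name and the socket-to-cores dictionary are assumptions.

```python
def spread_across_sockets(free_cores, wanted):
    """Pick `wanted` cores, drawing from as many different sockets as possible.

    free_cores: socket ID -> list of free core IDs on that socket.
    Returns a list of (socket, core) pairs, or None if too few cores are free.
    """
    if sum(len(cores) for cores in free_cores.values()) < wanted:
        return None
    # Copy the lists so the caller's view of free cores is not modified.
    remaining = {socket: list(cores) for socket, cores in sorted(free_cores.items())}
    picked = []
    # Round-robin over the sockets, taking one core from each per pass.
    while len(picked) < wanted:
        for socket, cores in remaining.items():
            if cores and len(picked) < wanted:
                picked.append((socket, cores.pop(0)))
    return picked
```

On a four-socket box with four free cores per socket, asking for four cores yields one core on each socket, which gives a cache-hungry job four caches to itself instead of one shared cache.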

The update also includes a feature called slotwise preemption, which is a more sophisticated way of allocating resources than just saying job queue A is always subordinate to job queue B; you can say clever things like have no more than four jobs running across queues A and B, and if there is a conflict for resources, queue B always loses.
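
The "no more than four jobs across A and B, and B always loses" rule can be sketched like this. It is a simplified model of the policy described above, not Grid Engine's implementation; the function name is hypothetical, and suspending the most recently started queue B jobs first is an assumption made for the example.

```python
def jobs_to_suspend(running_a, running_b, limit=4):
    """Return the queue B jobs to suspend so A and B together fit within `limit`.

    running_a, running_b: lists of job IDs in start order, oldest first.
    Queue A jobs are never suspended.
    """
    excess = len(running_a) + len(running_b) - limit
    if excess <= 0:
        return []
    # Queue B always loses: walk its jobs newest first (assumed order)
    # and suspend as many as needed, up to all of them.
    return running_b[::-1][:excess]
```

With three jobs running in queue A and two in queue B, the combined total of five is one over the limit, so the newer of the two queue B jobs is suspended and everything in queue A keeps running.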

The update also includes tweaks that make it easier to integrate a Grid Engine setup with Amazon's EC2 compute cloud and to power down unused server nodes in a grid - capabilities that debuted with update 3 of the software last year but which apparently still had some rough edges.

You can plow through the release notes on Sun Grid Engine 6.2 update 5 here and download the software there. Sun offers commercial support for Grid Engine, but pricing was not available at press time.

Sun does not offer support for Hadoop as far as El Reg knows, but Cloudera certainly does for its variant. It won't be long before Oracle-Sun cook up a Cloudera-Grid Engine partnership. Oracle may even snap up Cloudera before lunch some day. ®
