IT'S ALIVE! IT'S ALIVE! Google's secretive Omega tech just like LIVING thing

'Biological' signals ripple through massive cluster management monster

A mind of its own? It seems that way. Just ask open-source Mesos

Already, researchers at the University of California at Berkeley have taken tips from Google to create their own variant called Apache Mesos, which is an open-source Google Borg clone running at large web properties such as Twitter and Airbnb.

However, Mesos is also exhibiting strange behaviors.

"Depending on a combination of things like weights and priorities there's a potential reallocation of resources across and around these jobs that has a compounding effect that can exaggerate these non-determinisms," said Benjamin Hindman, VP of Apache Mesos.

"For some jobs that are good at dealing with these non-determinisms [Omega's behavior] is totally fine. For some of these jobs it can mean much decreased latency to finish."

As stated, emergence leaps out of scale. So, while some engineers might like to be given a completely deterministic system, this may soon prove to be impossible for sufficiently large data centers.

Instead, applications will need to be built with all the reliability features that big business needs – such as transaction guarantees, distributed locking, and coherence – but must also be able to run in a sufficiently distributed manner on systems like Mesos and Borg that can tolerate failures without disrupting overall reliability.

"There's two directions to go out here - one is to go out to the system and try and eliminate the non-determinism, the other is tell the software there's inherent non-determinism and program around that," Hindman said.

"While I'd love to tell someone 'your interface is a completely deterministic user interface' oftentimes the cost of doing that is so prohibitive you couldn't do it. You might be able to do something like that for a very particular type or class of apps [but] if you do it for one class of app it could have really bad effects on one other class of app."

First to feel the effects ... What Google faces now, Twitter and Facebook will hit soon enough

All applications need to be built to sustain certain failures or slowdowns or obscure latency scenarios, and not fail. Some companies are already doing this, such as Salesforce with its Keystone system.
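
One common way to program around transient failures is to retry with exponential backoff and random jitter. The sketch below is a generic illustration and is not taken from Keystone, Mesos, Borg, or any other system named here. The function name `call_with_retries` and its parameters are hypothetical, chosen only for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on any exception with jittered backoff.

    Hypothetical helper for illustration: the delay ceiling doubles after
    each failed attempt (capped at `max_delay`), and the actual wait is
    drawn uniformly from [0, ceiling] so that many clients retrying at
    once do not all hit the cluster at the same moment.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0.0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be swapped out in tests. A real application would also want to retry only on errors it knows to be transient, rather than on every exception.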

The job of a system like Omega, or Borg, or Mesos, or even the revamped MapReduce structure of the YARN resource negotiator in Hadoop version 2, is to hide as much of this as possible from the developer straddling the stack. But some programmers will notice when they deploy it at sufficient scale.

"We've had a lot of experience running YARN at scale now," said Arun Murthy, the founder and architect of Hadoop specialist Hortonworks. "YARN cannot guarantee at scale. We're talking about running a million-plus jobs per day – at that point for a given job you might see variation."

This variation could be the placement of replicas for certain jobs, he said. "Today you might get resources in host 1 and tomorrow in host 82."

By exposing some level of non-determinism to the developer, YARN can give assurances that it will make sensible use of compute resources at scale, but on the fringes of sufficiently large clusters, weird things will happen, he admitted.

"It's not an exact science," he said. "What you really need is at very low cost to the end user good performance in the aggregate."

How we learned to stop worrying and embrace chaos

The unpredictable behavior that systems such as Borg, Omega, Mesos, and YARN can display is a direct result of the number of components within them that all need to jostle for attention.

"My strong belief is that these [emergent properties] manifest in interesting ways in each system as you scale up – I mean, really scale up to 5,000-plus nodes," said Arun Murthy of Hortonworks.

This element of randomness has roots in how we've built low-level components of infrastructure systems in the past.

"There's an emergent behavior that comes out," said Hindman of the Apache Mesos project. "There's all sorts of reasons for that. When it gets to large scale there's a combination of the fact that machine failures now at a large scale can change the property of the job whereas at the smaller scale there wasn't probability of machine failures as much, the second one is there's a lot of other non-determinism in and around the job."

In the past, similar behaviors have been seen in the way garbage collectors work in Java virtual machines, he said. "All of a sudden now you'll get weird things going on like things in the JVM will make those weird behaviors develop... a lot of this stuff starts to creep up at larger scale."
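
Hindman's point that machine failures start to shape a job's behavior at scale can be made concrete with a little arithmetic. The sketch below assumes every machine fails independently and with the same probability while a job runs, which is a simplification. The function name `prob_any_failure` and the 0.1 per cent failure rate are illustrative assumptions, not figures from Google, Mesos, or YARN.

```python
def prob_any_failure(machines: int, p_fail: float) -> float:
    """Chance that at least one of `machines` machines fails during a job.

    Assumes independent failures with identical probability `p_fail`,
    so the chance that none fail is (1 - p_fail) ** machines.
    """
    if machines < 0:
        raise ValueError("machines must be non-negative")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    return 1.0 - (1.0 - p_fail) ** machines


if __name__ == "__main__":
    # Illustrative per-machine failure probability; an assumption only.
    for n in (10, 100, 1000, 5000):
        print(f"{n:>5} machines: {prob_any_failure(n, 0.001):.3f}")
```

Under these assumptions a job spread over ten machines almost never sees a failure, while one spread over 5,000 machines almost always does, even though no individual machine has become any less reliable.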

Hindman finds another example in the behavior of any highly concurrent parallel system with numerous cores running hundreds of threads. "You'd see a lot of interestingly similar behaviors. Just based on the Linux thread scheduler, the I/O thread scheduler these types of systems often have a lot of the same non-determinism issues but it's compounded because we have many, many layers of this."

Because systems such as YARN, Omega, Borg, Mesos, and so on, are designed to run thousands and thousands of tasks, with vast amounts of network chatter and I/O events, and to run apps over time periods that vary from milliseconds to months, the chance of some of this underlying randomness becoming exposed and having a knock-on effect on high-level tasks is much, much higher.

Over the long term, approaches like this will make widely deployed intricate tangles of software much more reliable, because they will force developers to design their apps to deal effectively with the shifting, quicksand-like hardware pools that their code lives on top of. As developers program applications to deal with failures at this scale, software will become more like biological systems, with the redundancy and resiliency that implies.

It reminds us of what Urs Hölzle, Google's senior director of technical infrastructure, remarked a couple of years ago: "At scale everything breaks no matter what you do and you have to deal reasonably cleanly with that and try to hide it from the people actually using your system."

With schedulers such as Borg and Omega, and community contributions from Mesos or YARN, the world is waking up to the problems of scale.

Instead of fighting these non-determinisms and rigidly dictating the behavior of distributed systems, the community has created a fleet of tools to coerce this randomness into some semblance of order. In doing so, it has figured out a way to turn the randomness and confusion that lurks deep within any large, sophisticated data center from a barely seen cloud-downing beast into an asset that forces apps to be stronger, healthier, and more productive. ®
