App Engine: Google's deepest secrets as a service

The software scales. But will the Google rulebook?

Internet Security Threat Report 2014

Er, where's MySQL?

For the average developer, one of the biggest challenges is writing for App Engine's Datastore API. On App Engine, you can't write to the file system. You have to use the Datastore API, the Memcache API, or other services provided by Google, and the Datastore API – based on BigTable – uses a data model vastly different from the relational databases most coders are used to.

"It takes some wrestling from a design standpoint, because it tends to want you to design your back end in a de-normalized sort of way," says Will Merydith, who's using App Engine to build a gaming platform known as SuperKablamo. "Unlike a MySQL environment, where you can do all these really interesting joins, you can't really do that with Datastore."

But you can scale. And with the rise of open source databases such as MongoDB and HBase, which grew out of Google's BigTable research paper, many developers are already adjusting to the non-relational mindset. "Two or three years ago, people said: 'Moving off relational? That's a pretty high hurdle. I don't think that's going to happen'. But now it's definitely happening," says Dwight Merriman, the CEO and cofounder of 10gen, MongoDB's chief steward.

"There is a pretty big hurdle to clear," he says. "Everybody learned relational in school and there's a lot of tooling around it that already exists and there's a lot of legacy code that uses it. ... But what has helped people get over the hurdle is the scaling imperative. Computer architectures are changing, and in the cloud we need horizontal scaling. Really, there's no choice."

Sean Lynch believes that if you've used MongoDB or HBase, you'll take to the Datastore API rather easily. "While it is close to a couple of competitors out there in terms of functionality, it's not identical, so there are things that would be specific to App Engine when you're using something like the Datastore," he says. "But it wouldn't be a big conceptual leap if you were moving from MongoDB or HBase to App Engine. The model should feel very much the same."

But even if you're steeped in this sort of non-relational data model, App Engine takes some getting used to. You can't throw any old code onto App Engine. You can use Python and many languages that compile to Java byte code, including Java, JRuby, Jython, Scala, and Groovy. And now there's an SDK for Go. But that's it. And you can only use certain Python libraries and only certain Java byte code languages, libraries, and frameworks.

What's more, App Engine restricts you to a tight sandbox. Your application can only access other net machines through App Engine's URL fetch, email, and XMPP services. Other machines can only connect to your application via HTTP requests on the standard ports. And an application code only runs in response to a web request, a queued task, or a scheduled task. An app must return a response to a web request within 30 seconds, while tasks (queued or scheduled) have ten minutes to complete.

These restrictions are in place, Lynch says, so that applications can scale, but also for security reasons. "There are some limitations around how long the code will run, and you can't use sockets. If you're used to just dropping in some code to call some socket somewhere, that may not work," he says. "But this is a balancing act. Do you give as much flexibility as possible or do you bake in security from the get-go? It's much easier to build in the security from the base layer and build from there, than it is to try and add the security layer later.

"The other nice thing is that we've been able to scale up over the last couple of years with a very high confidence that things aren't affecting each other when they're scaling. When they're moving around, there are not ways you could touch someone else's memory."

He adds that, as this confidence builds, Google is "peeling back" some restrictions. With App Engine version 1.4.3, it added the ability to do concurrent requests with Java applications (concurrency for Python is on the way). And in version 1.5.0, it introduced "Backends" – long-running, high-memory instances. These have a 24-hour request deadline, and they can use between 128MB and 1GB of memory and a proportional amount of CPU power.

Dwight Merriman

Dwight Merriman

Previously, says developer Jeff Schnitzer, he couldn't run an in-memory index on the service. "You have these little instances that are limited to about 100MB of RAM, and you don't really have a lot of control over how many of them are, and you can't individually address them. For a whole variety of reasons, you just don't have access to large quantities of RAM," he explains. But with Backends, such an index is possible.

The added catch is that – at least for now – Backends are too expensive for Schnitzer's purposes. He has already moved his in-memory index to Amazon. For Schnitzer, at least, App Engine's limitations ensure that applications often spill over into other clouds – or local servers. "I think this is something that anyone who spends a lot of time with app engine realizes," he says. "App Engine isn't really a complete set of tools. You have to have parts of an app in App Engine and parts in other places."

Top 5 reasons to deploy VMware with Tegile

More from The Register

next story
Docker's app containers are coming to Windows Server, says Microsoft
MS chases app deployment speeds already enjoyed by Linux devs
Intel, Cisco and co reveal PLANS to keep tabs on WORLD'S MACHINES
Connecting everything to everything... Er, good idea?
SDI wars: WTF is software defined infrastructure?
This time we play for ALL the marbles
'Urika': Cray unveils new 1,500-core big data crunching monster
6TB of DRAM, 38TB of SSD flash and 120TB of disk storage
Facebook slurps 'paste sites' for STOLEN passwords, sprinkles on hash and salt
Zuck's ad empire DOESN'T see details in plain text. Phew!
'Hmm, why CAN'T I run a water pipe through that rack of media servers?'
Leaving Las Vegas for Armenia kludging and Dubai dune bashing
Windows 10: Forget Cloudobile, put Security and Privacy First
But - dammit - It would be insane to say 'don't collect, because NSA'
Oracle hires former SAP exec for cloudy push
'We know Larry said cloud was gibberish, and insane, and idiotic, but...'
prev story


Forging a new future with identity relationship management
Learn about ForgeRock's next generation IRM platform and how it is designed to empower CEOS's and enterprises to engage with consumers.
Win a year’s supply of chocolate
There is no techie angle to this competition so we're not going to pretend there is, but everyone loves chocolate so who cares.
Why cloud backup?
Combining the latest advancements in disk-based backup with secure, integrated, cloud technologies offer organizations fast and assured recovery of their critical enterprise data.
High Performance for All
While HPC is not new, it has traditionally been seen as a specialist area – is it now geared up to meet more mainstream requirements?
Saudi Petroleum chooses Tegile storage solution
A storage solution that addresses company growth and performance for business-critical applications of caseware archive and search along with other key operational systems.