Original URL: https://www.theregister.co.uk/2013/04/02/nebula_one_openstack_controller_appliance/

Ex-NASA OpenStackers launch Nebula cloud control freak appliance

Forget OpenStack software disties, says OpenStack co-founder Kemp

By Timothy Prickett Morgan

Posted in Cloud, 2nd April 2013 07:02 GMT

Chris Kemp, the former NASA CTO who helped build the wonking Nebula infrastructure cloud for the US space agency and the techie from the NASA side who spearheaded the development of OpenStack along with Rackspace Hosting, knows about as much about control-freaking clouds with OpenStack as anyone else on the planet – and that's why he founded a company called Nebula that seeks to make private clouds easier to build and operate.

After 18 months of development and just ahead of the rollout of the "Grizzly" release of OpenStack, Kemp's Nebula is ready to bring its OpenStack control-freak appliance to market. The machine, which Kemp calls "the cloud computer" and says is a "completely new kind of computing system," has one simple purpose: to make a private cloud something that you plug in and turn on rather than build.

The Nebula One Cloud Controller, technically known as the CTR-1500, is a bit more dense and more capable than the prototype that Kemp was showing off when he jumped ship from NASA to start Nebula in July 2011. The production machine does not push the scalability limits inherent in the Nebula One design, but rather starts out with a fairly modest private-cloud setup and gives Nebula a chance to ramp up its sales and support organization to meet demand for ever-larger private clouds based on OpenStack.

The Nebula One appliance marries a 10 Gigabit Ethernet switch with an x86 server equipped with a hardened and complete OpenStack controller software stack, all sealed up and pretuned to work with specific servers from HP, Dell, and IBM. Just as Cisco Systems has mashed up network switching and systems management inside a switch for its Unified Computing System blade servers, Nebula is mashing up switching and OpenStack into a single controller that can provision server, storage, and networking slices for virtual machines and launch them into production.

The plan a year and a half ago was to try to encourage the use of Nebula One appliances with open source servers based on the Open Compute Project designs, but that desire was a bit ahead of market reality, and more importantly, companies like Facebook do not virtualize their servers to begin with and hence have no use for an OpenStack control freak.

Hyperscale cloud operators may start virtualizing at some point if it helps them with system management or other operational aspects of their iron, but their workloads operate at a very different scale from the rest of the IT community, and they have other ways of dealing with moving workloads around their systems.

The prototype 4U Nebula OpenStack appliance from July 2011

The Nebula One controller fits in a 2U chassis (twice as dense as planned) and has its 48 10GE ports pointing out of the rear of the chassis instead of the front, as in the original design. Kemp tells El Reg that the company has chosen Intel's Fulcrum ASIC for its switch, although he was cagey about which one.

The appliance also has two Opteron G34 sockets – see, people still use x86 chips from Advanced Micro Devices – and while Nebula isn't being precise about which chips fill them, it does say that it has two 1.6GHz processors, each with 16MB of L3 cache and 85-watt thermals. Looking across all the Opteron 6300 and 6200 possibilities, that means the company has chosen the Opteron 6262 HE low-voltage part.

The appliance has 64GB of main memory plus a 32GB SuperCache MLC mSATA flash drive to cache frequently used OpenStack files. The appliance also has a 256GB 2.5-inch MLC solid state drive, an old-fashioned 1TB 7200rpm disk drive to store log files and other infrequently accessed data, and two 650-watt power supplies.

The Nebula One cloud controller appliance

The iron is interesting, of course, because it shows what smart OpenStack people think is sufficient iron to run an OpenStack controller. But the software that Nebula has cooked up is the really important bit, says Kemp.

Nebula starts out with a base Linux operating system and puts OpenStack on it plus the KVM hypervisor to create what it calls the Cosmos cloud operating system. This is not just any old OpenStack, but one which Nebula programmers – many of whom worked on the "Nova" compute controller at NASA and then on the OpenStack project proper – have ginned up with a homegrown set of user interfaces called Resource Manager.

Cosmos is based on the current "Folsom" release of OpenStack, but has backports of features from the Grizzly release already tested and pulled into the Nebula release. Nebula does this by keeping preconfigured clouds based on certified hardware from HP, Dell, and IBM built and running in its labs, and because Nebula's coders are so well acquainted with OpenStack, they know when to pull a new feature into testing and then roll it into production.

"The Nebula releases will be completely independent of the OpenStack releases," explains Kemp, because the company wants to keep control of the pace of innovation it rolls out, getting features out as soon as they are ready, not once every six months.

This is precisely the way Red Hat made Enterprise Linux commercial-grade in the early years: by backporting features from a future Linux kernel into a current one. Sometimes, you can't wait for the community.

Speaks OpenStack and Amazon APIs

The other interesting bit about the Nebula controller and its Cosmos cloud operating system is that it supports both the OpenStack API set (Folsom with a sprinkling of Grizzly) and the Amazon Web Services API stack. Rackspace and other OpenStackers have been trying to back away from the AWS APIs, and that is one of the reasons why Citrix Systems made a break with OpenStack and put all of its weight behind CloudStack, which also supports the AWS APIs.

Here's what this Amazon API support means practically speaking: if you have a workload running on a Nebula One cloud and you want to move it to AWS, you point a tool like RightScale at it and it can capture that VM and spin it up out on the public cloud. Or, vice versa, you can use RightScale to move a VM from AWS to your private cloud. (There's some rejiggering necessary to convert from the Xen format to the KVM format, of course.)

To build a cloud, you buy a Nebula One controller and you put twenty server nodes (again, from a selected hardware compatibility list that Kemp says covers the vast majority of enterprise-server buyers) into a rack. You run two 10GE links to each server node, which are used for cloudy server traffic to link the nodes together into a compute and storage pool. To manage the nodes, you pop in a 24-port Gigabit Ethernet switch (there are a few that are certified) that is used by the Nebula One to reach into the server nodes and boss them around.
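As a rough sanity check on that rack layout, the port budget can be tallied in a few lines of Python. This is only a sketch: it assumes both 10GE links from each node land on the controller's 48-port embedded switch and that each node takes one port on the management switch, neither of which Nebula has spelled out. The names are invented for illustration.

```python
# Port budget for one Nebula One rack, using the figures quoted in the article.
NODES_PER_RACK = 20
DATA_LINKS_PER_NODE = 2      # two 10GE links to each server node
CONTROLLER_10GE_PORTS = 48   # embedded switch in the controller
MGMT_SWITCH_PORTS = 24       # separate Gigabit Ethernet management switch
MGMT_LINKS_PER_NODE = 1      # assumption: one management port per node


def port_budget(nodes=NODES_PER_RACK):
    """Return used and free port counts for a rack with `nodes` servers."""
    data_used = nodes * DATA_LINKS_PER_NODE
    mgmt_used = nodes * MGMT_LINKS_PER_NODE
    return {
        "10ge_used": data_used,
        "10ge_free": CONTROLLER_10GE_PORTS - data_used,
        "mgmt_used": mgmt_used,
        "mgmt_free": MGMT_SWITCH_PORTS - mgmt_used,
    }


if __name__ == "__main__":
    print(port_budget())
```

On those assumptions, a full rack uses 40 of the 48 10GE ports for server nodes and leaves eight for everything else.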

When you fire up the Nebula One, it reaches out over its embedded switch and provisions KVM on the bare metal, then makes the raw server and storage capacity available for provisioning of virtual machines from the Resource Manager. By the way, that Resource Manager is not based on the Horizon project that is part of OpenStack, but is rather something created by Nebula for itself – and no, it will not be open sourced.

You can run a cloud with a single Nebula One controller, but the system was designed to have multiple controllers for high availability and resiliency, says Kemp. The Cosmos operating system can currently span as many as five controllers in a single OpenStack controller domain and automatically load-balances work across controllers and the five racks of servers attached to them. With those five racks, you can have on the order of 2,500 cores and 5PB of storage, depending on the servers you pick.

A five-rack OpenStack cloud controlled by Nebula One

Back when Nebula, the company, launched a year and a half ago, Kemp told El Reg that the Cosmos software (which did not have that name at the time) would allow for up to 1,024 controllers to be daisy-chained together for something on the order of 24,576 server nodes and around 300,000 virtual machines under management.

You have to remember that Kemp originally set a very tall order for OpenStack when it launched, which was for it to be able to control freak over one million host systems and control something on the order of 60 million virtual machines. It is going to take some time to get there, clearly.
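To put those scaling targets in perspective, here is a quick sketch of the ratios they imply. The inputs are the figures quoted above, the 300,000 is itself only "around", and the variable names are ours rather than Nebula's.

```python
# Ratios implied by the 2011 scaling figures for the Cosmos software.
controllers = 1_024
server_nodes = 24_576
virtual_machines = 300_000   # "around 300,000", so treat as approximate

nodes_per_controller = server_nodes / controllers
vms_per_node = virtual_machines / server_nodes

# Kemp's original design goal for OpenStack itself.
goal_hosts = 1_000_000
goal_vms = 60_000_000
goal_vms_per_host = goal_vms / goal_hosts

if __name__ == "__main__":
    print(f"nodes per controller: {nodes_per_controller:.0f}")
    print(f"VMs per node:         {vms_per_node:.1f}")
    print(f"goal VMs per host:    {goal_vms_per_host:.0f}")
```

That works out to 24 nodes per controller in the 2011 plan, against the twenty per rack in the shipping product, and roughly a dozen VMs per node, against the 60 per host implied by the original OpenStack goal.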

The Cosmos software also includes a feature called Cloud Edge, which makes a cluster of Nebula One control freaks look like one wonking Layer 3 device to your IP networks, with a bunch of 10GE pipes connecting your private cloud upstream to the network backbone. If you add up all the upstream pipes, you can get around 128Gb/sec of upstream bandwidth out of five racks, which Kemp says is an order of magnitude better than you can get out of clusters built on AWS.

The other thing that Cosmos knows how to do is something that companies are going to be very thrilled to hear about: one-button rolling upgrades of firmware, hypervisor, and OpenStack. Because Nebula is keeping tight control of which servers can be used with the Nebula One controller, it can automate the way that server firmware gets patched on those machines.

So what would you pay for such a cloud control freak appliance? How does $100,000 grab you?

Specifically, for $100,000 you get a Nebula One controller that is licensed to control-freak five server nodes, and each additional node costs somewhere between $5,000 and $10,000 depending on the features you activate in the controller. That $100,000 includes the first year of support; after that, support costs 20 per cent per year.

"The whole system is less than you will pay an employee to install and manage OpenStack," says Kemp, aware that this is a pretty high sticker price for a 2U appliance server. But then again, NetScaler and other WAN optimizing appliances that do very specific jobs are just as costly.

It is hard to back out what costs what in the Nebula One controller the way Nebula is talking about it, but let's do a little math.

The box includes a 48-port 10GE switch, which is worth somewhere between $12,000 and $20,000, depending on what you want to compare it to. Call it $15,000 plus $3,000 a year for maintenance, just as a guess at the value of the switch inside the controller. That cost of $5,000 to $10,000 per server node under management seems high. Even at $5,000 per node, that is worth $25,000 plus another $5,000 for maintenance for the five nodes licensed with the base Nebula One machine.

Back out the network cost, the node license, and their maintenance, and that leaves $52,000 to cover the OpenStack license and its support. (That would be $43,000 for a license and $9,000 for support to make the math work.) Now, build it up to a full rack for 20 nodes, and you are talking about shelling out anywhere from $175,000 to $250,000 to Nebula.
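For anyone who wants to poke at those guesses, the back-of-envelope math can be written out as a short Python sketch. The switch value and maintenance figures are El Reg's estimates from the paragraphs above, not Nebula's price list, and treating node maintenance as 20 per cent of the node license is our reading of the $5,000 figure. The function names are made up for this example.

```python
# Back-of-envelope breakdown of the $100,000 Nebula One base price.
BASE_PRICE = 100_000     # controller, five node licenses, first-year support
SWITCH_VALUE = 15_000    # guessed value of the embedded 48-port 10GE switch
SWITCH_MAINT = 3_000     # guessed first-year switch maintenance
BASE_NODES = 5           # nodes licensed with the base machine
RACK_NODES = 20          # nodes in a full rack


def residual_for_openstack(node_price=5_000, node_maint_pct=20):
    """What is left for the OpenStack license and support after backing
    out the switch, the node licenses, and their maintenance."""
    node_license = BASE_NODES * node_price
    node_maint = node_license * node_maint_pct // 100
    return BASE_PRICE - SWITCH_VALUE - SWITCH_MAINT - node_license - node_maint


def full_rack_price(node_price):
    """Base price plus licenses for the nodes beyond the first five."""
    return BASE_PRICE + (RACK_NODES - BASE_NODES) * node_price


if __name__ == "__main__":
    print("Residual for OpenStack:", residual_for_openstack())
    print("Full rack, low end:    ", full_rack_price(5_000))
    print("Full rack, high end:   ", full_rack_price(10_000))
```

Change the switch guess or the per-node price and the residual moves with it, which is rather the point: the split is only as good as the guesses going in.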

This is still a bit pricey, El Reg reckons, and will rival the cost of very, very fat server configurations in the rack.

But even at that price, with a little discounting and a lot of fast talking about how Nebula One mitigates the risk of using OpenStack and makes it easy to consume, Nebula is going to find some buyers. Quite possibly NASA, for one.

The issue is going to be convincing data centers to let go of their switch preferences, which they are very loath to do. And once it gets some traction, the company is going to have to think about offering lower-priced appliances that can help it go more mainstream before someone else steals the appliance idea.

It is amazing that a switch maker has not embedded OpenStack inside the switch on an x86 coprocessor already, really. ®