
EMC Greenplum Hadoop elephant straddles Cisco iron

Cah. Took them long enough


Well, that took long enough. Cisco Systems and the Greenplum big data unit of server partner EMC have finally gotten together and put the Greenplum wares on Cisco's Unified Computing System servers.

In a blog posting, Raghunath Nambiar, an architect at Cisco's Server Access and Virtualization Technology Group, reveals that the two partners in the Virtual Computing Environment Company have circled back and are now offering pre-configured Hadoop stacks that marry Cisco's C Series rack servers and Greenplum's eponymous Greenplum MR Hadoop distribution.

Greenplum doesn't like to talk about the hardware its data warehousing and Hadoop clusters run upon, mainly because EMC, as an independent disk array maker and the owner of server virtualization juggernaut VMware, has to position itself as Switzerland in the server racket. Before it was acquired by EMC in July 2010 for an undisclosed sum, Greenplum had run its heavily customized implementation of the PostgreSQL database, which was parallelized and juiced to run data warehouse clusters, on Sun Fire x86 servers from Sun Microsystems. This was a good choice at the time, given the large amount of disk capacity that Sun had crammed onto its Opteron and Xeon servers, but a bad choice in the long term because database rival Oracle ate Sun. In the wake of Oracle's acquisition of Sun, Greenplum has certified its code to run on Dell, Hewlett-Packard, and Huawei Technologies x86 servers and OEMs this iron from those companies, depending on what customers want.

EMC did not, interestingly enough, plunk the Greenplum Modular Data Computing Appliance data warehouse or its Hadoop appliance, which is actually based on a rebadged Hadoop stack from MapR Technologies, on the Vblock server-storage clusters it cooked up with Cisco to chase server virtualization and private cloud business in data centers and now virtual desktops. While the B Series blade servers in the UCS family may not be suitable for Greenplum workloads, the C Series rack servers could certainly be configured in a Vblock by EMC and Cisco to run this Greenplum code, but were not.

Part of the problem was that Hadoop doesn't use external storage, so there would be no EMC iron in such a Vblock. It is very likely that EMC and Cisco were waiting for Cisco to get a little more traction in the server racket – Cisco's server business now has more than 10,000 customers and a $1bn annual revenue run rate that will probably nearly double in the next year – before committing the Greenplum wares to the UCS platform.

According to Nambiar, the fully integrated Cisco-EMC stack takes Cisco's UCS C Series rack servers and its UCS 6200 converged server-storage 10GE switches and fabric interconnects and configures up the Greenplum MR Hadoop distro to run on the boxes. (This Hadoop distro is MapR's M5 Hadoop distribution with the names changed.) The setups start at a single rack and can be expanded to cover multiple racks. The UCS 6200 switch links into UCS 2200 fabric extenders, and according to the reference architecture (PDF), the UCS C210 M2 server is the workhorse that Cisco and EMC have chosen to run Hadoop. The C210 M2 server was announced in March 2010 and is a two-socket box that uses Intel's six-core Xeon 5600 processors and will no doubt be replaced by a new machine using Intel's "Sandy Bridge-EP" Xeon E5 chip. The C210 M2 can support up to 192GB of DDR3 main memory and has room for 16 2.5-inch disk drives and one or two RAID disk controllers.

The Cisco UCS-Greenplum Hadoop stack

In a single-rack configuration, the Greenplum MR-UCS stack has two 48-port UCS 6248UP fabric interconnects and two 2232PP 10GE fabric extenders. These link down into 16 of the C210 M2 servers, which have 96GB of main memory and 16 1TB disk drives, an LSI MegaRAID 9261-8i disk controller, and a Cisco UCS P81F virtual interface card that presents two 10GE ports to the fabric extenders. Cisco is dropping in the six-core Xeon X5670 processors, which run at 2.93GHz. Each rack has 192 cores, 256TB of raw storage capacity, and up to 350TB of usable Hadoop capacity with three-way data replication across the nodes and data compression turned on. The nodes are configured with Red Hat Enterprise Linux Standard Edition. ®
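The per-rack figures quoted above can be checked with a little arithmetic. The Python sketch below uses only the component counts given in the article. The constant and function names are made up for illustration, and the implied compression ratio is an inference from the 350TB figure, not a number that Cisco or EMC has stated.

```python
# Per-rack component counts as quoted in the article.
SERVERS_PER_RACK = 16      # UCS C210 M2 nodes
SOCKETS_PER_SERVER = 2     # two-socket box
CORES_PER_SOCKET = 6       # six-core Xeon X5670
DISKS_PER_SERVER = 16      # 2.5-inch drives
DISK_TB = 1                # 1TB per drive
PORTS_PER_SERVER = 2       # 10GE ports on the P81F card
REPLICATION_FACTOR = 3     # three-way data replication
QUOTED_USABLE_TB = 350     # "up to" figure, with compression on


def rack_summary():
    """Derive the single-rack totals from the per-node counts."""
    cores = SERVERS_PER_RACK * SOCKETS_PER_SERVER * CORES_PER_SOCKET
    raw_tb = SERVERS_PER_RACK * DISKS_PER_SERVER * DISK_TB
    server_links = SERVERS_PER_RACK * PORTS_PER_SERVER
    # Capacity left once every block is stored three times, before compression.
    replicated_tb = raw_tb / REPLICATION_FACTOR
    # Assumption: the compression ratio needed to reach the quoted usable figure.
    implied_compression = QUOTED_USABLE_TB / replicated_tb
    return {
        "cores": cores,
        "raw_tb": raw_tb,
        "server_links_10ge": server_links,
        "replicated_tb": replicated_tb,
        "implied_compression": implied_compression,
    }


if __name__ == "__main__":
    for name, value in rack_summary().items():
        print(f"{name}: {value}")
```

The core and raw storage totals match the article's 192 cores and 256TB. Three-way replication alone would leave about 85TB, so the "up to 350TB" usable figure works out to a compression ratio of a little over 4:1, which will depend heavily on the data being stored.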
