Huawei unveils bigger iron KunLun server at CeBIT

A bigger splash from big freaking box of a server, with up to 32 CPUs in a rack


Huawei has unveiled a more powerful version of its top-end KunLun server at CeBIT, amongst a raft of other big iron-ish hardware and software announcements.

KunLun is Huawei’s big freaking box of a server, with up to 32 CPUs in a rack. This V5 edition is basically a Xeon SP refresh with NVMe drive support. This is the latest move in Huawei's Skylake refresh of its server line.

The new hardware is intended for the top-end niche of the x86 server market, the so-called mission-critical bit. This scalable box is built from up to 4 SCE sub-enclosures, each with 2 to 8 processors and 12TB of DDR4 memory (96 DIMMs), plus one CME central management chassis. There are three models: the 9008 with one 8-CPU chassis, the 9016 with two and the 9032 with four. The 9008 and 9016 are upgradable to 32 CPUs.


KunLun 9008 (left), 9016 (middle) and 9032 (right)

The processors are either Xeon Platinum 8100 or Gold 6100 parts, with up to 768 cores (with 24-core CPUs) and 32TB memory per node.
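As a quick check on that core count, the arithmetic can be sketched in Python. The 24-core figure and the socket counts come from the model descriptions above; the names used here are purely illustrative.

```python
# Maximum socket counts for the three KunLun V5 models, as described in the article.
MAX_SOCKETS = {"9008": 8, "9016": 16, "9032": 32}

# Cores per CPU behind the article's 768-core figure.
CORES_PER_CPU = 24


def max_cores(model: str, cores_per_cpu: int = CORES_PER_CPU) -> int:
    """Return the core count of a fully populated model."""
    return MAX_SOCKETS[model] * cores_per_cpu


if __name__ == "__main__":
    for model in MAX_SOCKETS:
        print(model, max_cores(model))
```

A fully populated 9032 gives 32 x 24 = 768 cores, matching the quoted maximum.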

The system can be physically partitioned into 4-CPU groups or there can be up to 40 x 1-96 CPU core logical partitions per host. There are PCIe expansion slots, 2 x 10GbitE SFP+ ports and 2 x GE RJ45 ports.
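To make the logical partitioning limits concrete, here is a minimal sketch that checks a proposed set of partition sizes against the figures quoted above. The function name is hypothetical, and the rule that partitions cannot share cores is an assumption rather than something Huawei states.

```python
from typing import List

MAX_LOGICAL_PARTITIONS = 40   # per host, from the article
MIN_CORES, MAX_CORES = 1, 96  # per logical partition, from the article


def plan_is_valid(partition_cores: List[int], host_cores: int) -> bool:
    """Check a proposed list of logical-partition sizes against the stated limits.

    Assumption (not from the article): partitions do not share cores, so the
    sizes must sum to no more than the host's core count.
    """
    if len(partition_cores) > MAX_LOGICAL_PARTITIONS:
        return False
    if any(not MIN_CORES <= cores <= MAX_CORES for cores in partition_cores):
        return False
    return sum(partition_cores) <= host_cores
```

Under that assumption, eight 96-core partitions exactly fill a 768-core host, while a ninth would not fit.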

The local storage per SCE box is up to 48 x 2.5-inch SAS/SATA disk drives or 40 x 2.5-inch NVMe SSDs. Capacities are not provided in the datasheet. RAID levels 0, 1, 10, 5, 50, 6 and 60 are supported. Supported operating systems are Red Hat and SUSE Linux, Windows Server and VMware ESX.


Huawei says the KunLun v5 has a multi-layer fault-tolerant architecture with fault-tolerant chips, firmware, and OSs. It is fully redundant with no single point of failure. We’re told it delivers 40 per cent higher performance than RISC servers, but with no details.

Why have a 32-CPU rack server instead of 32 individual rack shelf servers? The short answer is that you can group the CPUs and cores into compute units sized to workloads.

If you have a set of variably sized workloads, which you need to run at differing times and in differing combinations, then a single 8, 16 or 32-CPU server which you can flexibly and dynamically modify into appropriately-sized compute units, through physical and logical partitioning, can make good sense.

This doesn’t make it a composable server in the HPE Synergy/Attala/DriveScale/Liqid sense where you can dynamically define compute, memory and storage groupings from resource pools. How does KunLun compare to other suppliers’ 32-socket class servers?

Some other big iron server suppliers

Cisco doesn’t have a server in this class.

There is a Dell M1000e 10U blade server chassis which can hold 16 blades, each of which can be a 4-socket system using 22-core Xeon E5-4600 v4 CPUs, meaning a total of 64 CPUs and 1,408 cores.

But this is a pre-Skylake system and there is no centrally-managed, 4-chassis system equivalent to the KunLun. The same appears to be true for Fujitsu with its blade server chassis products.

Hitachi had a 12U CB 2500 chassis available in the 2016 era with up to 14 dual-processor server blade modules. There is an 8-socket CB520X blade module with LPAR support. This blade chassis system hasn’t had a Skylake makeover.

HPE has its Integrity rack servers using Itanium processors, and its Superdome Flex which supports 32 Xeon SP sockets and 48TB of memory when 8 x 5U chassis are grouped together.

IBM has, of course, its mainframe systems, and its Power9-based AC922 (Power9 - 22-core) with 2 to 6 Tesla V100 GPUs can be scaled out to a full rack or racks.

Huawei also announced a raft of other new products at its CeBIT extravaganza, such as Internet of Vehicles kit, FusionCloud 6.3 full-stack offerings with more than 40 cloud services, and the FusionInsight LibrA converged data warehouse.

Huawei says this uses parallel computing, mixed row-column storage, and vectorized execution technologies to perform association analysis of trillions of data records in seconds. Big iron hardware and software rules here, okay. ®




Biting the hand that feeds IT © 1998–2018