UV 2: RETURN of the 'Big Brain'. This time, it's affordable

Counting up the cores and threads

The UV 1000 could span as many as 2,560 cores and 5,120 threads using the ten-core E7 chips, but the Linux kernel tops out at 4,096 threads at the moment, so that was as far as the thread count could be pushed. That limit has not changed in the Linux kernel, so a full-on 4,096-core UV 2000, if SGI ever built one, would hit that ceiling with one thread per core and would not be able to take advantage of HyperThreading, which provides two virtual threads per core.
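The thread arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the article; the function name `usable_threads` and the constant name are made up for the example and are not part of any SGI or Linux tooling.

```python
# Kernel thread ceiling as quoted in the article (an input, not queried from a system).
LINUX_MAX_THREADS = 4096

def usable_threads(cores, threads_per_core=2, kernel_limit=LINUX_MAX_THREADS):
    """Return how many hardware threads the kernel can actually schedule."""
    hardware_threads = cores * threads_per_core
    return min(hardware_threads, kernel_limit)

# UV 1000 at full scale: 2,560 cores, 5,120 hardware threads, capped by the kernel.
print(usable_threads(2560))  # 4096

# Hypothetical 4,096-core UV 2000: the cores alone already use up the limit,
# so the second HyperThreading thread per core goes unused.
print(usable_threads(4096))  # 4096
```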

Bill Mannel, vice president of product marketing at SGI, tells El Reg that the Linux kernel usually gets all of the features to support the future NUMAlink interconnect between 9 and 12 months ahead of launch, so the current Linux distros already can run on the new machine. Red Hat Enterprise Linux 6 and SUSE Linux Enterprise Server 11 are supported right out of the box (er, right from the download), and presumably SGI will soon offer certification on Windows Server 2008 R2 and is working on support for the forthcoming Windows Server 2012, due later this year or early next. SGI has been making a big deal in the past year that it can push Windows Server 2008 R2 to its limit of 256 cores without breaking a sweat. It will be interesting to see if Microsoft and SGI will patch Windows to do a better job scaling across UV 2000s and at least try to compete with Linux on this machine.

The UV 2000 has basically twice the cores and supports four times the global shared memory as the UV 1000 it replaces, and the local read latency on a node is 80 nanoseconds on the UV 2000 compared to 130 nanoseconds on the UV 1000. The full read latency from distant nodes is under 1 microsecond for both machines – you have more than double the bandwidth, but you also have four times the nodes to span. The UV 1000 could deliver around 6 teraflops per rack, and the UV 2000 delivers 11 teraflops per rack. (This assumes top bin parts in both cases, presumably.) The machine has 4TB/sec of aggregate I/O bandwidth across its PCI-Express 3.0 slots, too.

Both the UV 1000 and UV 2000 machines can scale out beyond their 128 and 256 node limits (that's the production scalability on the machine). What you do is use InfiniBand to link a bunch of UV blades together in a fat tree configuration, and then use the NUMAlink interconnect to lash those clusters together into a larger cluster that has globally addressable memory but not tightly coupled shared memory. You can do a maximum of 16,384 sockets across 128 racks with such a monster configuration, which would give you around 1.41 petaflops of number-crunching power and 8PB of addressable memory.
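As a rough check on those maximum-configuration figures, the sketch below multiplies out the numbers quoted in the article. The per-socket memory figure is derived arithmetic under the assumption of binary units (1PB = 1,024 x 1,024GB), not a specification from SGI, and the function names are illustrative only.

```python
# Figures quoted in the article.
RACKS = 128
SOCKETS = 16_384
TERAFLOPS_PER_RACK = 11        # UV 2000 per-rack peak, top-bin parts
ADDRESSABLE_MEMORY_PB = 8

def peak_petaflops(racks, teraflops_per_rack):
    """Aggregate peak across all racks, converted from teraflops to petaflops."""
    return racks * teraflops_per_rack / 1000

def sockets_per_rack(sockets, racks):
    return sockets // racks

def memory_per_socket_gb(memory_pb, sockets):
    # Assumption: binary units, so 1PB = 1024 * 1024 GB.
    return memory_pb * 1024 * 1024 // sockets

print(round(peak_petaflops(RACKS, TERAFLOPS_PER_RACK), 2))   # 1.41
print(sockets_per_rack(SOCKETS, RACKS))                      # 128
print(memory_per_socket_gb(ADDRESSABLE_MEMORY_PB, SOCKETS))  # 512
```

The 1.41 petaflops figure is simply 128 racks at 11 teraflops apiece.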

Price point

What has SGI excited about the UV 2000 is not just the increased processing and memory scalability, but the lower cost of the machines compared to the UV 1000. A base UV 1000 node with two eight-core Xeon E7s and 32GB of memory cost $50,000, but Mannel says that SGI can put a UV 2000 node with two Xeon E5-4600s and 32GB into the field for $30,000. That's a 40 per cent price cut, and that will go a long way toward expanding the addressable market of the UV machines if all of the other parts of the machine (extra routers and so on) don't add too much to the cost of a configured system.

SGI will be stressing the memory bandwidth and capacity of the UV 2000 compared to big SMP servers and to flash memory arrays. SGI says that a ProLiant DL980 eight-socket from Hewlett-Packard with 80 cores (Xeon E7) running at 2.26GHz and with 1TB of main memory will cost $93,000 and get you around 7.5 gigaflops for every thousand bucks you spend on the machine. A 128-core UV 2000 with 2.6GHz Xeon E5-4600s and 1TB of memory will cost you $98,000, but you will get over 14 gigaflops for every grand you spend. As for the comparison with flash, SGI says that two Dell rack servers with 1.2TB of high-end Fusion-io flash memory will give you a read/write bandwidth in the range of 2.5GB/sec to 3GB/sec at a latency of between 15 and 47 microseconds, but if you instead use four UV 2000 node enclosures (that's eight CPU sockets) equipped with 1TB of memory, you get a read/write bandwidth of around 236GB/sec and a latency of between 100 and 500 nanoseconds. That's 100 times the performance and 35 times better bang for the buck for the UV 2000 nodes, says Mannel. That is also another way of saying that the UV 2000 nodes are three times as expensive, but you'd expect that when comparing main memory and flash devices.
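The "three times as expensive" remark falls out of dividing the performance ratio by the bang-for-the-buck ratio. Here is a minimal sketch of that arithmetic, along with the total gigaflops implied by the per-thousand-dollar figures; all inputs are SGI's claims as reported above, and the function names are hypothetical.

```python
def implied_cost_ratio(performance_ratio, price_performance_ratio):
    """If A is P times faster and Q times better per dollar, A costs P/Q times as much."""
    return performance_ratio / price_performance_ratio

def implied_gigaflops(gigaflops_per_thousand_dollars, price_dollars):
    """Total gigaflops implied by a price and a gigaflops-per-$1,000 figure."""
    return gigaflops_per_thousand_dollars * price_dollars / 1000

# UV 2000 memory versus Fusion-io flash: 100x performance, 35x price/performance.
print(round(implied_cost_ratio(100, 35), 2))   # 2.86, or roughly three times the cost

# ProLiant DL980: about 7.5 gigaflops per $1,000 at $93,000.
print(implied_gigaflops(7.5, 93_000))          # 697.5

# UV 2000: "over 14" gigaflops per $1,000 at $98,000, so this is a lower bound.
print(implied_gigaflops(14, 98_000))           # 1372.0
```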

The other sales tack that SGI will be making is to convince customers that they can do multiple jobs inside of a UV 2000 machine, supporting multiple users doing related work.

Running multiple jobs on the UV 2000

For instance, you can put a block of Nvidia Quadro graphics cards in a chassis in the UV 2000 rack along with some InfiniteStorage arrays and the UV 2000 enclosures and create a machine that can do all aspects of virtual product design inside the complex. This includes preprocessing, mesh generation, and model decomposition for the design, running solver programs to do the design, and then post-processing and visualization to actually see the design. SGI is allowing customers to plug Intel's forthcoming "Knights Corner" MIC x86 coprocessors into the chassis as well as the Tesla family of GPU coprocessors, but has not gone so far as to link one Xeon socket to one GPU like other system designs are doing.

SGI is already shipping one UV 2000 system to a customer in the United States (a "well known auto company") and is working on manufacturing a couple more that will soon ship to customers in Europe, including the UK Computational Cosmology Consortium at Cambridge University where physicist Stephen Hawking gets his paycheck.

In addition to the UV 2000 high-end machine, SGI is also kicking out a four-socket, 2U rack server called the UV 20 that is intended to be a development machine for the UV 2000. ®
