Feeds

Inside Intel's Haswell: What do 1.4 BEELLION transistors get you?

The brains and the brawn of the next Windows 8 slabs

Top 5 reasons to deploy VMware with Tegile

Order, order, you are out of order, sir!

Each core itself is largely the same as those found in Ivy Bridge processors. Intel has improved the front end - the bit that pulls in the x86 instructions that programs are compiled into before converting them into micro ops, which are the chip’s native instruction set - so it’s better able to anticipate in which direction upcoming branches in the code will take the execution stream, but Intel does this with every new generation of its architecture.

After conversion from x86, the micro ops are juggled into a new order that allows the core’s many instruction-processing engines to kept well fuelled without (hopefully) breaking data dependencies in the original program, which can happen if an out-of-order action changes the value of variable beyond what another instruction was expecting it to be. Haswell has more capacity than its predecessors to sort through the micro ops to see how many can be executed in parallel. It has more core register space for temporary data.

Haswell buffers

Where Ivy Bridge had two, 28-op buffers, one per thread, from which micro ops were routed to free maths units, Haswell cores have one, 56-op buffer with eight output ports rather than six, the better to keep as many micro ops flowing as possible.

The array of available maths units has again been tweaked to accommodate the loading Intel’s modelling of real-world workloads is most likely to require. That’s a moveable feast, of course. New applications and uses may have come into play which the chip engineers didn’t take into account, or program patterns they did anticipate and designed for may have fallen out of fashion.

Doubling the L1 and L2 cache bandwidth, by widening their access ports, and smartening up the core’s ability to cope with cache misses will help here, even though the cache sizes and structure remain unchanged. Meanwhile, Haswell adds new instructions - AVX (Advanced Vector eXtensions) 2 - to help handle multimedia data and the kind of numbers high-performance computing rigs crunch. Intel promises big performance gains in cryptography code, for instance.

Haswell execution units

Another new set of instructions, Haswell’s Transactional Synchronization eXtensions (TSX), help programmers take advantage of the chip’s ability to spot situations when the locks established by one thread to prevent another overwriting its data are not actually necessary. In which case, the overhead of locking and subsequently unlocking the data can be removed by ignoring the locks - technique called ‘lock elision’. With this ability, coders can insert lock code safe in the knowledge that if it’s not actually needed, there will be no performance hit. And they can add more locks without over-complicating their code.

Haswell’s cores form only one small part of the chip’s die. A quad-core Haswell has a surface area of 177mm2, but only a third of that is taken up with those four cores. The remaining two thirds are split roughly half and half between the GPU, and the system logic and caches. Together all these elements comprise 1.4 billion transistors.

Graphics

Haswell’s graphics core comprises generic front-end and back-end, with one or more “slices” in between, each of which contains eight execution units and associated caches and such. It’s the same architecture as Ivy Bridge, but expanded with a greater number of execution units and the addition of a new processing engine - a “Resource Streamer” in the jargon - to do a lot of the set-up work the CPU cores would once have handled. This increases the independence of the GPU, which is running on a separate clock from the cores, don’t forget. It can do more work without requiring the cores to be clocked up. The front end has been beefed up to keep extra slices pumped with data.

Slices are independently power gated, by the way, so they can be shut down if they’re not needed.

Intel currently has three Haswell GPU Variants: the HD 4600, HD 5000, Iris 5100 and Iris Pro 5200. The first contains a single slice; the rest have a second slice, essentially doubling the (undisclosed) number of execution units in the GPU. Slices work on individual groups of pixels on the screen, says Intel.

Haswell GPU

Slicing pixels: Haswell’s GPU architecture

The Iris Pro configuration will come in versions of Haswell that incorporate embedded RAM. The memory is in the chip package but not on the die, and Intel is not saying how much of it there will be. But it does say that the RAM cache is equally accessible to the chip’s cores and GPU through a low latency, high throughput connection. Intel engineers even claim it “enables discrete-class graphics performance”. No wonder Apple is rumoured to be particularly interested in the technology for future MacBook Airs and Pros.

Even without the extra RAM, Haswell’s GPU will be able to support three displays simultaneously through a mix of DisplayPort, HDMI and VGA external links and internal connections to a laptop’s LCD. Resolutions of up to 4096 x 2304 pixels at 24Hz are supported, but you’ll be able to do the slightly lesser 3840 x 2160 at 60Hz, which is 4K x 2K. On the API side, Haswell will support DirectX 11.1, OpenCL 1.2 and OpenGL 4.0.

Movie watchers may appreciate Haswell’s support not only for H.264’s Scalable Video Coding (SVC) feature - no H.265, though - and 4K x 2K. ®

Secure remote control for conventional and virtual desktops

More from The Register

next story
Nexus 7 fandroids tell of salty taste after sucking on Google's Lollipop
Web giant looking into why version 5.0 of Android is crippling older slabs
All aboard the Poo Bus! Ding ding, route Number Two departing
Only another three days of pooing and I can have a ride!
Heyyy! NICE e-bracelet you've got there ... SHAME if someone were to SUBPOENA it
Court pops open cans of worms and whup-ass in Fitbit case
Official: European members prefer to fondle Apple iPads
Only 7 of 50 parliamentarians plump for Samsung Galaxy S
Fujitsu CTO: We'll be 3D-printing tech execs in 15 years
Fleshy techie disses network neutrality, helmet-less motorcyclists
Space Commanders rebel as Elite:Dangerous kills offline mode
Frontier cops an epic kicking in its own forums ahead of December revival
The IT Crowd's internet in a box gets $240k of crowdcash for a cause
'Outernet' project proposes satellite-fuelled 'Lantern' WiFi library for remote areas
prev story

Whitepapers

Choosing cloud Backup services
Demystify how you can address your data protection needs in your small- to medium-sized business and select the best online backup service to meet your needs.
Getting started with customer-focused identity management
Learn why identity is a fundamental requirement to digital growth, and how without it there is no way to identify and engage customers in a meaningful way.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.
Choosing a cloud hosting partner with confidence
Download Choosing a Cloud Hosting Provider with Confidence to learn more about cloud computing - the new opportunities and new security challenges.
Intelligent flash storage arrays
Tegile Intelligent Storage Arrays with IntelliFlash helps IT boost storage utilization and effciency while delivering unmatched storage savings and performance.