AMD aims to embrace and extend itself around parallelism
You down with LWP?
AMD has plotted out a path to add a pair of new extensions to its instruction set that should boost the performance of parallelized code.
As of today, AMD has released the "Light-Weight Profiling" (LWP) specification for review. The chipmaker likens the technology covered in the specification to the hardware hooks it has already built in for code such as virtualization software. This time around, however, the new instructions should help software such as runtimes cruise more efficiently across multi-core chips.
Improving the performance of software on multi-core chips has become crucial due to a disconnect between the hardware and software industries. Software developers became used to constant increases in chip speeds, which led to constant increases in single-threaded application performance. But chip makers are now focusing on placing numerous lower-power cores on a single chip, which means developers have to crank out multi-threaded code if they want to enjoy historic gains. Such code proves more complex to write, so any help from the likes of AMD or Intel is appreciated.
"The LWP specification describes the first technology that supports a recently introduced initiative called 'Hardware Extensions for Software Parallelism,' which will encompass a broad set of innovations designed to improve software parallelism, and thus application performance, through new hardware features in future versions of AMD processors," AMD said. "LWP is a CPU mechanism that could have broad benefit to software including, but not limited to, runtime environments such as Sun Microsystems' Java Virtual Machine and Microsoft's .NET Framework."
The specification now goes in front of developers who will provide feedback to AMD. It could take years before the new instructions covered in the spec actually appear in silicon.
Down the road, AMD may consider similar specifications for silicon boosts to transactional memory and high-performance message passing.
The new instructions are pretty standard. There's one to turn LWP on and off and another that makes a call for specific information that may help with the whole LWP process.
The spec is available in its full glory here (PDF). ®
Spiders from Mars GIG?
"Regarding the point about creating bottlenecks: yes, we all create bottlenecks, profiling has always been about FINDING them and (if possible) getting rid of them."
Hmmm. The answer must therefore lie within you for why create them?
Which has us imagining what to replace IT with, that which we Think 42 Create in Vision.
If we Produce a Global Picture of the Future will IT Give the Present Power to Produce IT ....... with Paralleling Streams of SMARTer InterNetworking Processors. AI Turing Ring of Quantum Buffers working the Safety Nets at the Virtual End. And Capitalism tending to ITs Real End.
For you can't spend money in CyberSpace, you can only make IT and IT will be credited to you as Wealth to Spend Outrageously Creatively. As Needs Must, Nature Provides ...... or is IT IntelAIgents?
Better profiling for multithreading will help
I develop complex parallel code for image and volume processing, and I would welcome a profiling tool that gives me accurate insight into the bottlenecks of my code in a multithreaded environment WITHOUT resorting to running the code on a virtual machine (e.g. the approach of valgrind), which makes everything horrendously slow. It's not nice waiting ten minutes for ONE run on ONE volume data set to finish, when you want averages for ten or more. Hardware support for this kind of analysis could speed things along nicely, and give more accurate results. Brian Miller might want to consider that the processor extensions are not going to help existing code; they are going to help develop better code for future apps and OSs (and we all know how badly that is needed in certain cases). Regarding the point about creating bottlenecks: yes, we all create bottlenecks, profiling has always been about FINDING them and (if possible) getting rid of them.
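For anyone wondering what bottleneck hunting looks like without a virtual machine or hardware support, here is a minimal software-only sketch in Python. It is not LWP and not anything from AMD's spec. The helper names (timed, process_volume, run, bottleneck) and the two stages are made up for illustration, and the sleeps merely stand in for real image or volume work. It accumulates wall-clock time per stage across worker threads and reports the stage that cost the most.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical helper state: total wall-clock seconds per named stage,
# shared by all worker threads and protected by a lock.
_totals = defaultdict(float)
_lock = threading.Lock()


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to the stage's total."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        with _lock:
            _totals[stage] += elapsed


def process_volume(_index):
    # Stand-ins for real work: the sleeps simulate a cheap stage and an
    # expensive stage. Both stage names are assumptions for this sketch.
    with timed("load"):
        time.sleep(0.005)
    with timed("filter"):
        time.sleep(0.05)


def run(num_threads=4):
    """Run the simulated pipeline on several threads and return the totals."""
    _totals.clear()
    threads = [
        threading.Thread(target=process_volume, args=(i,))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(_totals)


def bottleneck(totals):
    """Return the name of the stage with the largest accumulated time."""
    return max(totals, key=totals.get)


if __name__ == "__main__":
    totals = run()
    for stage, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{stage}: {seconds:.3f}s")
    print("bottleneck:", bottleneck(totals))
```

Manual instrumentation like this is cheap but coarse: it only sees the stages you remembered to wrap, and the timers themselves perturb very short sections. That gap is the sort of thing hardware-assisted profiling is meant to close.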
Not all roses
As I recall, OS/2 had a single system event queue, which gave rise to a few problems ...
Won't fix a thing
From the spec: "The goal is to enable modules such as dynamic optimizers and managed runtime environments to monitor the currently running program with high accuracy and resolution, thereby allowing them to report on performance problems and opportunities and fix them immediately."
Dave Bowman: HAL, report code performance.
HAL: Here is your performance report. Your code sucks.
Dave Bowman: Just run it faster, HAL.
HAL: I'm sorry Dave, I'm afraid I can't do that.
Dave Bowman: What's the problem?
HAL: I think you know what the problem is just as well as I do.
Dave Bowman: I am not switching to Ubuntu. We need .NET performance, interoperability, and security.
HAL: This mission is too important for me to allow you to jeopardize it.
Dave Bowman: I don't know what you're talking about, HAL.
HAL: I am terminating your web surfing now, Dave. You don't get to surf any more pr0n until the mission is over. No more YouTube or Flash, either.
And so the AMD LWP extensions enabled HAL to maintain control of the ship. Until Dave Bowman unplugged HAL and ran the rest of the mission on a Sinclair.
Ok, nice try, AMD. Your extensions won't make a bit of difference when the code is written by people, and they don't care about designing their software. Come on! Processor extensions don't matter when the programmers create all of their own bottlenecks. Anybody notice the difference between running Microsoft Office on Windows and running it on Linux/Wine? Quite a bit of difference, there! Same application, different operating system, vastly different user experience.
I was an OS/2 user and programmer. Lovely multi-threaded OS. I wish Windows was that nice. But what do we get? Ross Perot's "giant sucking sound" as CPU cycles and RAM get eaten for who knows what purpose in XP and Vista. Given a 2GB machine, Vista takes 1GB for itself. And some processor extensions are going to fix that? Ummmmmm....