AMD, Oracle tag-team on GPU acceleration for Java apps

OpenJDK meets OpenCL with Project Sumatra

OpenWorld 2012 The offloading of calculations from CPUs to external accelerators such as GPU coprocessors is not restricted to supercomputer applications. Anything with lots of calculation that can exploit parallelism is a candidate for acceleration, and that means Java applications, not just Fortran or C++ code.

There are a number of different ways that Java applications and the Java virtual machine can be tweaked to exploit the parallelism inherent in GPU coprocessors, such as those based on FirePro GPUs from Advanced Micro Devices or Tesla GPUs from Nvidia, and potentially parallel x86 Xeon Phi coprocessors from Intel.

And as part of the new Project Sumatra, announced today at the JavaOne community event hosted by Oracle in San Francisco, Larry & Company is teaming up with AMD to put the offload functionality inside the Java Virtual Machine itself, rather than using the two-step conversion and dispatch process that AMD has worked on until now with its own Project Aparapi.

Gary Frost, the technical lead at AMD for Project Aparapi, explained to El Reg in early 2010 that the company wanted to make it easier for Java developers to take advantage of the enormous calculation capabilities of GPUs without having to become OpenCL programmers themselves.

Coincidentally, the Aparapi project was founded just after Oracle had bought Sun Microsystems and had taken control of the stewardship of the Java programming language. The source code for Aparapi was open sourced in September 2011, and as Frost explained it at the time, offloading code to a GPU using OpenCL was not a natural act at all.

"At the time we were beginning to see Java bindings for OpenCL and CUDA (JOCL, JOpenCL and JCUDA), but most of these provided JNI wrappers around the original OpenCL or CUDA C based APIs and tended to force Java developers to do very un-Java-like things to their code," Frost wrote.

"Furthermore, coding a simple data parallel code fragment using these bindings involved creating a Kernel (in a somewhat alien C99 based syntax; exposing pointers, vector types and scary memory models) and then writing a slew of Java code to initialize the device, create data buffers, compile the OpenCL code, bind arguments to the compiled code, explicitly send buffers to the device, execute the code, and explicitly transfer buffers back again."

Performance boost

You program in Java to get away from all that hardware, so it kind of defeats the purpose. Project Aparapi put hints in the application to mark where data parallelism exists, and then took the Java bytecodes and converted them at runtime to OpenCL routines so they could automagically be dispatched to an AMD or Nvidia GPU that was speaking OpenCL.
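To make the pattern concrete, here is a minimal sketch of the programming shape described above: the data-parallel body lives in one method that is called once per element index. The `MiniKernel` class and its `run` and `execute` methods are hypothetical stand-ins invented for this illustration and are not the Aparapi API; in this sketch the "device" is just a loop on the CPU, whereas a real offload layer would translate the body of `run` to OpenCL at runtime.

```java
import java.util.Arrays;

class KernelSketch {
    // Hypothetical stand-in for a kernel base class; NOT the Aparapi API.
    static abstract class MiniKernel {
        // The data-parallel body: invoked once for each index in the range.
        abstract void run(int globalId);

        // Assumption for this sketch: "execution" is a plain sequential CPU loop.
        // An offload layer would instead convert run() and dispatch it to a GPU.
        void execute(int range) {
            for (int id = 0; id < range; id++) {
                run(id);
            }
        }
    }

    // Element-wise vector addition written in the kernel style.
    static float[] add(final float[] a, final float[] b) {
        final float[] sum = new float[a.length];
        new MiniKernel() {
            @Override
            void run(int globalId) {
                sum[globalId] = a[globalId] + b[globalId];
            }
        }.execute(sum.length);
        return sum;
    }

    public static void main(String[] args) {
        float[] result = add(new float[] {1f, 2f, 3f}, new float[] {10f, 20f, 30f});
        System.out.println(Arrays.toString(result)); // prints [11.0, 22.0, 33.0]
    }
}
```

The point of the shape is that each invocation of `run` touches only its own index, which is what makes the loop safe to hand to many GPU threads at once.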

Pacific Northwest National Labs, one of the big US Department of Energy supercomputer facilities, was able to get on the order of a 60X performance boost on certain Java codes when a GPU was present, so the benefits were pretty substantial.

With Project Sumatra, Oracle and AMD want to do away with having an external library and conversion process between Java and OpenCL, Frost tells El Reg. Instead, the idea is to take advantage of the data structures within the OpenJDK implementation of the Java tools and let the Java virtual machine generate and compile the OpenCL code itself based on hints in the code.

This is precisely how CUDA tells compilers when they might be able to exploit parallelism for Tesla GPU coprocessors, and Intel's compilers do the same for its Xeon Phi coprocessors when they are compiling Fortran or C++ applications on CPUs.

If not now, when?

Project Sumatra begins the process of having Oracle, AMD, and other interested Java contributors figure out how this might be accomplished, and at what point in the OpenJDK release schedule. This is not something that is determined by AMD, which is committing programmers and any smarts and code it got from Project Aparapi to the cause.

The company most wants to ensure that its on-chip and discrete GPUs are able to accelerate Java applications, and it is particularly interesting to contemplate "Llano" and "Trinity" Fusion APU chips being plunked into low-powered Java servers. But ultimately, the whole point is to make this transparent to users.

"The HotSpot compiler will now have the capability to compile code for the GPU," explains Frost. "We don't have to target a particular device because the JVM is making the decision at runtime."

If there isn't a coprocessor present that can accelerate the code, then the JVM knows to throw it at the CPU. The beauty is that you don't have to keep two different sets of code or do bytecode conversions. Well, that's the theory of Project Sumatra. The code is not even started yet, much less done.
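The fallback idea can be sketched in ordinary Java, purely as an illustration. Everything here is an assumption made for the example: the `Device` interface, the `CpuDevice` class, and the `pick` method are hypothetical names, and the real decision would be made inside the JVM rather than in application code. The sketch only shows the shape of the logic: one piece of code, with the execution target chosen at runtime.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.function.IntUnaryOperator;

class DispatchSketch {
    // Hypothetical abstraction over "somewhere the same code can run".
    interface Device {
        String name();
        int[] map(int[] in, IntUnaryOperator f);
    }

    // The always-available fallback: run the function on the CPU.
    static final class CpuDevice implements Device {
        public String name() {
            return "cpu";
        }

        public int[] map(int[] in, IntUnaryOperator f) {
            int[] out = new int[in.length];
            for (int i = 0; i < in.length; i++) {
                out[i] = f.applyAsInt(in[i]);
            }
            return out;
        }
    }

    // Use the accelerator if one was detected, otherwise fall back to the CPU.
    static Device pick(Optional<Device> accelerator) {
        return accelerator.orElseGet(CpuDevice::new);
    }

    public static void main(String[] args) {
        // No accelerator detected in this run, so the CPU path is chosen.
        Device device = pick(Optional.empty());
        int[] doubled = device.map(new int[] {1, 2, 3}, x -> x * 2);
        System.out.println(device.name() + " " + Arrays.toString(doubled)); // prints cpu [2, 4, 6]
    }
}
```

The caller writes the function once and never names a target, which is the transparency the project is after.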

The data parallelism hints that will be added to Java, which are being developed under Project Lambda for multicore central processors, are expected to be used to extend parallelism out to GPUs (and maybe Xeon Phis) through OpenJDK.
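As a rough sketch of what such a hint looks like to the programmer, the example below uses the lambda and parallel stream syntax as it later shipped in Java 8. The class and method names are made up for the illustration. Nothing in the code says where it runs: `parallel()` only declares that the iterations are independent, and whether a given JVM spreads them over CPU cores or hands them to a GPU is left to the runtime and is not something this code determines.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class LambdaHintSketch {
    // Squares every element; each index writes only its own slot in out,
    // so the iterations are independent and safe to run in parallel.
    static float[] squares(float[] in) {
        float[] out = new float[in.length];
        IntStream.range(0, in.length)
                 .parallel()                      // the data-parallelism hint
                 .forEach(i -> out[i] = in[i] * in[i]);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(new float[] {1f, 2f, 3f}))); // prints [1.0, 4.0, 9.0]
    }
}
```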

Java 8 is expected around the middle of next year, according to Frost, and so this GPU offload functionality will probably not make it there. But it could come out with Java 9, or be an update to Java 8 at some point in between. That's really up to the OpenJDK community, which is working on a completely open source and GPL-licensed implementation of Java. ®
