Revolution Analytics paints R stats Azure blue

Gooses performance, spans HPC clusters with 6.0 update

Revolution Analytics, aka "Red Hat for stats" – which commercialized the open source R programming language and statistical analysis tool – has now tweaked its R Enterprise stack and pushed out a 6.0 release.

The new R Enterprise 6.0 is based on the R 2.14.2 engine, which is the latest stable release of the open source code, according to David Smith, head of marketing at the company. This code was released on February 29, with the 2.15.0 update just coming out on May 30 and not quite ready for inclusion in R Enterprise. You can see the full release notes for R 2.14.2 here.

The big new feature in this edition of the R engine is a byte compiler. Much as Java source is compiled to byte codes for a Java virtual machine, the byte compiler translates interpreted R code into an intermediate form before it executes in the R engine, which can speed up the work done by the R interpreter by around 30 per cent, according to Smith. The byte compiler has no effect on the number-crunching itself, since that is done not by the interpreter but by another part of the engine. So the performance improvement you see from the new R engine will depend on how much of your statistical algorithm's run time is spent in interpreted code.
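How much of that 30 per cent reaches the whole job can be estimated with an Amdahl-style back-of-envelope calculation. The sketch below is illustrative only: it reads "around 30 per cent" as a 30 per cent cut in interpreter time (an assumption, since the article does not say how the figure was measured), and the function name `overall_speedup` and the sample fractions are made up for the example.

```python
def overall_speedup(interp_fraction, interp_time_cut=0.30):
    """Estimate whole-job speedup when only interpreter time shrinks.

    interp_fraction: share of run time spent in interpreted code (0..1).
    interp_time_cut: assumed reduction in interpreter time (0.30 = 30%).
    """
    if not 0.0 <= interp_fraction <= 1.0:
        raise ValueError("interp_fraction must be between 0 and 1")
    # Number-crunching time is unchanged; interpreter time is scaled down.
    remaining = (1.0 - interp_fraction) + interp_fraction * (1.0 - interp_time_cut)
    return 1.0 / remaining


# Hypothetical workloads: mostly number-crunching, mixed, mostly interpreted.
for fraction in (0.1, 0.5, 0.9):
    print(f"{fraction:.0%} interpreted -> {overall_speedup(fraction):.2f}x overall")
```

Under these assumptions, a job that spends most of its time in compiled numerical routines gains very little, while loop-heavy interpreted code gets close to the full benefit.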

The update also adds support for Generalized Linear Models, or GLMs in stat speak. These include logistic (binomial), Poisson, gamma, and Tweedie models, all of which are supported with a high-performance C++ implementation, according to Smith.

Another new feature in R Enterprise 6.0 is integration with IBM's new Platform LSF V8.3 scheduler for HPC grids, which allows R routines to be parallelized and run on a cluster of x86 iron. Put the two together – GLMs and HPC clusters – and you can get a significant speedup in many cases, according to Smith.
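To see why GLMs and clusters go together, it helps to note that a GLM fit can be driven entirely by sums over the data, and sums can be computed chunk by chunk. The sketch below fits a one-predictor Poisson model with iteratively reweighted least squares (IRLS), a standard textbook method. It is not Revolution Analytics' C++ implementation, which the article does not describe, and the names `chunk_sums` and `fit_poisson` and the toy data are hypothetical.

```python
import math


def chunk_sums(xs, ys, b0, b1):
    """Weighted least-squares sums for one chunk at the current coefficients."""
    sw = swx = swxx = swz = swxz = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x            # linear predictor
        mu = math.exp(eta)           # Poisson mean under the log link
        w = mu                       # IRLS weight for Poisson with log link
        z = eta + (y - mu) / mu      # working response
        sw += w
        swx += w * x
        swxx += w * x * x
        swz += w * z
        swxz += w * x * z
    return (sw, swx, swxx, swz, swxz)


def fit_poisson(chunks, iterations=25):
    """Fit log(mean) = b0 + b1 * x from a list of (xs, ys) chunks."""
    total_y = sum(sum(ys) for _, ys in chunks)
    n = sum(len(ys) for _, ys in chunks)
    b0, b1 = math.log(total_y / n), 0.0   # start from the intercept-only fit
    for _ in range(iterations):
        # Each chunk_sums call touches only its own chunk, so in principle
        # every chunk could be handled by a different cluster node.
        parts = [chunk_sums(xs, ys, b0, b1) for xs, ys in chunks]
        sw, swx, swxx, swz, swxz = (sum(col) for col in zip(*parts))
        # Solve the 2x2 weighted normal equations for the new coefficients.
        det = sw * swxx - swx * swx
        b0, b1 = ((swxx * swz - swx * swxz) / det,
                  (sw * swxz - swx * swz) / det)
    return b0, b1


# Toy data generated exactly from log(mean) = 0.5 + 0.3 * x, split in two chunks.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [math.exp(0.5 + 0.3 * x) for x in xs]
chunks = [(xs[:3], ys[:3]), (xs[3:], ys[3:])]
b0, b1 = fit_poisson(chunks)
print(round(b0, 6), round(b1, 6))
```

On this noise-free toy data the fit should recover coefficients of roughly 0.5 and 0.3, and splitting the data into one chunk or several should make no practical difference to the answer. Only five numbers per chunk travel back to the coordinator on each iteration, which is what makes this kind of algorithm a natural fit for a scheduler that farms work out to nodes.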

One insurance company was running a Tweedie distribution analysis against 30 million claims using the SAS stats package on a big SMP server, and the job took eight hours. During beta tests, this customer fired up R Enterprise on an eight-node x86 cluster and used Platform LSF to dispatch work to the nodes from a workstation, with both the workstation and the nodes running R Enterprise, and the job finished in 10 minutes. While the shorter run time is important in its own right, what is perhaps more important is that a job this fast lets analysts iterate on their models quickly and improve them. You need to be running Red Hat Enterprise Linux and R Enterprise on the server nodes if you want Platform LSF to dispatch work in parallel to them.
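For a sense of scale, the quoted run times work out as follows. This is plain arithmetic on the figures above; the article gives no hardware details for either system, so the per-node figure is only a rough indicator, not a like-for-like comparison.

```python
sas_minutes = 8 * 60        # eight-hour SAS run on the SMP server
cluster_minutes = 10        # R Enterprise run on the x86 cluster
nodes = 8                   # cluster size quoted for the beta test

speedup = sas_minutes / cluster_minutes   # overall ratio of run times
per_node = speedup / nodes                # ratio left after dividing by node count

print(f"overall speedup: {speedup:.0f}x")
print(f"speedup per node: {per_node:.0f}x")
```

A per-node ratio above one suggests that the gain does not come from the node count alone, and that differences in software or hardware between the two setups account for part of it.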

The 6.0 release can also read SAS and SPSS native file formats directly, as well as raw ASCII text data and information pulled out of relational databases over ODBC. In the past, R Enterprise had to convert this data to its own XDF NoSQL-like data store, and on data that is constantly changing, this reformatting is a pain in the neck. Now you can just use the native data sets and, perhaps more importantly, not have to worry about having a license for SAS or SPSS if you have moved off those platforms to open-core R Enterprise tools.

Revolution Analytics already supported running its R stack inside Amazon's EC2 compute cloud, and Smith tells El Reg that the company has "quite a few" customers that run R Enterprise in the cloud, and that all of the proofs of concept that Revolution Analytics does with customers run on EC2 as well.

But not everyone uses EC2. Some people use Microsoft's Azure cloud, and starting with R Enterprise 6.0, you can now fire up R instances on the Azure cloud. At the moment, Revolution Analytics is only supporting the bursting features of Azure, which allow you to dispatch work from inside your firewall to the Microsoft cloud. You cannot run R Enterprise in a standalone fashion on Azure, and you have to have at least one server node running Microsoft's Windows HPC Server to dispatch R work to Azure. You can have more nodes than that in the local cluster, of course, but you need at least one. And this bursting function, which has been in beta testing for the past four months, does not work with earlier releases of R Enterprise.

R Enterprise comes in two flavors: workstation and server. The workstation edition, which is designed for a single user on a single workstation PC, costs $1,000 per machine per year for a license. The server edition, which can be used by an unlimited number of end users firing work at the cluster or cloud, costs $30,000 per year for an eight-core x86 server. ®
