'Red Hat for stats' goes toe-to-toe with SAS

Analyze this

Revolution Analytics – the company that launched last year to be the "Red Hat for stats", providing an extended version of the open source R programming language and runtime – is going directly after analytics juggernaut SAS Institute with its latest release: R Enterprise 4.2.

With the updated release, Revolution's R Enterprise can read and write data in the SAS File Format, the proprietary data format that SAS tools have used for decades. Jeff Erhardt, who was a heavy R user when he worked at chip makers Advanced Micro Devices and Spansion and who has been chief operating officer at Revolution since the company came out with its open core version of R last June, says that the file format has allowed SAS to effectively lock customers into using its tools.

Because R Enterprise supports the SAS file format natively, customers can keep their data in the format they are accustomed to and still add new users to do new analytics work using the R Enterprise tool, bypassing the need to buy more SAS licenses.

Customers who move to the R Enterprise tool can also convert the SAS files to XDF, a binary "big data" format created by the company that is loosely based on NoSQL database principles. The XDF format was announced last August with R Enterprise 4.0. That 4.0 code also had better multithreading capabilities than earlier releases of the R Enterprise engine, allowing it to make better use of processor cores and threads than the open source version of the R engine.

The update last summer also included an RPC-based clustering feature that links multiple servers together to parallelize the R engine and speed up data crunching for analytics work. "You'll be able to get the same results in SAS," boasts Erhardt, "but now we can do it much faster and at a fraction of the cost."
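To see why splitting data across workers speeds up this kind of job, here is a minimal Python sketch of a cross tabulation computed chunk by chunk. It is not Revolution's code and says nothing about how XDF or the RPC clustering is actually built; the function names and sample data are hypothetical, and local threads merely stand in for separate servers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def crosstab_chunk(rows):
    """Tally (row_category, column_category) pairs for one chunk of data."""
    return Counter((r, c) for r, c in rows)


def crosstab_parallel(chunks, max_workers=4):
    """Tally each chunk independently, then add the partial tables together."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(crosstab_chunk, chunks):
            total.update(partial)  # Counter.update adds counts
    return total


# Hypothetical sample data, split into two chunks as if stored in blocks.
chunks = [
    [("x64", "linux"), ("x64", "windows"), ("itanium", "hp-ux")],
    [("x64", "linux"), ("x64", "linux")],
]
table = crosstab_parallel(chunks)
print(table[("x64", "linux")])
```

Because the partial tables combine by simple addition, no worker needs to see another worker's data, which is the property that lets this kind of tally spread across cores or machines.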

Erhardt says that in tests Revolution Analytics has run pitting its souped-up R engine (the components of which are not available as open source code) against the SAS tools on linear regressions, logistic regressions, and cross tabulations, supported R Enterprise licenses cost no more than half what SAS would cost and delivered about twice the speed on a given set of hardware. In some cases, Erhardt claims, the performance difference in favor of the R Enterprise engine can be an order of magnitude, and the engine runs better on commodity x64 iron.
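Taken at face value, those two claims compound. The sketch below works through the arithmetic, with SAS normalized to 1.0 for both cost and speed; the ratios are the vendor's own figures as reported here, not independent measurements, and the function name is hypothetical.

```python
def price_performance_ratio(cost_ratio, speed_ratio):
    """Work done per dollar relative to a baseline normalized to 1.0.

    cost_ratio:  challenger's cost divided by the baseline's cost
    speed_ratio: challenger's speed divided by the baseline's speed
    """
    return speed_ratio / cost_ratio


# Vendor's typical claim: half the cost, twice the speed.
ratio = price_performance_ratio(cost_ratio=0.5, speed_ratio=2.0)

# Vendor's best-case claim: same cost ceiling, an order of magnitude faster.
best_case = price_performance_ratio(cost_ratio=0.5, speed_ratio=10.0)

print(ratio, best_case)
```

On the typical claim, that is four times the work per dollar; on the best-case claim, twenty times.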

SAS has been running its code on Hewlett-Packard's Neoview data warehouse appliance, which is based on Intel's Itanium processors and which HP euthanized two weeks ago. HP only had a few dozen customers using this Neoview appliance, which was based on the Integrity systems running the NonStop kernel and parallel database. That product was discontinued a week after HP and Microsoft launched database appliances based on HP's x64-based ProLiant servers and Microsoft's Windows Server 2008 and SQL Server 2008 database.

The core SAS 9.X Foundation tools can be deployed on AIX, HP-UX, Solaris, Linux, Windows, and OpenVMS. Thus far, HP, Microsoft, and SAS have not announced a new appliance based on the EDW or related Windows-based data warehousing appliances, but this could be in the works.

In any event, Revolution Analytics is going after SAS both directly, by allowing its R implementation to use SAS files, and indirectly, through a deal called the SAS to R Challenge. Under this deal, between now and March 31, Revolution Analytics will convert "representative SAS code" to R code free of charge and run it against SAS data to demonstrate "terabyte-class data analyses" and prove that its code is faster at crunching data than the SAS original.

The R Enterprise 4.2 update now offers its full feature set on both Windows and Linux. In the prior releases, the big data XDF format, which is wrapped up in a feature called RevoScale R, was only available on Windows boxes, and the Web services integration to hook data analytics into other applications was only available on Linux. R Enterprise runs on 32-bit and 64-bit Windows XP and 7 desktops as well as servers and on Red Hat Enterprise Linux 5; RHEL 6 is not yet supported.

R Enterprise 4.2 is available now, and costs $1,000 per workstation and $25,000 per server (that price is for a two-socket machine using six-core processors). The server version has the Web services and clustering features. Prior to the addition of the XDF big data option, the server version cost $15,000 per machine, so the NoSQL-ish format adds $10,000 to the price.
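For anyone sizing up a purchase, the list prices above add up as follows. This is a back-of-the-envelope sketch only: it assumes every server is the two-socket, six-core configuration the quoted price covers, it ignores any discounts or support terms, and the shop size in the example is hypothetical.

```python
WORKSTATION_PRICE = 1_000       # dollars per workstation, as quoted above
SERVER_PRICE = 25_000           # dollars per two-socket server, as quoted above
PREVIOUS_SERVER_PRICE = 15_000  # server price before the XDF option was added


def deployment_cost(workstations, servers):
    """List-price total for a mix of workstation and server licenses."""
    return workstations * WORKSTATION_PRICE + servers * SERVER_PRICE


# What the XDF big data option adds to each server license.
xdf_premium = SERVER_PRICE - PREVIOUS_SERVER_PRICE

# Hypothetical shop: ten analysts on workstations plus a two-server cluster.
example_total = deployment_cost(workstations=10, servers=2)

print(xdf_premium, example_total)
```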

There are over two million R users worldwide, and over 2,500 open source plug-ins have been created for the open source R engine by academics, quants, and others. There is a big installed base of users who want more scalability than the open source R can deliver, and now Revolution Analytics needs to get some big-time partners to help it push its wares against SAS and IBM. ®
