Open source R in commercial Revolution

Red Hat for stats

Put on your eye patch and get out your parrot. The open source R programming language for statistical analysis and graphics is getting a commercial sponsor. What Red Hat did for Linux, Revolution Analytics wants to do for R, and it wants to use the open source subscription model to take on SAS Institute, SPSS (now part of IBM), and others who have been the market leaders (in terms of money) for statistical analysis for several decades.

While IT shops don't know about R, plenty of people have been using it for more than a decade to do statistical predictive analysis against all kinds of data sets and produce graphics for that analysis in a wide range of fields, including quants in financial services companies and researchers in pharmaceutical companies trying to sift new drugs from countless possibilities.

The R language was created in 1996 by Ross Ihaka and Robert Gentleman, two stat professors from the University of Auckland in New Zealand who are still core members of the R development team. In January 2008, Intel Capital kicked an undisclosed amount of money into Revolution's kitty to jump-start the effort to commercialize R, which has over 2,500 plug-ins to cover all kinds of data sets and statistical analysis techniques peculiar to different industries. Last October, North Bridge Venture Partners and Intel Capital put another $9m in the war chest for Revolution, and the company hired Norman Nie, one of the co-founders of SPSS back in 1967 and a designer of its predictive analytics software, to be its chief executive officer.

David Champagne, who was the principal architect and engineer at SPSS, is chief technology officer at Revolution, and David Smith, who is a statistician with a degree from the University of Adelaide, South Australia, is the head of marketing at the company. Smith worked on the closed-source S statistics programming language (now owned by Tibco Software) and literally wrote the book on how to use its open source offspring, R. ("Offspring" in the sense that Linux is a kind of open source Unix without the high price tag, but different enough not to be compatible.)

According to Smith, there are approximately 2 million people who use R. "Anybody who studies statistics uses R in their research," says Smith. That user base includes loads of students and academics as well as researchers across all manner of industries. The quants at financial services companies have taken a particular shine to R, and not just because they are cheap.

Jeff Erhardt, who was a heavy R user when he worked at chip makers Advanced Micro Devices and Spansion and who is chief operating officer at Revolution, says that universities are not teaching SAS and SPSS any more. They are teaching R instead, much as Linux has displaced proprietary and Unix operating systems in computer science programs.

Revolution Analytics got its start in 2007 as a spinout from a Yale University incubator. For its first two years, the company (which was called Revolution Computing back then) focused on creating a parallel implementation of R, called ParallelR, and selling services for that tweaked version. With the second round of funding, new management was brought in, R co-founder Gentleman was added to the board, and the idea became to offer a full R stack with commercial support, just like Red Hat offers a full Linux stack and makes its money on support subscriptions.

The marketing tactics for R Enterprise will be much the same as those used to pit Linux against Unix and proprietary operating systems. Smith says that the commercial-grade support for the R Enterprise stack will be available in a workstation version that costs $2,000, and the parallel version to run on servers will cost $10,000 for each two-socket server in a cluster. That may seem like a lot of dough for a stat and graphics package, but Smith says this is well below half the cost of similar functionality for SPSS or SAS packages.

Revolution is going to do more than certify applications and set up a tech support line to justify that money. Smith says that there are a number of problems with R that need to be addressed to help it go more mainstream. For one thing, he says that while R has a number of different graphical interfaces available, it is still fundamentally driven through a command line interface.

The R engine also does not scale well because it is memory bound and therefore can only work on relatively small data sets. And it has not had a corporate focal point for development. So Revolution is positioning itself to be that focus, and it will be putting out a development roadmap that includes a thin client interface and a "big data" engine that offers many orders of magnitude more speed as well as the ability to chew on terabyte-sized data sets.
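
To give a rough idea of what getting around a memory-bound engine can look like, here is a minimal sketch in Python (not R, and not Revolution's actual engine, whose design the company has not described here). It assumes the data arrives as a stream and keeps only one chunk in memory at a time, folding each chunk's count, mean, and sum of squared deviations into a running summary. The function names and the chunk size are hypothetical, chosen purely for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` values from any iterable."""
    block: List[float] = []
    for v in values:
        block.append(v)
        if len(block) == size:
            yield block
            block = []
    if block:
        yield block


def streaming_mean_var(
    values: Iterable[float], chunk_size: int = 1000
) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance), holding one chunk in memory."""
    n, mean, m2 = 0, 0.0, 0.0
    for block in chunks(values, chunk_size):
        # Summarize the chunk on its own.
        bn = len(block)
        bmean = sum(block) / bn
        bm2 = sum((x - bmean) ** 2 for x in block)
        # Merge the chunk summary into the running summary using the
        # standard update for combining two (count, mean, M2) triples.
        delta = bmean - mean
        total = n + bn
        mean += delta * bn / total
        m2 += bm2 + delta * delta * n * bn / total
        n = total
    var = m2 / (n - 1) if n > 1 else 0.0
    return n, mean, var
```

Calling `streaming_mean_var(source, chunk_size=100_000)` on any iterable, such as a generator reading numbers from a large file, returns the same count, mean, and variance an in-memory calculation would, without ever loading the whole data set. The same summarize-then-merge shape is also what lets such a calculation be split across servers in a cluster.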

Open source purists probably won't be all too happy to learn that Revolution is going to be employing an "open core" strategy, which means the core R programs will remain open source and be given tech support under a license model, but the key add-ons that make R more scalable will be closed source and sold under a separate license fee. Because most of those 2,500 add-ons for R were built by academics and Revolution wants to supplant SPSS and SAS as the tools used by students, Revolution will be giving the full single-user version of the R Enterprise stack away for free to academics. ®
