Open source R in commercial Revolution

Red Hat for stats

Put on your eye patch and get out your parrot. The open source R programming language for statistical analysis and graphics is getting a commercial sponsor. What Red Hat did for Linux, Revolution Analytics wants to do for R, and it wants to use the open source subscription model to take on SAS Institute, SPSS (now part of IBM), and others who have been the market leaders (in terms of money) for statistical analysis for several decades.

While IT shops may not know about R, plenty of people have been using it for more than a decade to run predictive statistical analysis against all kinds of data sets and to produce graphics for that analysis. Its users work in a wide range of fields, and include quants at financial services companies and researchers at pharmaceutical companies trying to sift new drugs from countless possibilities.

The R language was created in 1996 by Ross Ihaka and Robert Gentleman, two stat professors from the University of Auckland in New Zealand who are still core members of the R development team. In January 2008, Intel Capital kicked an undisclosed amount of money into Revolution's kitty to kick-start the effort to commercialize R, which has over 2,500 plug-ins to cover all kinds of data sets and statistical analysis techniques peculiar to different industries. Last October, North Bridge Venture Partners and Intel Capital put another $9m in the war chest for Revolution and brought in Norman Nie, one of the co-founders of SPSS back in 1967 and a designer of its predictive analytics software, to be the company's chief executive officer.

David Champagne, who was the principal architect and engineer at SPSS, is chief technology officer at Revolution, and David Smith, a statistician with a degree from the University of Adelaide, South Australia, is the head of marketing at the company. Smith worked on the closed-source S statistics programming language (now owned by Tibco Software) and literally wrote the book on how to use its open source offspring, R. ("Offspring" in the sense that Linux is a kind of open source Unix without the high price tag, but different enough not to be compatible.)

According to Smith, there are approximately 2 million people who use R. "Anybody who studies statistics uses R in their research," says Smith. That user base includes loads of students and academics as well as researchers across all manner of industries. The quants at financial services companies have taken a particular shine to R, and not just because they are cheap.

Jeff Erhardt, who was a heavy R user when he worked at chip makers Advanced Micro Devices and Spansion and who is chief operating officer at Revolution, says that universities are not teaching SAS and SPSS any more. They are using R instead, just as Linux has displaced proprietary and Unix operating systems in computer science programs.

Revolution Analytics got its start in 2007 as a spinout from a Yale University incubator. For its first two years, the company (which was called Revolution Computing back then) focused on creating a parallel implementation of R, called ParallelR, and selling services for that tweaked version. With the second round of funding, new management was brought in, R co-creator Gentleman was added to the board, and the plan became to offer a full R stack with commercial support, just like Red Hat offers a full Linux stack and makes its money on support subscriptions.

The marketing tactics for R Enterprise will be much the same as those that pitted Linux against Unix and proprietary operating systems. Smith says that the commercial-grade support for the R Enterprise stack will be available in a workstation version that costs $2,000, and the parallel version to run on servers will cost $10,000 for each two-socket server in a cluster. That may seem like a lot of dough for a stat and graphics package, but Smith says this is well below half the cost of similar functionality for SPSS or SAS packages.

Revolution is going to do more than certify applications and set up a tech support line to justify that money. Smith says that there are a number of problems with R that need to be addressed to help it go more mainstream. For one thing, he says that while R has a number of different graphical interfaces available, it is still fundamentally driven through a command line interface.

The R engine also does not scale well because it is memory bound and therefore can only work on relatively small data sets. And R has not had a corporate focal point for development. So Revolution is positioning itself to be that focus, and it will be putting out a development roadmap that includes a thin client interface and a "big data" engine that promises to be many orders of magnitude faster and to be able to chew on terabyte-sized data sets.
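To make the "memory bound" point concrete, here is a minimal sketch in Python of the general idea behind working on data that does not fit in memory: read it a chunk at a time and merge summary statistics as you go. This is an illustration only, not a description of Revolution's engine, whose design is not disclosed here. The function names `chunks` and `streaming_mean_var` and the default chunk size are assumptions made up for the example.

```python
from typing import Iterable, Iterator, List, Tuple


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items from `values`."""
    batch: List[float] = []
    for v in values:
        batch.append(v)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def streaming_mean_var(
    values: Iterable[float], chunk_size: int = 1000
) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance), holding one chunk in memory at a time."""
    n, mean, m2 = 0, 0.0, 0.0  # m2 is the running sum of squared deviations
    for batch in chunks(values, chunk_size):
        # Summarize the chunk on its own.
        k = len(batch)
        b_mean = sum(batch) / k
        b_m2 = sum((x - b_mean) ** 2 for x in batch)
        # Merge the chunk summary into the running summary.
        delta = b_mean - mean
        total = n + k
        mean += delta * k / total
        m2 += b_m2 + delta * delta * n * k / total
        n = total
    variance = m2 / (n - 1) if n > 1 else 0.0
    return n, mean, variance


if __name__ == "__main__":
    # A generator stands in for a file too large to load at once.
    count, mean, variance = streaming_mean_var((float(i) for i in range(1, 11)), chunk_size=3)
    print(count, mean, variance)
```

Only one chunk plus three running numbers are ever held at once, so memory use does not grow with the size of the data set. The same merge step would also let separate machines summarize separate chunks and combine the results afterwards.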

Open source purists probably won't be too happy to learn that Revolution is going to be employing an "open core" strategy, which means the core R programs will remain open source and be given tech support under a license model, but the key add-ons that make R more scalable will be closed source and sold under a separate license fee. Because most of those 2,500 add-ons for R were built by academics and Revolution wants to supplant SPSS and SAS as the tools used by students, Revolution will be giving the full single-user version of the R Enterprise stack away for free to academics. ®
