Oracle tucks R stats language into database

R-acle 11g, Quant Edition

Relational database juggernaut Oracle has embedded the R programming language used by more than 2 million statisticians and quants the world over into its 11g relational database. Call it R-acle 11g, Quant Edition.

R, of course, is the open source statistical analysis programming language and is also the name of the runtime engine for that language. R is a bit like the Red Hat for stats, with its main competitors being the closed source analytic tools from SAS Institute and IBM's SPSS unit, among others. The R language was created in 1996 by Ross Ihaka and Robert Gentleman, two stats professors from the University of Auckland in New Zealand.

Nearly two years ago, Revolution Analytics burst on the scene with an effort to commercialize R and its runtime engine, as well as to do proprietary extensions that allowed it to scale across bigger iron than the open source implementation. Since that time, Revolution Analytics has upgraded its Enterprise R so it can read and write data natively in the SAS file format and has parallelized R so it can run on the nodes in a Hadoop cluster, doing statistical analysis on each node's data sets and then reducing them down to a final answer.
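
The "analyze on each node, then reduce" pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Revolution Analytics' implementation: the `Partial` class, the function names and the sample data are all hypothetical, and the "nodes" are simply lists in one process.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Partial:
    """Per-node summary (hypothetical): count, sum, and sum of squares."""
    n: int
    total: float
    total_sq: float


def summarize(values: Iterable[float]) -> Partial:
    """'Map' step: a node summarizes only the data it holds."""
    vals = list(values)
    return Partial(len(vals), sum(vals), sum(v * v for v in vals))


def combine(a: Partial, b: Partial) -> Partial:
    """'Reduce' step: merge two summaries without revisiting the raw data."""
    return Partial(a.n + b.n, a.total + b.total, a.total_sq + b.total_sq)


def mean_and_variance(partials: Iterable[Partial]) -> tuple[float, float]:
    """Fold all per-node summaries into a global mean and population variance."""
    merged = reduce(combine, partials, Partial(0, 0.0, 0.0))
    mean = merged.total / merged.n
    variance = merged.total_sq / merged.n - mean * mean
    return mean, variance


# Three made-up "nodes", each holding a slice of the data set.
node_data = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = mean_and_variance(summarize(chunk) for chunk in node_data)
```

Only the three-number summaries travel between nodes, which is why this style of calculation spreads well across a cluster. The sum-of-squares formula is used here for brevity; it can lose precision on large or badly scaled data.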

Oracle is not doing anything like this, and it certainly is not rolling up its own distribution of R and providing tech support and tweaks to it, as it has done with Red Hat's Enterprise Linux operating system and the Xen hypervisor. That's not saying that Oracle won't ever make its own R-acle distribution someday, or even acquire Revolution Analytics, if it thinks its innovations with R are important enough to want to control.

What Oracle is doing is a bit simpler, and will nonetheless be useful for many Oracle database shops. Advanced Analytics, as the R tools are called, is a new option for the Oracle 11g R2 database.

In the past, Oracle sold a data mining suite as an add-on to its eponymous database, called Oracle Data Mining, for $23,000 per processor core. It had about a dozen data mining routines. The Advanced Analytics add-on that Oracle is now shipping is a superset of this code, and now includes a version of the R programming language and runtime. This is the open source version with no proprietary extensions, George Lumpkin, vice president of product development for data warehousing at Oracle, tells El Reg.

As it turns out, Oracle had already embedded a broad set of statistical algorithms, coded in SQL, inside of the Oracle 11g database. With the Advanced Analytics add-on, quants working from the R client on their desktops can run their analyses and, where possible, an R function will invoke one of these embedded SQL functions to do the same calculations on the data stored in the Oracle database.

For those stat algorithms that can't be invoked with SQL, Oracle has put an "embedded R" engine in the database tier and they run inside of this engine. This engine understands the parallel nature of Oracle RAC and Exadata database clusters and can chew on data across multiple nodes then present summary data back to the quant sitting at an R client console.
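
The split described above, SQL where an equivalent exists and a general-purpose engine otherwise, amounts to a simple dispatch. The Python sketch below illustrates that idea only and says nothing about how Oracle actually wires it up: SQLite stands in for the database, Python's `statistics` module stands in for the embedded R engine, and the `run_stat` function, the mapping tables and the `trades` data are all hypothetical.

```python
import sqlite3
import statistics

# Hypothetical mapping from a stats function name to a SQL aggregate
# that the database can evaluate next to the data.
SQL_AGGREGATES = {"mean": "AVG", "sum": "SUM", "count": "COUNT",
                  "min": "MIN", "max": "MAX"}

# Functions with no SQL equivalent in this sketch; they run in a
# general-purpose engine instead (plain Python as a stand-in).
FALLBACKS = {"median": statistics.median}


def run_stat(conn: sqlite3.Connection, func: str, table: str, column: str):
    """Return (result, path), where path records how the call was served.

    Table and column names are interpolated directly into the SQL text,
    so this sketch is only safe with trusted, hard-coded identifiers.
    """
    if func in SQL_AGGREGATES:
        # Push the calculation down: only the answer leaves the database.
        sql = f"SELECT {SQL_AGGREGATES[func]}({column}) FROM {table}"
        return conn.execute(sql).fetchone()[0], "sql"
    if func in FALLBACKS:
        # No SQL equivalent: hand the column to the fallback engine.
        rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
        return FALLBACKS[func]([row[0] for row in rows]), "fallback"
    raise ValueError(f"unsupported function: {func}")


# In-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (price REAL)")
conn.executemany("INSERT INTO trades VALUES (?)",
                 [(10.0,), (20.0,), (60.0,)])

mean_result = run_stat(conn, "mean", "trades", "price")
median_result = run_stat(conn, "median", "trades", "price")
```

The caller asks for `mean` or `median` the same way in both cases and never sees which path served the request, which is the transparency Lumpkin describes below.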

"What the statisticians want is to not know the database is there," says Lumpkin. "We are taking the scalability of the database and making it transparent."

Moreover, once you have statistical algorithms coded up in R, any program that runs against the Oracle database can invoke that code and run it as well. All you have to do is call it, and the R will come running.

R-iding an elephant

The Advanced Analytics add-on for Oracle 11g is not the only R product that Oracle is distributing and supporting. Its Big Data Appliance, launched back in October 2010 and more thoroughly fleshed out in January of this year, includes a little something called the R Connector for Hadoop, which has hooks to let R talk to the HDFS and NoSQL (BerkeleyDB) data stores that underpin the Cloudera CDH3 distribution Oracle is putting on its x86 server cluster (similar to but not the same as the Exadata database machine). The set of connectors, including the R connector, costs $2,000 per core used on the Hadoop platform.

Dave Rich, the new CEO at Revolution Analytics who just joined from the analytics unit of Accenture, didn't think the Oracle approach to R would have an adverse impact on his business. "There's plenty of room in the market, and if anything, it helps us," Rich tells El Reg. "It legitimizes R as enterprise-class, and raises all ships."

Rich added that many customers are leery of becoming a one-vendor shop and want alternatives. Oracle would argue just the opposite, as its engineered systems are designed to work best with an Oracle stack tuned to work better together than any alternatives that might plug into the stack.

Oracle, says Rich, had to add R functionality because IBM's Netezza and Teradata's eponymous appliances have it, and there is still a possibility that Oracle could partner with Revolution Analytics, much as it has with Cloudera for its Hadoop distro. ®
