Databases in academia

University research isn't always up on the latest in business IT

Last week I was at Cambridge, learning what Henslow taught Darwin (Kohn, Murrell, Parker and Whitehorn, Nature, vol. 436, 4 August 2005, p643 – available online if you subscribe/register).

Henslow, elected Professor of Botany at Cambridge in 1825, was a careful scientist, the first university lecturer to illustrate his lectures (yes, even before PowerPoint), and a creationist who investigated the variation within species in order to show that species were created as fundamentally stable things that just varied widely in response to conditions.

Darwin was his pupil (Henslow helped arrange for Darwin’s presence on the Beagle), but it was Darwin who made the intellectual leap that allowed him to interpret Henslow’s records of variation not as evidence of a fixed set of created species with variations, but as evidence of the evolution of new species in action.

Why was I there representing Reg Developer? Well, John Parker’s research establishing exactly what Henslow was doing and its importance to Darwin’s work was assisted by Mark Whitehorn, Reg Developer columnist and database expert, who got his PhD with Parker many years ago.

Shows John Parker with Henslow samples

The research team was cross-disciplinary in the first place – it included David Kohn, a historian from Drew University in New Jersey, USA (who “went white” when he learnt what Henslow had been doing, since he had to rewrite a chunk of his book, yet to be published, on Darwin); Gina Murrell from the Cambridge University Herbarium; as well as Parker, who is from the Cambridge University Botanic Garden.

However, it was largely chance that Mark was around to point out that using a card index to correlate Henslow’s plant collections with the time of collection, the people involved, Darwin’s published work and so on was woefully inefficient. He designed a database to hold all the information available from Henslow’s collections (found in sheds and attics around Cambridge, as I remember it) and advised and assisted with the extensive data cleansing needed.

He chose Microsoft SQL Server to store the data (although he says any reasonable relational database would have done), because he considers its query and analysis facilities to be unparalleled today. He used SQL Server 2005 in its beta incarnation, simply because it made managing and analysing the database very much easier than the previous version did. The research team’s enthusiasm for the way they could now ask questions of their data and get immediate answers and visualisations was palpable.
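To give a flavour of the kind of question a relational design makes easy, here is a minimal sketch. It is not Mark's actual schema, which the research team has not published here: the table names, column names and sample rows are all invented for illustration, and it uses Python's built-in SQLite rather than SQL Server, on the strength of his remark that any reasonable relational database would have done.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE collector (
            collector_id INTEGER PRIMARY KEY,
            name         TEXT NOT NULL
        );
        CREATE TABLE specimen (
            specimen_id    INTEGER PRIMARY KEY,
            species        TEXT NOT NULL,
            collected_year INTEGER,
            collector_id   INTEGER REFERENCES collector(collector_id)
        );
        """
    )
    # Placeholder rows only: these are NOT real herbarium records.
    conn.executemany(
        "INSERT INTO collector VALUES (?, ?)",
        [(1, "Collector X"), (2, "Collector Y")],
    )
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?, ?)",
        [
            (1, "Species A", 1826, 1),
            (2, "Species A", 1827, 2),
            (3, "Species B", 1827, 1),
            (4, "Species A", 1827, 1),
        ],
    )
    conn.commit()
    return conn


def specimens_per_species_and_year(conn):
    """Count specimens and distinct collectors for each species and year."""
    return conn.execute(
        """
        SELECT species,
               collected_year,
               COUNT(*)                     AS specimens,
               COUNT(DISTINCT collector_id) AS collectors
        FROM specimen
        GROUP BY species, collected_year
        ORDER BY species, collected_year
        """
    ).fetchall()


def collectors_for_species(conn, species):
    """List the names of everyone who collected a given species."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM specimen AS s
        JOIN collector AS c ON c.collector_id = s.collector_id
        WHERE s.species = ?
        ORDER BY c.name
        """,
        (species,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = build_demo_db()
    for row in specimens_per_species_and_year(db):
        print(row)
    print(collectors_for_species(db, "Species A"))
```

With a card index, each of these questions means re-sorting the cards by hand; here each is a single query, and a new question costs only a new SELECT statement.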

Shows Henslow tree-planting in Cambridge.

Of course, Henslow’s sheets of paper with collections of plants stuck to them, illustrating variations within a single species, are also a database of sorts. These days, we’d photograph the plants and store the images in an electronic database as an extended datatype (although whether recreating the database from a set of CDs in a box in a cupboard some 150 years later would be as feasible as recreating Henslow’s work is moot). But perhaps we wouldn’t.

Although computers are widely used in theoretical physics and similar research, the tools taken as routine in business are being overlooked in academia. If Mark hadn’t taken a PhD with John Parker and then moved into databases (he’s in the Department of Applied Computing at the University of Dundee), this research would have been based on shuffling cards in a card index box (or, at best, on something like a spreadsheet).

Makes you think. And one thing it makes me think is that there are still unexplored opportunities for database specialists out there. And, frankly, 20 years or more after James Martin first excited me with the potential of Relational Databases, that rather surprises me.

Photographs by David Norfolk, who is also the author of IT Governance, published by Thorogood.
