Excel ate my DNA

Autoformatting black hole

Genetic research is being hampered by a smart formatting function in Excel, according to US researchers.

The problem, which can cause medically important genes to be hidden from view, is widespread, and has affected some public databases, including the gene expression data on the NCBI LocusLink database in the US, the researchers say.

Excel is widely used in genetic research to process microarray data. A microarray chip measures the amount of messenger RNA transcribed from thousands of different genes, enabling researchers to see which genes are being expressed in a sample of diseased tissue, for example.

The errors are introduced because some genetic identifiers look very much like dates to Excel. If the spreadsheet is not properly set up, it will convert an identifier such as SEPT2 to a date: 2-Sep. The conversion, the researchers say, is irreversible: once the error has been introduced, the original data is gone.

In a paper published on BioMed Central, Zeeberg et al explain that they noticed that some identifiers were being converted to non-gene names.

"A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package," they write. "The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included."
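The two conversion types the researchers describe can be illustrated with a rough pre-import check. The sketch below is only an approximation: the month prefixes, the patterns and the function name `conversion_risk` are assumptions made for illustration, not Excel's actual parsing rules or the researchers' own method. It flags identifiers that resemble a month followed by a day number (SEPT2) or digits either side of an "E" (an identifier of the form 2310009E13, which a spreadsheet could read as scientific notation).

```python
import re

# Assumed month-like prefixes; an illustration, not a complete list of
# the gene names affected.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# Month prefix, optional hyphen, then a one- or two-digit number: SEPT2, DEC1.
DATE_LIKE = re.compile(
    r"^(?:%s)-?\d{1,2}$" % "|".join(MONTH_PREFIXES), re.IGNORECASE
)

# Digits, the letter E, then more digits: looks like scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def conversion_risk(identifier):
    """Return "date", "float" or None for a single identifier string."""
    text = identifier.strip()
    if DATE_LIKE.match(text):
        return "date"
    if FLOAT_LIKE.match(text):
        return "float"
    return None


if __name__ == "__main__":
    for name in ["SEPT2", "MARCH1", "2310009E13", "TP53"]:
        print(name, conversion_risk(name))
```

Identifiers flagged this way would need to be imported as text rather than left to the spreadsheet's default formatting.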

The researchers suggest several workarounds for the problem, which are described in the paper, but caution that despite these "even the most vigilant investigator can inadvertently introduce conversion errors, and it is often necessary to screen data received from other sources". ®
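Screening data received from elsewhere might look something like the following sketch. It assumes the gene-name column has already been read into a list of strings, and that damaged values appear in the forms mentioned above (2-Sep, or a number such as 2.31E+13). The patterns and the function name `screen_column` are hypothetical, and a match marks a value as suspect only: it cannot recover the original identifier.

```python
import re

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Day-month strings such as 2-Sep, left behind by a date conversion.
DAMAGED_DATE = re.compile(r"^\d{1,2}-(?:%s)$" % MONTHS, re.IGNORECASE)

# Scientific notation such as 2.31E+13, left behind by a float conversion.
DAMAGED_FLOAT = re.compile(r"^\d\.\d+E\+\d+$", re.IGNORECASE)


def screen_column(values):
    """Return (row_index, value) pairs that look like converted identifiers."""
    suspects = []
    for index, value in enumerate(values):
        text = value.strip()
        if DAMAGED_DATE.match(text) or DAMAGED_FLOAT.match(text):
            suspects.append((index, text))
    return suspects


if __name__ == "__main__":
    column = ["TP53", "2-Sep", "2.31E+13", "BRCA1"]
    for row, value in screen_column(column):
        print("row %d: suspicious value %r" % (row, value))
```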
