The Register® — Biting the hand that feeds IT

Excel ate my DNA

Autoformatting black hole

Genetic research is being hampered by a smart formatting function in Excel, according to US researchers.

The problem, which can hide medically important genes from view, is widespread and has affected some public databases, including gene expression data in the NCBI LocusLink database in the US, the researchers say.

Excel is widely used in genetic research to process microarray data. A microarray chip measures the amount of RNA transcribed from thousands of different genes, enabling researchers to see which genes are being expressed in, for example, a sample of diseased tissue.

The errors are introduced because some genetic identifiers look very like dates to Excel. If the spreadsheet is not properly set up, it will convert an identifier such as SEPT2 to a date: 2-Sep. The conversion, the researchers say, is irreversible: once the error has been introduced, the original data is gone.

In a paper published on BioMed Central, Zeeberg et al explain that they noticed some identifiers being converted to non-gene names.

"A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package," they write. "The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included."
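
For readers who handle such files, the two conversion patterns the researchers describe can be screened for before a list of identifiers ever reaches a spreadsheet. What follows is a minimal sketch in Python, not the researchers' own method. The month list and the two patterns are our assumptions about what looks date-like or float-like, not Excel's actual parsing rules. The names `classify_identifier` and `screen` are hypothetical, and the identifiers 2310009E13 and TP53 are illustrative examples that do not come from the article.

```python
import re

# Assumption: month names and abbreviations that, followed by a day number,
# may be read as a date (e.g. SEPT2 becoming 2-Sep). Illustrative only.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month prefix immediately followed by one or two digits.
DATE_LIKE = re.compile(
    "(?:" + "|".join(MONTH_PREFIXES) + r")\d{1,2}", re.IGNORECASE
)

# Digits, a single E, then digits: the shape of scientific notation.
FLOAT_LIKE = re.compile(r"\d+E\d+", re.IGNORECASE)


def classify_identifier(identifier):
    """Return 'date-like', 'float-like', or None for one identifier."""
    text = identifier.strip()
    if DATE_LIKE.fullmatch(text):
        return "date-like"
    if FLOAT_LIKE.fullmatch(text):
        return "float-like"
    return None


def screen(identifiers):
    """Map each at-risk identifier to the kind of conversion it risks."""
    flagged = {}
    for name in identifiers:
        kind = classify_identifier(name)
        if kind is not None:
            flagged[name] = kind
    return flagged
```

Calling `screen(["SEPT2", "TP53", "2310009E13"])` flags the first and last entries and leaves TP53 alone. A check like this only warns about identifiers at risk; it cannot recover values that have already been converted.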

The researchers suggest several workarounds for the problem, but caution that, despite these, "even the most vigilant investigator can inadvertently introduce conversion errors, and it is often necessary to screen data received from other sources". ®
