Excel ate my DNA
Autoformatting black hole
Posted in Applications, 16th July 2004 12:17 GMT
Genetic research is being hampered by a smart formatting function in Excel, according to US researchers.
The problem, which can cause medically important genes to be hidden from view, is widespread, and has affected some public databases, including the gene expression data on the NCBI LocusLink database in the US, the researchers say.
Excel is widely used in genetic research to process microarray data. A microarray chip measures how much RNA is transcribed from each of thousands of different genes, enabling researchers to see which particular genes are being expressed in a sample of diseased tissue, for example.
The errors are introduced because some genetic identifiers look very like dates to Excel. If the spreadsheet is not properly set up, it will convert an identifier such as SEPT2 to a date: 2-Sep. The conversion, the researchers say, is irreversible: once the error has been introduced, the original data is gone.
In a paper published on BioMed Central, Zeeberg et al explain that they noticed some identifiers were being converted to non-gene names.
"A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package," they write. "The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included."
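The two kinds of conversion the researchers describe can be illustrated with a rough pre-import check. This is only a sketch in Python, not Excel's actual parsing rules: the month prefixes, the two patterns and the function name conversion_risk are assumptions made for illustration, and 2310009E13 is used merely as an example of a digits-E-digits identifier.

```python
import re

# Assumed month prefixes that make a string look like a date.
# Not exhaustive: full month names, for instance, are not covered.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL",
          "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC")

# A month prefix followed by a one- or two-digit number, e.g. SEPT2 or DEC1.
# Deliberately conservative: it does not check that the day is valid.
DATE_LIKE = re.compile(r"^(?:%s)[-/ ]?\d{1,2}$" % "|".join(MONTHS),
                       re.IGNORECASE)

# Digits, a single E, then digits, which reads as scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def conversion_risk(identifier):
    """Return 'date', 'float' or None for a single identifier string."""
    text = identifier.strip()
    if DATE_LIKE.match(text):
        return "date"
    if FLOAT_LIKE.match(text):
        return "float"
    return None


if __name__ == "__main__":
    for name in ["SEPT2", "2310009E13", "TP53"]:
        print(name, conversion_risk(name))
```

Identifiers flagged this way are the ones to protect before the file is opened in a spreadsheet, for example by importing the column as text.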
The researchers suggest several workarounds for the problem, which you can find here, but caution that despite these "even the most vigilant investigator can inadvertently introduce conversion errors, and it is often necessary to screen data received from other sources". ®
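Screening data received from other sources could look something like the following Python sketch. It is not one of the researchers' workarounds: it simply assumes that a converted date appears in the 2-Sep style quoted above and that a converted floating-point value appears in a form such as 2.31E+13. The function name find_suspect_identifiers is hypothetical.

```python
import re

# Assumed appearance of an identifier that has already been turned into a date.
MONTH_ABBR = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
CONVERTED_DATE = re.compile(r"^\d{1,2}-(?:%s)$" % MONTH_ABBR)

# Assumed appearance of an identifier that has been turned into a float.
CONVERTED_FLOAT = re.compile(r"^\d\.\d+E\+\d+$")


def find_suspect_identifiers(identifiers):
    """Return (row_index, value) pairs that look like conversion damage."""
    suspects = []
    for index, value in enumerate(identifiers):
        text = value.strip()
        if CONVERTED_DATE.match(text) or CONVERTED_FLOAT.match(text):
            suspects.append((index, value))
    return suspects


if __name__ == "__main__":
    column = ["TP53", "2-Sep", "2.31E+13", "BRCA1"]
    for row, value in find_suspect_identifiers(column):
        print("row %d: %r is not a plausible gene identifier" % (row, value))
```

A check like this can only flag damaged rows. Since the conversion is irreversible, the original identifiers have to be recovered from the upstream source.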