Algorithm ramps up genetic computation

'Sailfish' boosts RNA gene expression predictions

The world has been sequencing genomes for a long time, but applying what we know about genetics to everyday medicine is a tough ask.

For example, readers might remember that the business of crafting treatments from genes is so complex that IBM recently entered a partnership to get its Watson megabrain learning to help medicos craft personalised treatments for cancer.

Part of the problem that researchers want to solve is “gene expression”: of all the complex ways in which genes interact, which interactions are “expressed” as a physical trait, whether that trait is blue eyes or the reason one individual dies of a cancer that's arrested in someone else?

What's wanted is a way to predict gene expression, and one angle of the research is based on RNA sequencing (RNA-seq) data. The problem is that analysing RNA-seq data is a slow business, and that's where the research out of Carnegie Mellon University and the University of Maryland comes in. Their Sailfish algorithm dramatically accelerates the job of estimating gene expression from RNA-seq data.

To explain why this is important, the researchers' release says: “Though an organism's genetic makeup is static, the activity of individual genes varies greatly over time, making gene expression an important factor in understanding how organisms work and what occurs during disease processes. Gene activity can't be measured directly, but can be inferred by monitoring RNA, the molecules that carry information from the genes for producing proteins and other cellular activities.”

However, analysing the RNA-seq “reads” – short sequences of RNA – traditionally results in huge datasets that have to be mapped back to their original genetic processes. The Sailfish “secret sauce” (except that it's not so secret – the code has been released) is that it skips this painstaking mapping step.

Instead, the researchers “found they could allocate parts of the reads to different types of RNA molecules, much as if each read acted as several votes for one molecule or another”. Think of it as upvoting posts in a forum: individual votes bestow a kind of consensus on which reads – or posts – carry the greatest significance.
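For the curious, the voting idea can be illustrated with a toy sketch. This is not Sailfish's code: it assumes, purely for illustration, that each read is chopped into short overlapping fragments (k-mers) and that every fragment casts one vote, split evenly among the candidate RNA molecules containing it. The names `kmers`, `build_index` and `tally_votes`, and the example sequences, are all hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def tally_votes(reads, transcripts, k=4):
    """Toy vote count (an illustrative assumption, not the published method).

    Each k-mer in a read casts one vote, divided equally among the
    transcripts that contain that k-mer. No read is ever aligned to a
    position in a transcript.
    """
    index = build_index(transcripts, k)
    votes = {name: 0.0 for name in transcripts}
    for read in reads:
        for kmer in kmers(read, k):
            owners = index.get(kmer)
            if not owners:
                continue  # k-mer matches no known transcript: no vote
            share = 1.0 / len(owners)
            for name in owners:
                votes[name] += share
    return votes


if __name__ == "__main__":
    # Made-up sequences, chosen only so the two transcripts share some k-mers.
    transcripts = {"A": "ACGTACGG", "B": "TTGACGTA"}
    reads = ["ACGTAC", "TTGACG", "NNNN"]
    print(tally_votes(reads, transcripts, k=4))
```

A fragment shared by both molecules splits its vote between them, while a fragment unique to one gives that molecule a full vote. A real tool would go on to refine these tallies statistically; the sketch stops at the raw count.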

Getting what might be a 15-hour analysis down to minutes is important, the researchers believe: there are already huge repositories of RNA-seq data, but turning data into insight is held back by computational effort.

Fifteen hours for each analysis “really starts to add up, particularly if you want to look at 100 experiments”, explains Carnegie Mellon associate professor Carl Kingsford. “With Sailfish, we can give researchers everything they got from previous methods, but faster.” ®
