
Petabyte-chomping big sky telescope sucks down baby code

Beyond the MySQL frontier


Robert Heinlein was right to be worried. What if there really is a planet of giant, psychic, human-hating bugs out there, getting ready to hurl planet-busting rocks in our general direction? Surely we would want to know?

Luckily, big science projects such as the Large Synoptic Survey Telescope (LSST), which (when it's fully operational in 2016) will photograph the entire night sky repeatedly for 10 years, will be able to spot such genocidal asteroids - although asteroid-spotting is just one small part of the LSST's overall mission.

Two years ago we spoke to Jeff Kantor, LSST data management project manager, who described the project as "a proposed ground-based 6.7 meter effective diameter (8.4 meter primary mirror), 10 square-degree-field telescope that will provide digital imaging of faint astronomical objects across the entire sky, night after night."

I caught up with Jeff again a couple of weeks ago, and asked him how this highly ambitious project is progressing. "Very nicely" seems to be the gist of his answer.

It might not make for the most dramatic of headlines, but given the scale and complexity of what's being developed, this in itself is a laudable achievement. In Jeff's words: "First, we have to process 6.4GB images every 15 seconds. As context, it would take 1,500 1080p HD monitors to display one image at full resolution.

"The images must go through a many-step pipeline in under a minute to detect transient phenomena, and then we have to notify the scientific community across the entire world about those phenomena. That will take a near real-time 3,000-core processing cluster, advanced parallel processing software, very sophisticated image processing and astronomical applications software, and gigabit/second networks.

"Next, we have to re-process all the images taken since the start of the survey every year for 10 years to generate astronomical catalogs, and before releasing them we need to quality assure the results."

All told, that's about 5PB of image data a year for 10 years, resulting in 50PB of image data and over 10PB of catalogs. The automated QA alone will require a 15,000-core cluster (for starters), parallel processing and database software, data mining and statistical analysis, and advanced astronomical software.
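
The yearly totals follow from the same per-image numbers. Purely for illustration - the length of an observing night and the number of usable nights below are my own assumptions, not figures from the project - a ten-hour night of 15-second exposures comes to roughly 15TB, and around 325 such nights a year lands close to the quoted 5PB:

# Illustrative yearly-volume estimate; night length and nights per year are assumptions.
gb_per_image = 6.4
images_per_night = 10 * 3600 / 15                       # assumed 10-hour night -> 2,400 images
tb_per_night = gb_per_image * images_per_night / 1000   # ~15.4 TB per night
pb_per_year = tb_per_night * 325 / 1000                 # assumed ~325 usable nights -> ~5 PB/year
print(round(tb_per_night, 1), round(pb_per_year, 1), round(pb_per_year * 10))  # night, year, 10-year total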

They now have a prototype system of about 200,000 lines of C++ and Python representing most of the capability needed to run an astronomical survey of the magnitude typically done today. Next, they have to scale this up to support LSST volumes. According to Jeff: "We hope to have all of that functioning at about 20 per cent of LSST scale by the end of our R&D phase. We then have six years of construction and commissioning to 'bullet-proof' and improve it, and to test it out with the real telescope and camera."

The incremental development and R&D mode the team is following could be called agile, although this is agile on a grand scale. Every six to 12 months, they produce a new design and a new software release, called a Data Challenge. Each DC is a complete project with a plan, requirements, design, code, integration and test, and production runs.

Lessons learned

The fifth release just went out the door, and they've completely re-done their UML-based design three times with the lessons learned from each DC. They're using Enterprise Architect to develop each model, following a version of the agile ICONIX object modeling process tailored for algorithmic (rather than use case driven) development. I've co-authored a book on the ICONIX process, Use Case Driven Object Modeling with UML: Theory and Practice.

ICONIX uses a core subset of the UML rather than every diagram under the sun, and this leanness has allowed them to roll the content into a new model as a starting point for the next DC.

Jeff explains: "After each DC, we also extract the design/lessons learned from the DC model to the LSST Reference Design Model which is the design for the actual operational system. That last model is also used to trace up to a SysML-based model containing the LSST system-level requirements."
