Big Blue prototypes software for big, big data
Coping with the astronomy info-glut
IBM has prototyped a software architecture for the huge data demands of astronomy projects such as the SKA (Square Kilometre Array).
One of the many problems created by a project as large as the SKA is that wherever it’s built – we’ll know next year if the South Africa bid or the Australia / New Zealand bid wins – it’s going to generate too much data to store.
With as much as an exabyte of data as its daily dump, the SKA will demand new techniques just so astronomers can use the facility’s output (cue: Mission Impossible theme).
Working with New Zealand-based radio astronomer Dr Melanie Johnston-Hollitt, of Victoria University of Wellington, IBM has created the Information Intensive Framework prototype.
Under the framework, data will be classified into astronomical concepts, and overlaid with a guided search facility for faster data access and fewer errors. IBM says the prototype has also suggested further improvements needed to meet the SKA’s performance demands.
“Undertaking research on exa-scale datasets will force radio astronomers into a new, as yet, unexplored regime of automated processing, imaging and analysis,” Dr Johnston-Hollitt said in IBM’s announcement.
“Surveys on even SKA precursor telescopes such as ASKAP and MWA are expected to produce catalogues of tens of millions of radio sources. How we organise and classify these data, which we will have in the next three years, is a significant challenge. We will need new solutions to fully realize the vast scientific potential of these datasets and it's fantastic that organisations like IBM are prepared to take up that challenge.”
If the A/NZ team wins the SKA contract, data will have to be pre-processed close to the telescopes (most of which would be in the remote north-west of Western Australia), then sent back to the Pawsey Supercomputing Centre in Perth for storage and analysis.
Since contracts are now open for the next phase of the Pawsey Centre’s implementation, The Register wouldn’t be surprised if IBM isn’t the only company trying to position itself with relevant software architectures. ®