ET, phone back: Alien quest seeks earthling coders
Open source joins Search for Extra Terrestrial Intelligence
Bigger, deeper, more data
The new systems, going open source and hopping on the cloud, all reinforce an expanded research remit intended to give SETI, founded in 1984, a roadmap to 2020. SETI's current mission – called Prelude – is the largest in the organization's history, and will require more space data to be processed in the hunt for the next Wow! signal. Prelude is searching the radio spectrum at 1.4 to 1.7GHz, 2.8 to 3.4GHz, and 4.6GHz. That compares to the prior Phoenix mission, which between 1995 and 2004 searched the 1.2 to 3GHz range. Big Ear in 1977 scanned in the megahertz range: 8.4 million 0.05MHz channels.
ATA is scanning 30 million channels simultaneously for signals buried deep within the cosmic noise. It's also observing two stars at the same time to help rule out interference from terrestrial signals. That means ATA is now generating between 100TB and 200TB of data each day that scientists must comb through.
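To put that daily figure in perspective, here is a back-of-the-envelope sketch of the sustained data rate it implies. It assumes decimal terabytes (10^12 bytes) and a steady 24-hour feed, neither of which is stated by SETI; the function name is ours, for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # assumption: decimal terabytes, not tebibytes


def sustained_rate_bytes_per_sec(tb_per_day: float) -> float:
    """Average bytes per second needed to produce tb_per_day in one day."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY


low = sustained_rate_bytes_per_sec(100)
high = sustained_rate_bytes_per_sec(200)
print(f"{low / 1e9:.2f} to {high / 1e9:.2f} GB/s")
```

Under those assumptions, the array would be producing somewhere over a gigabyte of data every second, around the clock.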
If SETI's plan pans out, there will be more data to sift. The goal is to scan between 1GHz and 10GHz – more than 13 billion channels each 0.7Hz wide – and to do that for about one million stars. ATA will grow, too, from today's 42 telescopes to 350.
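The channel count can be sanity-checked with simple division: bandwidth over channel width. The sketch below assumes the band runs exactly from 1GHz to 10GHz with no gaps or overlaps between channels; the function name is hypothetical.

```python
def channel_count(low_hz: float, high_hz: float, channel_width_hz: float) -> float:
    """Number of contiguous channels of a given width that fit in a band."""
    return (high_hz - low_hz) / channel_width_hz


channels = channel_count(1e9, 10e9, 0.7)
print(f"{channels / 1e9:.2f} billion channels")
```

On those assumptions the result lands at roughly 12.9 billion, so the quoted 13 billion reads as a rounded figure, or the real band edges differ slightly from the round numbers given here.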
Another 308 to go: some of ATA's first 42 radio telescopes, funded by Paul Allen
The logic is simple: search more sky and you'll have more chances of finding something. "Suppose we are searching for the wrong kind of signal. We are doing a fantastic job in narrow-band signals but perhaps it's very wide-band, plus with lots of time structure," Tarter says.
"I'm not going to identify that with my narrow band detector. I'm much more likely to find something if I also look for broadband signals, complex signals, highly modular signals, noise-like signals. If we expand the volume of the search in the parameter space we are exploring, that's got to be better than only one type of signal."
More data means SETI needed faster real-time processing. During the era of Phoenix, SETI's white coats built their own PCs to process signals because the systems from Silicon Valley's server makers were deemed too slow.
SETI needed a system that could collect the radio frequencies from different telescopes, combine and process feeds, and scan and process data. To do this SETI built PCs using components it bought between 1997 and 1999. It had 32 machines running 1GHz Pentium IIIs and 2.66GHz Xeon chips, with 18.3GB hard drives and between 512MB and 1GB of DRAM. SETI added Motorola Blue Wave boards with one or two DSP chips for added horsepower, and customized circuit boards for specialized processing.
In the past, SETI researchers had used Digital Equipment Corp's PDP-11, and prior to that Big Ear ran the IBM 1130, a machine IBM started producing in 1965. Legend has it that their 1130 checked out prematurely and the SETI team was left hanging after a mouse built its nest in the machine's air intake, helping to burn out a hard drive. IBM couldn't guarantee that a repair could be successfully made.
Now, though, SETI has come up to date. It's thrown out the 32 machines it built and quit the OEM biz completely. In those old machines' place are 64 quad-core Dell 6100 servers donated by Dell, Google, and Intel. "This is the technology capability we've been looking for. We've got it, we should run with it," Tarter said.
The search software that's running on the PCs – called SETI on the ATA (SonATA) – has been rewritten, as well. It was recoded not only so it could run on the quad-core servers, but also as part of a larger move towards releasing the APIs to the community under an open source license.
SonATA contains 440,000 lines of code written mostly in C, C++, and Java. That's not big by industry standards – Microsoft's Windows contains more than 50 million lines of code, while the Linux kernel 2.6.34 saw more than 400,000 lines of code added just last year. Accordingly, SETI Institute's director of open-innovation Avinash Agrawal, an ex–Sun Microsystems senior director of engineering whom SETI hired in 2009 to drive the tech side of its great opening up, tells us this was "just" a simple porting project.
Licensing, the final frontier
Agrawal's team finished open sourcing SonATA on March 1, releasing it under GPLv3 – you can get the code on GitHub. The internal deadline was actually December 1, but the weeks between December and early 2011 were spent in QA before moving to production. "While, strictly speaking, we did not meet the deadline, we did pretty well," Agrawal told us after wrapping up the port and releasing the code to open source.
Agrawal tells us that the biggest challenge was licensing – going through thousands of lines of code to find what could be released under open source, and making the rest of the code safe from a patent perspective.