Scroogle resurrected once again
Private Google scraper in SCO-like refusal to die
Scroogle has once again returned from the dead, continuing to serve up its privacy-friendly Google search results after another programming tweak from founder Daniel Brandt.
Brandt and the not-for-profit Scroogle have been scraping Google search results since 2002, allowing netizens to use Mountain View's search engine without being tracked by the company. But in May, after Google removed the interface page from which Brandt was scraping results, the service went offline. It returned a day later, as Brandt tapped a slightly different interface, only for this interface to vanish as well.
Originally, Brandt said he was unlikely to find another means of simply and reliably scraping Google, but he has now settled on a fix, though it's a bit more bloated than the old setup. "It's back up, and it looks like the old Scroogle to everyone. But I'm not a happy camper," he tells The Reg. "I finally spent two days reprogramming my parser in Scroogle... I do not like the extra bloat because the six dedicated servers I lease have monthly quotas for bandwidth."
Scroogle is run entirely with donations. You can support the cause here.
The service originally scraped results from an interface at google.com/ie, which Google built for use inside the sidebar offered by Internet Explorer 6. If you installed the Google Toolbar with IE6 and chose Google as the default search engine, google.com/ie would provide search results in the window that popped up on the side of the browser. These results had to be simple because the sidebar was small.
But as Google phased out all sorts of support for IE6, it did away with this interface. After google.com/ie went offline in May, Brandt was able to replicate his setup via another page, google.com/search, by adding an IE parameter ("&output=ie") to the URL. But last week, this disappeared as well.
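For readers curious what tacking "&output=ie" onto a search URL amounts to, here is a minimal Python sketch. It only builds the URL string and makes no request, and the interface it describes no longer exists. The function name build_ie_search_url, the "q" query parameter, and the "https://www." prefix are assumptions made for illustration; the only parts taken from Brandt's account are the google.com/search page and the output=ie parameter. This is not Scroogle's actual code.

```python
from urllib.parse import urlencode

# Base page named in the article; scheme and "www." are assumed here.
SEARCH_PAGE = "https://www.google.com/search"


def build_ie_search_url(query: str) -> str:
    """Return a search URL carrying the now-defunct output=ie parameter.

    "q" as the name of the search-terms parameter is an assumption;
    "output=ie" is the parameter described in the article.
    """
    params = {"q": query, "output": "ie"}
    # urlencode escapes the values and joins the pairs with "&".
    return SEARCH_PAGE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_ie_search_url("privacy tools"))
```

Keeping the parameters in a dictionary and letting urlencode do the escaping avoids hand-assembling the query string, which is where spaces and punctuation in search terms tend to go wrong.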
At Google's annual developer conference in May, über-Googler Matt Cutts indicated that Google was not specifically targeting Scroogle when it removed the google.com/ie interface. And it would appear that the same applies to the "&output=ie" interface.
In any event, Brandt has now switched to another method that still provides 100 results per page, does not include Google's "Universal" search results, and sidesteps Google's ads. "A Scroogle user discovered how to get generic results, without ads, and 100 at a time. It was a completely different format than the &output=ie interface, but at least it was merely three times more bloated," Brandt says. ®