Collecta - real-time search in real-time

Let the links come to you

Following the (semi-)success of Twitter's self-search engine - meant to tell you what the world is thinking right now - there's no shortage of web-happy outfits scrambling to crack so-called real-time search. That includes everyone from Google and Facebook to Web2.0rhea-loving startups like Tweefind and Twingly. Yes, Tweefind and Twingly.

Collecta falls into the latter category. But its real-time search is a bit more real-time than most.

Due for public launch today, Collecta leans on the Extensible Messaging and Presence Protocol (XMPP), the open protocol that underpins Mountain View's new-age collaboration thingy, Google Wave. Using XMPP, Collecta automatically receives content from web publishers without having to actively retrieve it.

"The way the web works for http right now is that you have to go and request a page," says chief executive Gerry Campbell, who previously served as president of search and content technologies at Reuters and as senior vp of search for AOL. "With RSS or anything else, you have to go out and ask for that information. But we're getting push feeds."

The WordPress blogging platform, for instance, uses XMPP push. "When anything is published on the WordPress platform - whether it's a blog post or a comment - it's pushed to us immediately. That allows us - from the moment the blogger posts - to get that content to our users in half a second."

The rub is that XMPP is rare among publishers. The list of users includes WordPress, the Iowa State weather site, and, well, that's about it. Collecta is forced to retrieve most content via RSS and other "pull" mechanisms. Naturally, it taps into those Twitter APIs - and Tweets seem to account for a healthy portion of its results. "We have an unlimited number of feeds we're pulling in with traditional means," Campbell says.
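The gap between the push delivery Campbell describes and the RSS-style pull Collecta falls back on can be sketched in a few lines of Python. This is a toy in-memory model, not XMPP and not Collecta's code: the `Publisher` class, its `subscribe`, `publish`, and `poll` methods, and the `demo` function are all hypothetical names invented for illustration.

```python
from typing import Callable, List


class Publisher:
    """Toy publisher (hypothetical) that supports both push and pull delivery."""

    def __init__(self) -> None:
        self._items: List[str] = []
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # Push model: the consumer registers once and then just waits.
        self._subscribers.append(callback)

    def publish(self, item: str) -> None:
        self._items.append(item)
        # Every subscriber is notified the moment content is published.
        for callback in self._subscribers:
            callback(item)

    def poll(self, since: int) -> List[str]:
        # Pull model: the consumer has to ask, and often gets nothing back.
        return self._items[since:]


def demo():
    publisher = Publisher()

    pushed: List[str] = []
    publisher.subscribe(pushed.append)

    pulled: List[str] = []
    empty_polls = 0
    for tick in range(5):
        if tick == 3:
            publisher.publish("new blog post")  # made-up content
        batch = publisher.poll(since=len(pulled))
        if not batch:
            empty_polls += 1
        pulled.extend(batch)

    return pushed, pulled, empty_polls
```

In this sketch both consumers end up with the same post, but the polling consumer makes five requests to get it and four of them come back empty. The push subscriber makes none.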

But other publishers are pushing data to Collecta via non-XMPP protocols, including file transfer protocol. And on the other end, Collecta uses XMPP to push all search results down to the user. If you've just searched on Barack Obama and new Barack content hits Collecta's servers, additional links will automatically appear in your browser - without a refresh.
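The no-refresh behaviour amounts to keeping each open search as a standing query and matching incoming content against it. A minimal sketch, assuming plain case-insensitive substring matching and an in-memory list standing in for the browser connection, might look like the following. The `LiveSearch` class and its methods are hypothetical, and nothing here reflects how Collecta actually matches or transports results.

```python
from typing import Dict, List


class LiveSearch:
    """Toy standing-query matcher (hypothetical, not Collecta's implementation)."""

    def __init__(self) -> None:
        # Maps a lowercased query to the result lists of every open session.
        self._sessions: Dict[str, List[List[str]]] = {}

    def open_session(self, query: str) -> List[str]:
        # The returned list stands in for the user's open browser page.
        results: List[str] = []
        self._sessions.setdefault(query.lower(), []).append(results)
        return results

    def ingest(self, text: str) -> None:
        # Called whenever new content arrives, by push or by pull.
        lowered = text.lower()
        for query, sessions in self._sessions.items():
            if query in lowered:
                for results in sessions:
                    results.append(text)  # "pushed" to the open page


search = LiveSearch()
page = search.open_session("Barack Obama")
search.ingest("Made-up headline mentioning Barack Obama")
search.ingest("Made-up headline about Iowa weather")
```

After the two `ingest` calls, `page` holds only the first headline, and the user never had to ask again.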

Like Twitter's search engine, Collecta doesn't even attempt to rank results according to Google-style relevancy. All search results are listed according to the date and time they hit the web - and that's it.

"We're throwing aside the common conception of what search is all about, that it's about gathering links, creating an index, doing link analysis, and applying a ranking algorithm to show users results," Campbell says. "That's useful, but you can't watch things unfold in real-time. That's what we're exposing to people."

OK. Fine. But we can report that before today's planned launch the Collecta beta had consistent problems connecting back to the company's servers. At one point the evening before, the site was completely unreachable. And even when it was reachable, performance was spotty.

Despite this - and despite XMPP's relative lack of uptake - Collecta's chief technology officer, Jack Moffitt, is optimistic. Moffitt sits on the boards of directors of the XMPP Standards Foundation and the Xiph.org Foundation, and he has a long history with open-standards and open-source development. WordPress added XMPP at Collecta's request, but Moffitt argues that more publishers will soon follow suit, putting even more "real" in the company's real-time search.

"We're talking to more and more people about getting XMPP set up," Moffitt tells us. "And content publishers are pretty interested. Something like RSS is pretty expensive. What you end up with is a situation like Twitter where they get billions of requests a day where they just say: 'No there's nothing now.' They can avoid all of that." ®
