Microsoft Bing rides open source to semantic search

Powerset on the side

As it turns out, Powerset's open-source-happy semantic talents are only a small part of Bing, Microsoft's freshly minted "decision engine" search engine.

Microsoft acquired Powerset last July in a reported $100m deal, and after a conspicuous Tweet from Powerset co-founder Barney Pell, many assumed that the semantic search outfit would play a major role in Redmond's latest attempt to catch the uncatchable Google.

According to a blog post from Scott Prevost, general manager of Microsoft's Powerset division, the division has tweaked Microsoft's primary search engine in certain "subtle" ways. But its main contribution is a secondary engine that searches nothing but Wikipedia. In essence, Microsoft has taken Powerset's existing Wikitool and latched it to the Bing torso.

"The Powerset division has contributed to Bing in both subtle and more conspicuous ways. While the subtle contributions are important, they are much harder to showcase. This post will focus on how the features that our users have come to love on Powerset.com have evolved and have been integrated into Bing," Prevost says, before detailing Bing's "Reference" tab.

As we reported yesterday, the Reference tab reproduces Wikipedia articles in their entirety. When you search on, say, Albert Einstein, the tab appears on the left-hand side of the page, and if you click on it, you're taken to a reproduction of Einstein's Wikipedia entry (licensed at no cost from the "free encyclopedia anyone can edit").

Yes, Microsoft has solidified Wikipedia's place as the web's number-one source of truthiness.

But from that Reference tab you can also tap into Powerset's semantic Wikisearch, which the company originally unfurled in May of last year, before the Microsoft acquisition. This vertical search engine is designed to accept natural-language queries, such as "Was Einstein married?" - though that's not immediately obvious from Bing's layout. In a video attached to Prevost's blog post, Powerset founder Lorenzo Thione acknowledges that some of Bing's Powerset tools are "a little bit hidden. Over time, we'll definitely work on making it more accessible and visible to users."

In the same video, Senior Program Manager Mark Johnson says that in a few cases, Microsoft has hooked Powerset's natural-language platform into some of Bing's other search verticals, including the "Business" tab. But Thione calls these "pilot tests."

"There are a subset of queries where you use a more natural-language oriented syntax or you ask questions, similar to what Powerset.com used to support, we will get you answers right there on the page and a link back to the Reference vertical," he says.

Despite its limited role in the new search engine, Powerset's Bingification is a Microsoft milestone. Powerset's platform leans heavily on open-source code. Most notably, its search index is generated via Hadoop, the same open-source distributed computing platform that juices Yahoo!'s search engine. Powerset originated Hadoop's HBase project, an effort to mimic Google's famous distributed storage system, BigTable, and two of its employees, Michael Stack and Jim Kellerman, are full-time HBase committers.

According to Sam Ramji, Microsoft's senior director of platform strategy and the man who oversees the company's open source thinking, "This is the first time we have acquired a company with committers to a key open source project who have been able to continue to commit to that project in their old capacity as part of their new role."

And thanks to its integration with Powerset's platform, Bing is one of the few Microsoft "shipping" products to actually incorporate open-source code. Ramji points out that from the early to late 90s, Microsoft's Windows TCP/IP stack included BSD code, and today Windows HPC includes code developed at Microsoft that was then offered up to Argonne National Lab (ANL) for open-sourcing. But since the arrival of Windows Vista, Bing is certainly the most high-profile Microsoft product to go the open-source route.

Ramji calls it part of Microsoft's "strategic shift and cultural change" towards the open-source world. And it's certainly nice to see. But on another level, it's rather amusing that the company that once called Linux a cancer and spent untold millions on Encarta is now resting its search engine on Hadoop and Wikipedia. ®
