IBM open sources enterprise search

Takes unstructured approach

IBM is open sourcing a jointly developed search architecture with a view to creating a common industry approach to querying unstructured enterprise data.

The company is today expected to announce plans to open source the Unstructured Information Management Architecture (UIMA) used in the company's WebSphere Information Integrator OmniFind Edition, WebSphere Portal Server and Lotus Workplace. IBM is also open sourcing the UIMA toolkit.

UIMA provides a framework for software tools and services capable of conducting context-based searches across millions of unstructured records, databases, content repositories and email systems.

UIMA was developed by IBM Research with "significant" input from the Defense Advanced Research Projects Agency (DARPA), along with other contributors.

Nelson Mattos, IBM distinguished engineer and vice president of information integration, said UIMA could return hundreds of relevant documents for a search query, whereas a keyword-based search would return millions.

According to Mattos, IBM hopes to create an industry standard for text analytics through the release of the code. "The key goal is to create a forum for other research institutions to contribute to and develop the framework without having to depend purely on IBM to support it," he said.

IBM also hopes to attract buy-in from the commercial sector. To that end, IBM is today also expected to announce that 15 companies will use UIMA as the framework for planned search and text analysis tools.

Open sourcing UIMA is the first step in a process that could see IBM adopt existing industry standards for use with the architecture. IBM said it would investigate use of the Object Management Group's (OMG's) Unified Modeling Language (UML), eCore, and XML Metadata Interchange (XMI) with UIMA's Common Analysis Structure (CAS) specification later this year. CAS handles data exchange between UIMA's various components. ®
