IBM gets handle on unstructured data

Almaden incarnates search and BI

“A few years ago IBM got together with a few US Government agencies and several research institutions, and it was decided to develop a standard framework into which all the proprietary algorithms could be plugged, so that others could take advantage of them,” he said. “It was announced last year that we were going to give the implementation of this framework, the Unstructured Information Management Architecture (UIMA), to the open source community. It can now take search arguments and identify documents where a connection has been inferred, even though none of the keywords in the search argument appear in them.”

Project Avatar puts a layer of intelligence on top of the primitive interface layer. This layer performs semantic analysis of search requests in order to surface more comprehensive results. To demonstrate, Mattos used the simple example of searching for the words 'John' and 'Phone'.

"This will normally lead to all documents containing those words. But Avatar will infer the likelihood that the required answer is actually John's phone number. The system will then search for a document containing John's phone number, even if it does not contain those two words together. I may even use some description that I know I can associate with 'John' and the system will find a string that it associates with 'phone number'."
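The idea Mattos describes can be illustrated with a minimal sketch. This is not Avatar's or UIMA's actual implementation; the `CONCEPT_PATTERNS` table, the `concept_search` function and the sample documents are all hypothetical, and a simple regular expression stands in for whatever inference the real system performs. The keyword 'phone' is mapped to the concept "phone number", and documents mentioning the named person are searched for a string that looks like one.

```python
import re

# Hypothetical mapping from a query keyword to a pattern for the concept
# it most likely stands for. A real system would infer this; here it is fixed.
CONCEPT_PATTERNS = {
    "phone": re.compile(r"\+?\d[\d\s().-]{6,}\d"),
}


def concept_search(entity, keyword, documents):
    """Return (doc_id, matched_text) pairs for documents that mention
    `entity` and contain a string matching the concept behind `keyword`."""
    pattern = CONCEPT_PATTERNS.get(keyword.lower())
    results = []
    for doc_id, text in documents.items():
        if entity.lower() not in text.lower():
            continue  # the document does not mention the entity at all
        if pattern is None:
            # No concept known for this keyword: fall back to keyword matching.
            if keyword.lower() in text.lower():
                results.append((doc_id, keyword))
            continue
        match = pattern.search(text)
        if match:
            results.append((doc_id, match.group()))
    return results


# Illustrative documents, invented for this sketch.
DOCS = {
    "memo": "Call John on 555-010-4477 before Friday.",
    "note": "John left his phone in the meeting room.",
    "list": "Mary: 555-010-9921",
}

print(concept_search("John", "phone", DOCS))
```

In this sketch the "memo" document is returned although it never contains the word 'phone', while the "note" document, which contains both keywords, is not, because it holds no phone number.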

According to Mattos, the objective here is to use intelligence in the search interface to allow users to start looking for concepts rather than keywords. From here it is then possible to start extracting the concepts out of documents, which in turn will allow users to start generating information even before anyone has asked for specific facts. An example would be examining records from a call centre and being able to determine the percentage of calls complaining about quality and/or new maintenance contract terms, and from that rapidly pinpoint areas in customer and product support that need to be addressed to improve the customers’ experience.
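As a rough illustration of the call centre example, the sketch below tags each record with the concepts it appears to raise and reports the share of calls per concept. The cue phrases, the `tag_concepts` and `concept_percentages` functions and the sample records are assumptions made for this example; simple substring matching stands in for the concept extraction the article describes.

```python
from collections import Counter

# Hypothetical cue phrases for each concept of interest.
CONCEPT_CUES = {
    "quality": ("broken", "faulty", "defect", "poor quality"),
    "contract_terms": ("contract", "maintenance terms", "renewal"),
}


def tag_concepts(record):
    """Return the set of concepts whose cue phrases occur in one call record."""
    text = record.lower()
    return {
        concept
        for concept, cues in CONCEPT_CUES.items()
        if any(cue in text for cue in cues)
    }


def concept_percentages(records):
    """Return the percentage of records that raise each concept.
    A single record may count towards more than one concept."""
    if not records:
        return {}
    counts = Counter()
    for record in records:
        counts.update(tag_concepts(record))
    return {c: 100.0 * counts[c] / len(records) for c in CONCEPT_CUES}


# Illustrative call records, invented for this sketch.
CALLS = [
    "Customer says the unit arrived faulty.",
    "Caller unhappy with the new maintenance contract.",
    "Asked about opening hours.",
    "Faulty part and disputes the contract renewal fee.",
]

print(concept_percentages(CALLS))
```

Because the last record mentions both a faulty part and the contract, it counts towards both concepts, which matches the "and/or" in the example above.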

This is, arguably, the essence of BI: surfacing possible answers to questions that can affect the ongoing performance of a business before the business user has even formulated them.

One of the most important types of unstructured data is voice, and Almaden has already developed speech recognition that can record the voice of the caller and do speech-to-text conversion in any of the major languages. "We can even do language-to-language translation, so we could take worldwide records, translate them all into English and do analysis on the results. This information can then be incorporated with typed records."

An interesting side issue here is the added ability to analyse customers' spoken interactions with company staff to assess factors such as customer satisfaction. "This is done using sentiment analysis, which can only be obtained from the voice, not the text: whether someone was angry, upset or whatever," Mattos said. "That can't be done in real time, however."

From a user's point of view, Project Avatar will make it possible to have a single, customer-defined 'company-standard' UI for making inquiries of both structured and unstructured data, where users will be able to search for concepts.

As this is now out in the open source community, it is likely that DIY BI tools will start appearing that are a mixture of search engine and BI tool. They are unlikely to come from IBM, of course, though the company has delivered a search engine for enterprises called OmniFind, which is built on the technology. ®
