DARPA seeks 'Machine Reading' AI auto-analysis bot

Human analysts' unanimous view - it'll never work

The US military is seeking revolutionary new AI software which would be able to read text - and so effectively do research - in the same way that humans do. The so-called "Machine Reading" ware would initially be used for such tasks as automated military-intelligence analysis, but it would have wide consequences in civilian life as well.

As regular readers would expect, the auto-reader notion comes from DARPA. DARPA has the usual Pentagon revolving doors connecting it to the military-industrial complex and the ivory tower; it also has an express lift to the castle in the sky, a secret tunnel to the secure wing of the scientists' insane asylum and a special shooting range situated on both sides of the Grand Canyon - where only long shots can be taken.

DARPA explains the problem thus:

The U.S. military frequently faces impediments to stability and reconstruction operations in a new location due to the lack of understanding of the local situation. Similarly, strategic assessment of a foreign nation’s science and technology base involves the continuous assessment of technical articles, bibliographies, conference agendas, etc. This information is often available on the World Wide Web, and some tools to assist this analysis are available, but the process would be significantly enhanced by a system that could directly analyze the information found in these text sources. The same reasoning could be equally valuable if applied to other types of open-source intelligence analysis, including assessing military readiness and posturing; political speeches, actions, and more obscure “messages”; economic trends and sentiments; and propaganda from terrorist groups and even their hidden web-based communications.

In other words, rather than having thousands of expensive human intel analysts surfing the web and putting together reports, why not just have some computers?

[Image: DARPA graphic explaining how you evaluate AI auto-reader software. Caption: It's as simple as a funnel stuck into your head]

Well, it seems that it's all a matter of reasoning. The DARPA people say that computer software, even software dubbed "AI", doesn't do well at pulling together information from different kinds of document the way a human can - and indeed struggles to tie together information within a single document.

The Pentagon brainiacs say that this is because computers are best at "formal" language. While they can work with text, if the text is written in "natural" human languages - for instance standard English prose, as opposed to data in fields - the machines don't do very well.

That's where "Machine Reading" AI-ware would come in. According to DARPA, what's needed is simple to describe in natural language:

A successful Machine Reading Program will be enabling all knowledge encoded as natural text to be combined with the power of AI reasoning methodologies, which will unleash a wide variety of new AI applications ranging from intelligent bots to personal tutors. For example, all of the text in the World Wide Web will become available for automating the monitoring and analysis of technological and political activities of nations; plans, rhetoric, and activities of transnational organizations; and scientific discovery within various disciplines.

However, there's another thing which doesn't lend itself to natural language apart from computers - that is, federal contract awards. There has to be a formal-language way to decide which competing Machine Reader systems are better than others, and indeed much of the DARPA solicitation (pdf) is devoted to accomplishing that very feat (see pic, which makes the methods clear even to the meanest intelligence).

Normally, of course, one looks at AI systems from DARPA and it's clear that if they work the consequences would be awful: humanity's few survivors scuttling like rats in the ruins of civilisation, etc.

In this case, the casualties might be more tightly focused. Essentially, anyone whose job involves primarily reading text on the internet and producing more text from it would be out of work. Graduate students in the humanities subjects, intelligence analysts, think-tank people, quangocrats ... and about 90 per cent of journalists.

Plainly a mad notion; it'll never happen. ®
