
“This MS Antitrust story was created by a computer program”

Can you tell?


Google's News service is remarkable, and the most astonishing thing about it is that it is generated automatically.

"The selection and placement of stories on this page were determined automatically by a computer program," says a note at the foot of each page.

But why stop there? Why not use Perl scripts to generate the copy, too? You don't need messy human wetware - foul drunken journalists - and it's much more of an "end-to-end" solution, whatever that may be. It could revolutionize the industry, because once you've done away with journalists, there's no need to employ expensive PRs to buy them drinks (or in Apple's case, "decline to comment".)

We've been secretly testing our own story generator, and here we shall reveal exactly how it works. Google keeps its algorithms and weighting secret - but we're delighted to share them with the world. But be patient: it's a work in progress. [The script's output is in bold type.]

Inside the NewsBot

What you need are input variables of such staggering predictability that the complex Bayesian pattern matching doesn’t produce random gibberish. There's a vast body of Register wisdom in there, drawn from years of watching the industry.

We've learnt to anticipate causal relationships between events. For example, Microsoft announces it's "getting serious" about security: within a few days, it will announce patches for hundreds of new vulnerabilities in Internet Explorer. If there's a SPARC or PowerPC roadmap, then you know that an "Itanium is ahead of schedule" story will soon follow.

Predicting these sequences requires enormous amounts of alcohol, but hey - someone's got to keep the machines running.
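For the curious, the "causal relationships" above can be pictured as a plain lookup table. This is only a minimal sketch in Python, not the actual NewsBot: the names `FOLLOW_UPS` and `predict_follow_up` are hypothetical, and the table holds nothing beyond the two examples given in the text.

```python
from typing import Optional

# Hypothetical table of "event -> predicted follow-up story".
# Only the two examples from the article are included; the wording
# of the keys is an assumption made for illustration.
FOLLOW_UPS = {
    "Microsoft gets serious about security":
        "Microsoft announces patches for new Internet Explorer vulnerabilities",
    "SPARC or PowerPC roadmap published":
        "Itanium is ahead of schedule",
}


def predict_follow_up(event: str) -> Optional[str]:
    """Return the follow-up story we expect after `event`, or None if unknown."""
    return FOLLOW_UPS.get(event)


if __name__ == "__main__":
    print(predict_follow_up("SPARC or PowerPC roadmap published"))
```

A dictionary is about as crude as prediction gets, which is rather the point: the inputs are predictable enough that nothing cleverer is needed.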

But because these hierarchies of interdependencies are so complex, one bad variable can wreak havoc.

For example, this story yesterday was created by the bot. The story was thrown wildly off course by the introduction of a bad variable - from our nuclear science database, which we hardly ever use - and so made little sense. GIGO (garbage in, garbage out) is a familiar result to any programmer.
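One way to guard against GIGO is to reject any variable that comes from a source you don't trust before the story is assembled. The sketch below is an assumption about how such a check might look, not the script itself: the template text, the `TRUSTED_SOURCES` set and this Python version of `mkstory` are all invented for illustration.

```python
from string import Template

# Hypothetical story template; the placeholder names are assumptions.
TEMPLATE = Template("In $place the $agency's chief $topic enforcer $action.")

# Hypothetical whitelist of databases considered predictable enough to use.
TRUSTED_SOURCES = {"industry"}


def mkstory(variables):
    """Fill the template from a dict of name -> (value, source).

    Raises ValueError if any variable comes from an untrusted source,
    so one bad input stops the run instead of derailing the story.
    """
    bad = sorted(
        name for name, (_, source) in variables.items()
        if source not in TRUSTED_SOURCES
    )
    if bad:
        raise ValueError("untrusted variables: " + ", ".join(bad))
    values = {name: value for name, (value, _) in variables.items()}
    return TEMPLATE.substitute(values)


if __name__ == "__main__":
    good = {
        "place": ("Washington", "industry"),
        "agency": ("Department of Justice", "industry"),
        "topic": ("Antitrust", "industry"),
        "action": ("vowed to crack down", "industry"),
    }
    print(mkstory(good))
```

Swap the `action` entry for one tagged with a "nuclear science" source and the function refuses to run rather than sending anyone to 40,000 feet.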

So let's see it in action.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer vowed to crack down on Martha Stewart Z%) Pitt% Pitt% Pitt% SettlZ~06^Z

Oops, sorry. Let's terminate that run.

Bring me fresh variables, Igor!

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James rose to a height of 40,000 feet, and was carried several hundred miles south before landing safely in a swamp in the Florida k^Z.

Apologies again, this run eliminated the line noise, but is still clearly absurd.

We need to reset the variables, excluding anything unpredictable, and run the script once again.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James said that it isn't the job of the Antitrust division to enforce the settlement with Microsoft. James said that enforcement should be performed by the nine dissenting States. James leaves on November 22 to join ChevronTexaco Corporation.

Script completed 13:44 PT, CPU Time 0:05:11
Yesss!

With predictable inputs, we get an entirely predictable result: the story scans well and contains no surprises.

It's also entirely true - the Bush administration's Antitrust enforcer doesn't want to enforce Antitrust settlements and he's leaving to join an oil company.

We declare the experiment a success. ®
