“This MS Antitrust story was created by a computer program”

Can you tell?

Google's News service is remarkable, and the most astonishing thing about it is that it is generated automatically.

"The selection and placement of stories on this page were determined automatically by a computer program," says a note at the foot of each page.

But why stop there? Why not use Perl scripts to generate the copy, too? You don't need messy human wetware - foul drunken journalists - and it's much more of an "end-to-end" solution, whatever that may be. It could revolutionize the industry, because once you've done away with journalists, there's no need to employ expensive PRs to buy them drinks (or in Apple's case, "decline to comment".)

We've been secretly testing our own story generator, and here we shall reveal exactly how it works. Google keeps its algorithms and weighting secret - but we're delighted to share them with the world. But be patient: it's a work in progress. [The script's output is in bold type.]

Inside the NewsBot

What you need are input variables of such staggering predictability that the complex Bayesian pattern matching doesn’t produce random gibberish. There's a vast body of Register wisdom in there, drawn from years of watching the industry.

We've learnt to anticipate causal relationships between events. For example, Microsoft announces it's "getting serious" about security: within a few days, it will announce patches for hundreds of new vulnerabilities in Internet Explorer. If there's a SPARC or PowerPC roadmap, then you know that an "Itanium is ahead of schedule" story will soon follow.

Predicting these sequences requires enormous amounts of alcohol, but hey - someone's got to keep the machines running.

But because these hierarchies of interdependencies are so complex, one bad variable can wreak havoc.

For example, this story yesterday was created by the bot. The story was thrown wildly off course by the introduction of a bad variable - from our nuclear science database, which we hardly ever use - and so made little sense. GIGO (garbage in, garbage out) is a familiar result to any programmer.

So let's see it in action.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer vowed to crack down on Martha Stewart Z%) Pitt% Pitt% Pitt% SettlZ~06^Z

Oops, sorry. Let's terminate that run.

Bring me fresh variables, Igor!

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James rose to a height of 40,000 feet, and was carried several hundred miles south before landing safely in a swamp in the Florida k^Z.

Apologies again, this run eliminated the line noise, but is still clearly absurd.

We need to reset the variables, excluding anything unpredictable, and run the script once again.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James said that it isn't the job of the Antitrust division to enforce the settlement with Microsoft. James said that enforcement should be performed by the nine dissenting States. James leaves on November 22 to join ChevronTexaco Corporation.

Script completed 13:44 PT, CPU Time 0:05:11
Yesss!

With predictable inputs, we get an entirely predictable result: the story scans well and contains no surprises.
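For the curious, the principle can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the template, the variable names and the `mkstory` function below are hypothetical stand-ins, not the actual script, and plain template filling is swapped in for the "complex Bayesian pattern matching".

```python
import string

# Hypothetical template; the real script's internals are not shown in the article.
TEMPLATE = string.Template(
    "In $place the $agency's chief $topic enforcer $name $action."
)

# Hypothetical "predictable" variables, borrowed from the successful run above.
PREDICTABLE = {
    "place": "Washington",
    "agency": "Department of Justice",
    "topic": "Antitrust",
    "name": "Charles James",
    "action": "said that it isn't the job of the Antitrust division "
              "to enforce the settlement with Microsoft",
}


def mkstory(variables):
    """Fill the template; raises KeyError if any variable is missing."""
    return TEMPLATE.substitute(variables)


# Predictable inputs give a predictable story.
story = mkstory(PREDICTABLE)

# One bad variable (say, from the wrong database) and the story goes absurd.
bad = mkstory({**PREDICTABLE, "action": "rose to a height of 40,000 feet"})

print(story)
print(bad)
```

The generator does no checking of its own: whatever goes into the variables comes out in the copy, which is GIGO in miniature.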

It's also entirely true - the Bush administration's Antitrust enforcer doesn't want to enforce Antitrust settlements and he's leaving to join an oil company.

We declare the experiment a success. ®
