“This MS Antitrust story was created by a computer program”

Can you tell?


Google's News service is remarkable, and the most astonishing thing about it is that it is generated automatically.

"The selection and placement of stories on this page were determined automatically by a computer program," says a note at the foot of each page.

But why stop there? Why not use Perl scripts to generate the copy, too? You don't need messy human wetware - foul drunken journalists - and it's much more of an "end-to-end" solution, whatever that may be. It could revolutionize the industry, because once you've done away with journalists, there's no need to employ expensive PRs to buy them drinks (or in Apple's case, "decline to comment".)

We've been secretly testing our own story generator, and here we shall reveal exactly how it works. Google keeps its algorithms and weighting secret - but we're delighted to share them with the world. But be patient: it's a work in progress. [The script's output is in bold type.]

Inside the NewsBot

What you need are input variables of such staggering predictability that the complex Bayesian pattern matching doesn’t produce random gibberish. There's a vast body of Register wisdom in there, drawn from years of watching the industry.

We've learnt to anticipate causal relationships between events. For example, Microsoft announces it's "getting serious" about security: within a few days, it will announce patches for hundreds of new vulnerabilities in Internet Explorer. If there's a SPARC or PowerPC roadmap, then you know that an "Itanium is ahead of schedule" story will soon follow.

Predicting these sequences requires enormous amounts of alcohol, but hey - someone's got to keep the machines running.

But because these hierarchies of interdependencies are so complex, one bad variable can wreak havoc.

For example, this story yesterday was created by the bot. The story was thrown wildly off course by the introduction of a bad variable - from our nuclear science database, which we hardly ever use - and so made little sense. GIGO (garbage in, garbage out) is a familiar result to any programmer.
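The script itself isn't reproduced here, so the following is only a toy sketch of the idea, written in Python rather than Perl. It assumes the generator fills a fixed sentence template from a table of vetted values and refuses anything else. The names TEMPLATE, VETTED and this mkstory() function are hypothetical, invented purely for illustration.

```python
# Toy sketch only: TEMPLATE, VETTED and mkstory() are hypothetical names,
# not the actual script described in the article.
import string

TEMPLATE = string.Template(
    "In $place the $agency's chief $topic enforcer $verb $target."
)

# Values assumed (for illustration) to be predictable enough to be safe inputs.
VETTED = {
    "place": {"Washington"},
    "agency": {"Department of Justice"},
    "topic": {"Antitrust"},
    "verb": {"declined to enforce the settlement with"},
    "target": {"Microsoft"},
}


def mkstory(**variables):
    """Fill the template, rejecting any value that is not on the vetted list."""
    for name, value in variables.items():
        if name not in VETTED:
            raise ValueError(f"unknown variable: {name}")
        if value not in VETTED[name]:
            # Garbage in is stopped here instead of becoming garbage out.
            raise ValueError(f"unvetted value for {name}: {value!r}")
    # substitute() raises KeyError if any placeholder is left unfilled.
    return TEMPLATE.substitute(variables)
```

Called with the five vetted values, the function returns a single unsurprising sentence; hand it something from the nuclear science database and it raises ValueError rather than printing it.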

So let's see it in action.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer vowed to crack down on Martha Stewart Z%) Pitt% Pitt% Pitt% SettlZ~06^Z

Oops, sorry. Let's terminate that run.

Bring me fresh variables, Igor!

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James rose to a height of 40,000 feet, and was carried several hundred miles south before landing safely in a swamp in the Florida k^Z.

Apologies again: this run eliminated the line noise, but the result is still clearly absurd.

We need to reset the variables, excluding anything unpredictable, and run the script once again.

$ mkstory "Antitrust" "Microsoft" "Consequence"
In Washington the Department of Justice's chief Antitrust enforcer Charles James said that it isn't the job of the Antitrust division to enforce the settlement with Microsoft. James said that enforcement should be performed by the nine dissenting States. James leaves on November 22 to join ChevronTexaco Corporation.

Script completed 13:44 PT, CPU Time 0:05:11,
Yesss!

With predictable inputs, we get an entirely predictable result: the story scans well and contains no surprises.

It's also entirely true - the Bush administration's Antitrust enforcer doesn't want to enforce Antitrust settlements and he's leaving to join an oil company.

We declare the experiment a success. ®
