Humans best crossword-puzzling computer

Dr Fill is no Deep Blue or Watson – yet

Officially, humans are the only ones who can enter the American Crossword Puzzle Tournament, which was held in Brooklyn, New York over the weekend. But this time artificial intelligence expert Matt Ginsberg of On Time Systems has put his Dr Fill crossword solver to the test. As the results show, you don't have to throw out your pencil just yet.

This week, the puzzle masters at the ACPT put the hurt on Dr Fill, which, according to a report in the New York Times, did terribly on two out of the five crossword puzzles and would have ranked only 141st among the 600 people who took the test. The program has done better in the past: in simulated runs of fifteen previous tournaments, Dr Fill came out on top three times.

"I'll be back next year," Ginsberg told the Times.

Ginsberg's company, located in Eugene, Oregon, is a niche player in industrial optimization. Among other things, On Time Systems has created algorithms behind the Green Driver application used in the cities of Eugene and Portland to anticipate red lights and to route around them, which can cut commute times by 5 per cent.

The company has also created the algorithms for routing the US Air Force's fleet of cargo aircraft around weather while still obeying air traffic rules. The Worldwide Aeronautical Route Planner was created to run on mainframes and is being modernized to run on laptops. The optimized routing saves the Air Force something on the order of 1 to 2 per cent on its fuel bill each year, which works out to $35m to $90m.

In his spare time, in addition to actually creating crossword puzzles for the Times, Ginsberg has been working on the Dr Fill program. Ginsberg submitted a paper describing Dr Fill to the Journal of Artificial Intelligence Research, which published it (PDF) in December 2011. Based on its past performance, Ginsberg reckons that Dr Fill ranks among the top 50 puzzle solvers in the world, even though it did not have a very good day on Sunday.

The Watson Jeopardy! question-answer system built by IBM that took on the two human champs (perhaps chimps or chumps might be more appropriate, given how badly Watson spanked Ken Jennings and Brad Rutter) did terribly at first, too, but with tweaks and tuning and algorithm changes, it got better.

So there's no question that Ginsberg will be back, and it might even turn out that he gets some help. The Gray Lady is Big Blue's hometown rag, so some collaboration might be in order. The puzzles that threw Dr Fill for a loop had words spelled backwards, and some had words running diagonally as well as horizontally and vertically.

As with the Watson QA machine, Dr Fill does not actually understand what it is doing, but uses statistical probabilities to calculate its answers. Technically, what Dr Fill does is convert a crossword puzzle into a singly weighted constraint satisfaction problem. The constraints, of course, are the clues. Otherwise, you could just use a vast dictionary and try to cram every word of every length into every appropriate spot and then create a cross index of where letters match up where words cross.

Such an unconstrained problem is very tough to solve, so if you can "understand" the clues, then you can limit the possible answers and therefore the size of the dictionary you need to rifle through. The clues carry all kinds of other cues that people process: a clue that ends in a question mark, for example, is usually tricky in some fashion.
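To make the idea concrete, here is a minimal sketch of a crossword treated as a weighted constraint satisfaction problem. It is not Ginsberg's code. The slot names, candidate words, and costs are invented for illustration, and the costs merely stand in for "how badly does this word fit its clue" (lower is better). The search is a plain depth-first branch and bound, chosen for brevity.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]

# Hypothetical 3x3 mini-grid: each slot lists the cells it covers.
SLOTS: Dict[str, List[Cell]] = {
    "1A": [(0, 0), (0, 1), (0, 2)],
    "1D": [(0, 0), (1, 0), (2, 0)],
    "2D": [(0, 2), (1, 2), (2, 2)],
    "3A": [(2, 0), (2, 1), (2, 2)],
}

# Hypothetical clue-derived costs per candidate word (lower = better fit).
CANDIDATES: Dict[str, Dict[str, float]] = {
    "1A": {"CAT": 0.2, "BAT": 0.6},
    "1D": {"BOW": 0.1, "COW": 0.8},
    "2D": {"TEN": 0.3, "TAN": 0.4},
    "3A": {"WIN": 0.2, "WON": 0.5},
}


def solve(slots, candidates):
    """Return (total_cost, fill) for the cheapest fill whose crossings agree."""
    order = sorted(slots, key=lambda s: len(candidates[s]))  # fewest options first
    best_cost = float("inf")
    best_fill: Dict[str, str] = {}

    def search(i: int, grid: Dict[Cell, str], fill: Dict[str, str], cost: float):
        nonlocal best_cost, best_fill
        if cost >= best_cost:          # bound: cannot beat the best fill so far
            return
        if i == len(order):            # every slot filled consistently
            best_cost, best_fill = cost, dict(fill)
            return
        name = order[i]
        cells = slots[name]
        for word, w in sorted(candidates[name].items(), key=lambda kv: kv[1]):
            if len(word) != len(cells):
                continue
            # Crossing constraint: shared cells must hold the same letter.
            if any(grid.get(c, ch) != ch for c, ch in zip(cells, word)):
                continue
            added = [c for c in cells if c not in grid]
            for c, ch in zip(cells, word):
                grid[c] = ch
            fill[name] = word
            search(i + 1, grid, fill, cost + w)
            del fill[name]
            for c in added:            # undo only the letters this word placed
                del grid[c]

    search(0, {}, {}, 0.0)
    return best_cost, best_fill


if __name__ == "__main__":
    print(solve(SLOTS, CANDIDATES))
```

In this toy grid the individually cheapest answers, CAT and BOW, disagree on their shared first letter, so the solver has to trade one off against the other. The clue costs prune the dictionary and the crossings do the rest.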

Behind the Dr Fill program, Ginsberg has amassed a database of over 47,000 crossword puzzles and their solutions, with almost 1.9 million unique clues. The system also makes use of a dictionary with over six million words and a smaller dictionary with 8,542 common words.

The answers to all of these puzzles were then rated in a number of ways, including hand scoring by 100 volunteers and cross-checking for the number of Google hits, their Scrabble score if you were playing that word game, their length, and other criteria. Based on the 50,000 words, scoring was done for the remaining clues in the 47,000 puzzles.
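As a rough illustration of what such per-word ratings might look like, the sketch below computes a few of the criteria named above. The function names and the feature layout are assumptions, not anything taken from Dr Fill. The hit count has to be supplied by the caller, since no search API is assumed here. The letter values are the standard English Scrabble ones.

```python
import math

# Standard English Scrabble letter values.
SCRABBLE_POINTS = {
    **dict.fromkeys("AEILNORSTU", 1),
    **dict.fromkeys("DG", 2),
    **dict.fromkeys("BCMP", 3),
    **dict.fromkeys("FHVWY", 4),
    "K": 5,
    **dict.fromkeys("JX", 8),
    **dict.fromkeys("QZ", 10),
}


def scrabble_score(word: str) -> int:
    """Sum of tile values, ignoring board bonuses and blanks."""
    return sum(SCRABBLE_POINTS[ch] for ch in word.upper())


def word_features(word: str, google_hits: int) -> dict:
    """Hypothetical feature record for one candidate answer."""
    return {
        "word": word.upper(),
        "length": len(word),
        "scrabble": scrabble_score(word),
        # Log scale, because raw hit counts span many orders of magnitude.
        "log_hits": math.log10(google_hits + 1),
    }


if __name__ == "__main__":
    print(word_features("quiz", 999))
```

Features like these could then be combined into a single score per word. How Dr Fill actually weights them is not described here.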

The Dr Fill database also includes a database of Wikipedia titles, grammatical information about 154,000 words that tells Dr Fill their part of speech, and 1.2 million synonyms.

Armed with all this data and super-secret algorithms, Dr Fill solves crossword puzzles. Watson needed a rack of IBM Power 750 servers with 2,880 cores and 16TB of memory in a high-speed cluster, but Ginsberg says you can run Dr Fill on a notebook with two x86 cores and 8GB of memory with a compressed database that weighs in at 300MB.

Like many others, Ginsberg thinks that Watson cheated a bit. Humans did beat Watson in a dry run, and then IBM sped up the button pushing a bit for the real showdown.

"Watson, too, outperforms humans easily in terms of speed; its much-ballyhooed victory against human Jeopardy! competitors was probably due far more to Watson's mastery of button pushing than to its question-answering ability," writes Ginsberg in his JAIR paper. "In terms of the underlying cognitive task, Watson appears to not yet be a match for the best Jeopardy players, who are in general capable of answering virtually all of the questions without error."

Ginsberg says that the task for Dr Fill is a lot harder, because you can't decline to answer or bet to boost your earnings as you can on Jeopardy!. You have to find the right answers, period. And perhaps more significantly, as Dr Fill gets better and better, Ginsberg is having a harder time figuring out why the program is not working.

Perhaps we need to create Stallman? ®
