Humans best crossword-puzzling computer

Dr Fill is no Deep Blue or Watson – yet

Officially, humans are the only ones who can enter the American Crossword Puzzle Tournament, which was held in Brooklyn, New York over the weekend. But this time artificial intelligence expert Matt Ginsberg of On Time Systems has put his Dr Fill crossword solver to the test. As the results show, you don't have to throw out your pencil just yet.

This week, the puzzle masters at the ACPT put the hurt on Dr Fill, which, according to a report in the New York Times, did terribly on two of the five crossword puzzles and would have ranked only 141st among the 600 people who took the test. The program has done better in the past: in simulated runs of fifteen former tournaments, Dr Fill came out on top three times.

"I'll be back next year," Ginsberg told the Times.

Ginsberg's company, located in Eugene, Oregon, is a niche player in industrial optimization. Among other things, On Time Systems has created algorithms behind the Green Driver application used in the cities of Eugene and Portland to anticipate red lights and to route around them, which can cut commute times by 5 per cent.

The company has also created the algorithms for routing the US Air Force's fleet of cargo aircraft around weather while still obeying air traffic rules. The Worldwide Aeronautical Route Planner was created to run on mainframes and is being modernized to run on laptops. The optimized routing saves the Air Force something on the order of 1 to 2 per cent on its fuel bill each year, which works out to $35m to $90m.

In his spare time, in addition to actually creating crossword puzzles for the Times, Ginsberg has been working on the Dr Fill program. He submitted a paper describing Dr Fill to the Journal of Artificial Intelligence Research, which published it in December 2011. Based on its past performance, Ginsberg reckons that Dr Fill ranks among the top 50 puzzle solvers in the world, even though it did not have a very good day on Sunday.

The Watson Jeopardy! question-answer system built by IBM that took on the two human champs (perhaps chimps or chumps might be more appropriate, given how badly Watson spanked Ken Jennings and Brad Rutter) did terribly at first, too, but with tweaks and tuning and algorithm changes, it got better.

So there's no question that Ginsberg will be back, and it might even turn out that he gets some help. The Gray Lady is Big Blue's hometown rag, so some collaboration might be in order. The puzzles that threw Dr Fill for a loop included one with words spelled backwards and another with words running diagonally as well as horizontally and vertically.

As with the Watson QA machine, Dr Fill does not actually understand what it is doing; it uses statistical probabilities to calculate its answers. Technically, what Dr Fill does is convert a crossword puzzle into a singly weighted constraint satisfaction problem. The constraints, of course, are the clues. Otherwise, you could just use a vast dictionary and try to cram every word of every length into every appropriate spot and then create a cross index of where letters match up where words cross.

Such an unconstrained problem is very tough to solve, so if you can "understand" the clues, then you can limit the possible answers and therefore the size of the dictionary you need to rifle through. The clues also carry all kinds of other cues that people process: a clue that ends in a question mark, for example, is usually tricky in some fashion.
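To make the idea concrete, here is a minimal Python sketch of a weighted constraint satisfaction problem of the kind described above. It is not Ginsberg's code: the three-slot grid, the candidate words, and their costs are all invented for illustration. Each slot is a variable, each candidate word carries a cost (lower means a better fit to the clue), and crossing squares are hard constraints. A simple backtracking search with pruning finds the cheapest consistent fill.

```python
# Hypothetical toy puzzle: slot name -> {candidate word: cost}.
# The costs stand in for "how badly this word fits the clue".
CANDIDATES = {
    "1A": {"CAT": 5, "COT": 10, "BAT": 7},
    "1D": {"COW": 4, "BEE": 3},
    "2D": {"TEA": 6, "TOE": 9, "NET": 2},
}

# (slot_a, index_a, slot_b, index_b): those two letters share a square.
CROSSINGS = [("1A", 0, "1D", 0), ("1A", 2, "2D", 0)]


def consistent(assignment, crossings):
    """Return True if every crossing between filled slots agrees."""
    for a, i, b, j in crossings:
        if a in assignment and b in assignment:
            if assignment[a][i] != assignment[b][j]:
                return False
    return True


def solve(candidates, crossings):
    """Find the lowest-cost fill by backtracking with branch-and-bound."""
    slots = list(candidates)
    best = {"cost": float("inf"), "fill": None}

    def search(k, assignment, cost):
        if cost >= best["cost"]:
            return  # cannot beat the best fill found so far
        if k == len(slots):
            best["cost"], best["fill"] = cost, dict(assignment)
            return
        slot = slots[k]
        # Try the cheapest (most plausible) candidates first.
        for word, c in sorted(candidates[slot].items(), key=lambda kv: kv[1]):
            assignment[slot] = word
            if consistent(assignment, crossings):
                search(k + 1, assignment, cost + c)
            del assignment[slot]

    search(0, {}, 0)
    return best["fill"], best["cost"]


if __name__ == "__main__":
    fill, cost = solve(CANDIDATES, CROSSINGS)
    print(fill, cost)
```

Note that the individually cheapest words (BEE and NET) lose out here because they do not cross cleanly with anything in 1A; the search has to trade clue fit against the grid. A real solver faces the same trade-off across dozens of slots and millions of candidates.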

Behind the Dr Fill program, Ginsberg has amassed a database of over 47,000 crossword puzzles and their solutions, with almost 1.9 million unique clues. The system also makes use of a dictionary with over six million words and a smaller dictionary with 8,542 common words.

The answers to all of these puzzles were then rated in a number of ways, including hand scoring by 100 volunteers and cross-checking against the number of Google hits, their Scrabble score, their length, and other criteria. Based on the 50,000 words, scoring was done for the remaining clues in the 47,000 puzzles.
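Two of the criteria mentioned above, Scrabble score and length, are easy to compute directly. The Python sketch below shows what such per-word features might look like; the function names are invented here, the letter values are the standard English Scrabble tile values, and how Dr Fill actually weights or combines its criteria is not described in this article.

```python
# Standard English Scrabble tile values.
SCRABBLE_VALUES = {
    **dict.fromkeys("AEILNORSTU", 1),
    **dict.fromkeys("DG", 2),
    **dict.fromkeys("BCMP", 3),
    **dict.fromkeys("FHVWY", 4),
    "K": 5,
    **dict.fromkeys("JX", 8),
    **dict.fromkeys("QZ", 10),
}


def scrabble_score(word):
    """Sum the tile values of a word, ignoring non-letter characters."""
    return sum(SCRABBLE_VALUES.get(ch, 0) for ch in word.upper())


def word_features(word):
    """Hypothetical feature record for one candidate answer."""
    return {"word": word, "length": len(word), "scrabble": scrabble_score(word)}


if __name__ == "__main__":
    for w in ("cat", "quiz"):
        print(word_features(w))
```

Features like these say nothing about a clue on their own, but they give a rough handle on how ordinary or exotic a candidate fill word is.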

The Dr Fill database also includes a database of Wikipedia titles, grammatical information about 154,000 words that tells Dr Fill their part of speech, and 1.2 million synonyms.

Armed with all this data and super-secret algorithms, Dr Fill solves crossword puzzles. Watson needed a rack of IBM Power 750 servers with 2,880 cores and 16TB of memory in a high-speed cluster, but Ginsberg says you can run Dr Fill on a notebook with two x86 cores and 8GB of memory with a compressed database that weighs in at 300MB.

Like many others, Ginsberg thinks that Watson cheated a bit. Humans did beat Watson in a dry run, and then IBM sped up the button pushing a bit for the real showdown.

"Watson, too, outperforms humans easily in terms of speed; its much-ballyhooed victory against human Jeopardy! competitors was probably due far more to Watson's mastery of button pushing than to its question-answering ability," writes Ginsberg in his JAIR paper. "In terms of the underlying cognitive task, Watson appears to not yet be a match for the best Jeopardy players, who are in general capable of answering virtually all of the questions without error."

Ginsberg says that the task for Dr Fill is a lot harder, because you can't decline to answer or bet to boost your earnings as you can on Jeopardy!. You have to find the right answers, period. And perhaps more significantly, as Dr Fill gets better and better, Ginsberg is having a harder time figuring out why the program is not working.

Perhaps we need to create Stallman? ®
