Humans best crossword-puzzling computer

Dr Fill is no Deep Blue or Watson – yet

Officially, humans are the only ones who can enter the American Crossword Puzzle Tournament, which was held in Brooklyn, New York over the weekend. But this time artificial intelligence expert Matt Ginsberg of On Time Systems has put his Dr Fill crossword solver to the test. As the results show, you don't have to throw out your pencil just yet.

This week, the puzzle masters at the ACPT put the hurt on Dr Fill, which, according to a report in the New York Times, did terribly on two out of the five crossword puzzles and would have ranked only 141st among the 600 people who took part. The program has done better in the past: in simulated runs of fifteen former tournaments, Dr Fill came out on top three times.

"I'll be back next year," Ginsberg told the Times.

Ginsberg's company, located in Eugene, Oregon, is a niche player in industrial optimization. Among other things, On Time Systems has created algorithms behind the Green Driver application used in the cities of Eugene and Portland to anticipate red lights and to route around them, which can cut commute times by 5 per cent.

The company has also created the algorithms for routing the US Air Force's fleet of cargo aircraft around weather while still obeying air traffic rules. The Worldwide Aeronautical Route Planner was created to run on mainframes and is being modernized to run on laptops. The optimized routing saves the Air Force something on the order of 1 to 2 per cent on its fuel bill each year, which works out to $35m to $90m.

In his spare time, in addition to actually creating crossword puzzles for the Times, Ginsberg has been working on the Dr Fill program. His paper describing Dr Fill was published in the Journal of Artificial Intelligence Research in December 2011. Based on its past performance, Ginsberg reckons that Dr Fill ranks among the top 50 puzzle solvers in the world, even though it did not have a very good day on Sunday.

The Watson Jeopardy! question-answer system built by IBM that took on the two human champs (perhaps chimps or chumps might be more appropriate, given how badly Watson spanked Ken Jennings and Brad Rutter) did terribly at first, too, but with tweaks and tuning and algorithm changes, it got better.

So there's no question that Ginsberg will be back, and it might even turn out that he gets some help. The Gray Lady is Big Blue's hometown rag, so some collaboration might be in order. The puzzles that threw Dr Fill for a loop had words spelled backwards, and some had words running diagonally as well as horizontally and vertically.

Like the Watson QA machine, Dr Fill does not actually understand what it is doing, but uses statistical probabilities to calculate its answers. Technically, what Dr Fill does is convert a crossword puzzle into a singly weighted constraint satisfaction problem. The constraints, of course, are the clues. Otherwise, you could just use a vast dictionary and try to cram every word of every length into every appropriate spot and then create a cross index of where letters match up where words cross.

Such an unconstrained problem is very tough to solve, so if you can "understand" the clues, then you can limit the possible answers and therefore the size of the dictionary you need to rifle through. The clues carry all kinds of other cues that people process. For example, a clue that ends in a question mark is usually tricky in some fashion.
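To make the idea concrete, here is a minimal sketch of a weighted constraint satisfaction search in Python. It is not Ginsberg's algorithm: the mini-grid, the candidate words, and the integer costs (lower meaning a better fit to an imagined clue) are all invented for illustration. Each slot picks one candidate, crossing squares must agree on their letter, and the search keeps the consistent fill with the lowest total cost.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]

# Hypothetical 3x3 ring-shaped grid: each slot lists the cells it covers.
SLOTS: Dict[str, List[Cell]] = {
    "1A": [(0, 0), (0, 1), (0, 2)],
    "1D": [(0, 0), (1, 0), (2, 0)],
    "2D": [(0, 2), (1, 2), (2, 2)],
    "3A": [(2, 0), (2, 1), (2, 2)],
}

# Hypothetical candidates per slot with made-up costs (lower = better clue fit).
CANDIDATES: Dict[str, Dict[str, int]] = {
    "1A": {"CAT": 2, "COT": 5, "BAT": 9},
    "1D": {"COW": 3, "BOW": 4, "CAB": 8},
    "2D": {"TOE": 1, "TEN": 3, "TAN": 6},
    "3A": {"WIN": 5, "WON": 2, "BAN": 9},
}


def solve(slots, candidates):
    """Return (fill, cost) for the cheapest fill whose crossings agree."""
    order = list(slots)
    best: Dict[str, Optional[object]] = {"cost": None, "fill": None}

    def backtrack(i: int, grid: Dict[Cell, str], fill: Dict[str, str], cost: int) -> None:
        # Costs are non-negative, so a partial fill that already costs as
        # much as the best complete fill can be abandoned.
        if best["cost"] is not None and cost >= best["cost"]:
            return
        if i == len(order):
            best["cost"], best["fill"] = cost, dict(fill)
            return
        name = order[i]
        cells = slots[name]
        # Try the cheapest candidates first.
        for word, weight in sorted(candidates[name].items(), key=lambda kv: kv[1]):
            if len(word) != len(cells):
                continue
            # Hard constraint: letters must match any already-filled crossing.
            if any(grid.get(c, ch) != ch for c, ch in zip(cells, word)):
                continue
            added = [c for c in cells if c not in grid]
            for c, ch in zip(cells, word):
                grid[c] = ch
            fill[name] = word
            backtrack(i + 1, grid, fill, cost + weight)
            del fill[name]
            for c in added:
                del grid[c]

    backtrack(0, {}, {}, 0)
    return best["fill"], best["cost"]


if __name__ == "__main__":
    print(solve(SLOTS, CANDIDATES))
```

In this toy grid the cheapest word for 2D on its own is TOE, but no candidate for 3A ends in E, so the search falls back to TEN. That is the point of weighing the candidates rather than simply taking the likeliest answer to each clue in isolation.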

Behind the Dr Fill program, Ginsberg has amassed a database of over 47,000 crossword puzzles and their solutions, with almost 1.9 million unique clues. The system also makes use of a dictionary with over six million words and a smaller dictionary with 8,542 common words.

The answers to all of these puzzles were then rated in a number of ways, including hand scoring by 100 volunteers and cross-checking against the number of Google hits, their Scrabble score if you were playing that word game, their length, and other criteria. Based on the 50,000 words, scoring was done for the remaining clues in the 47,000 puzzles.
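As a rough illustration of how ratings like these might be folded into a single number, the Python sketch below computes a few of the features mentioned above and combines them linearly. The Scrabble letter values are the standard English ones; the weights, the log scaling of hit counts, and the function names are assumptions made up for this example, not details of how Dr Fill actually scores words.

```python
import math
from typing import Dict

# Standard English Scrabble letter values.
SCRABBLE_VALUES: Dict[str, int] = {
    **dict.fromkeys("AEILNORSTU", 1),
    **dict.fromkeys("DG", 2),
    **dict.fromkeys("BCMP", 3),
    **dict.fromkeys("FHVWY", 4),
    "K": 5,
    **dict.fromkeys("JX", 8),
    **dict.fromkeys("QZ", 10),
}

# Hypothetical weights, chosen only to show the mechanics.
WEIGHTS: Dict[str, float] = {
    "volunteer": 1.0,
    "log_hits": 0.5,
    "scrabble": 0.1,
    "length": 0.05,
}


def scrabble_score(word: str) -> int:
    """Sum the Scrabble tile values of the letters in a word."""
    return sum(SCRABBLE_VALUES[ch] for ch in word.upper())


def word_features(word: str, google_hits: int, volunteer_rating: float) -> Dict[str, float]:
    """Collect the per-word features; hit counts are log-scaled (an assumption)."""
    return {
        "volunteer": volunteer_rating,
        "log_hits": math.log10(google_hits + 1),
        "scrabble": scrabble_score(word),
        "length": len(word),
    }


def word_score(features: Dict[str, float]) -> float:
    """Weighted sum of the features using the hypothetical WEIGHTS."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


if __name__ == "__main__":
    feats = word_features("QUIZ", google_hits=999, volunteer_rating=4.0)
    print(feats, word_score(feats))
```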

The Dr Fill database also includes a database of Wikipedia titles, grammatical information about 154,000 words that tells Dr Fill their part of speech, and 1.2 million synonyms.

Armed with all this data and super-secret algorithms, Dr Fill solves crossword puzzles. Watson needed a rack of IBM Power 750 servers with 2,880 cores and 16TB of memory in a high-speed cluster, but Ginsberg says you can run Dr Fill on a notebook with two x86 cores and 8GB of memory with a compressed database that weighs in at 300MB.

Like many others, Ginsberg thinks that Watson cheated a bit. Humans did beat Watson in a dry run, and then IBM sped up the button pushing a bit for the real showdown.

"Watson, too, outperforms humans easily in terms of speed; its much-ballyhooed victory against human Jeopardy! competitors was probably due far more to Watson's mastery of button pushing than to its question-answering ability," writes Ginsberg in his JAIR paper. "In terms of the underlying cognitive task, Watson appears to not yet be a match for the best Jeopardy players, who are in general capable of answering virtually all of the questions without error."

Ginsberg says that the task for Dr Fill is a lot harder, because you can't decline to answer or bet to boost your earnings as you can on Jeopardy!. You have to find the right answers, period. And perhaps more significantly, as Dr Fill gets better and better, Ginsberg is having a harder time figuring out why the program is not working.

Perhaps we need to create Stallman? ®
