AI in Medicine? It's back to the future, Dr Watson

Why IBM's cancer projects sound like Expert Systems Mk.2

Why didn't MYCIN get better?

Stanford had compared MYCIN to the work of eight experts at Stanford medical school. Out in the real world it was deemed unfit for purpose, and went unused.

MYCIN's Achilles' heel was predicted in advance by the leading AI critic (and tormentor) Hubert Dreyfus. Not all knowledge can be finessed into "rules". This difficulty was acknowledged by the "Father of Expert Systems" Ed Feigenbaum, an academic who established the Stanford Knowledge Systems Lab, and whose 1983 book The Fifth Generation: Artificial Intelligence and Japan's Computer Challenge to the World created a huge revival in AI investment – and something close to panic in American business and government. Then, as now, we worried about an "AI gap" with Asia.

Dreyfus pointed out that a real human expert used a combination of examples and intuition based on experience. Replicating this might prove elusive:

"If internship and the use of examples play an essential role in expert judgement, ie, if there is a limit to what can be understood by rules, Feigenbaum would never see it – especially in domains such as medicine, where there is a large and increasing body of factual information."

Ah, yes, said Feigenbaum. We have that in hand.

A new job category was then born – that of "knowledge engineer". A knowledge engineer was sent in when the human expert didn't realise, or couldn't express, how she or he had come to a decision. A knowledge engineer was like a horse whisperer, taming the wild and elusive knowledge floating around in the expert's head, and then bottling it, for an expert system to use.

So how did that go?

Not learning from failure

Summarising the field at the end of the 1980s, academics Ostberg, Whitaker and Amick* discovered that the reality...

[bore] little resemblance to the success stories reported in high-priced insider newsletters. As of this date, we find that there are very few operational systems in use worldwide ... The literature, including the expensive and supposedly insightful expert systems newsletters, has consistently overstated the degree of expert systems penetration into the workplace. However, even the outright failures have not dissuaded organizations from pursuing the technology; they have simply categorized previous efforts as learning experiences.

Feigenbaum himself despaired. Perhaps expert knowledge was simply "10,000 special cases", he mused. It would always evade capture.

The great horse whispering project had failed.

Shortliffe discovered that MYCIN's failure to move into full use was in large part due to the ability of experts to decide how they wished to carry out their tasks – "if the tool was not directly helpful to how they wished to work, then it would simply not be used," notes Philip Leith in his study, The Rise and Fall of Legal Expert Systems, a fascinating analogy for the failure of 1980s AI in medicine.

"What was missing was proper analysis of user needs – vital in any other area of computer implementation, but apparently viewed as unnecessary by the AI community. This produced a mismatch between what the experimenter believed the user wanted and what the user would actually use."

AI is only part of what Watson for Health actually does, and AI today is very different to the AI of the '60s, '70s and '80s: probabilistic AI takes a brute-force approach, using large data sets. In some cases, such as speech recognition, this has been fantastically successful. In others it's still quite useful. But in many other situations, it isn't.

MYCIN failed, but not only because the AI stubbornly couldn't improve. Like IBM's Watson today, it relied entirely on data input, which made it immensely time-consuming for the staff using it. The knowledge base was incomplete, and it was used in areas where it shouldn't have been.

In its weighty investigation, StatNews found another reason for Watson for Oncology not living up to its billing: culture. Hospitals in Denmark and the Netherlands had declined to sign up because it was too US-centric, "putting too much stress on American studies, and too little stress on big, international, European, and other-part-of-the-world studies", according to one source quoted.

Today we're in another Tulip Mania phase of AI: Softbank's singularity-obsessed boss has pledged that much of his $100bn tech fund will focus on AI and ML investments.

Today MYCIN will be recalled by thousands of students who studied computer science in the 1980s, as it became the canonical example of an expert system. That was the fate of many such systems: useless in the field, they were used for teaching. The next generation would be better.

Or not.

"It's not happening today, and it might not be happening in five years. And it's not going to replace doctors," one healthcare VC admitted to MIT Technology Review. ®

* Ostberg, O, Whitaker, R, and Amick, B (1988), The Automated Expert: Technical, Human, and Organizational Considerations in Expert Systems Applications, Stockholm: Via Teldok report 12.


Biting the hand that feeds IT © 1998–2017