Original URL: https://www.theregister.com/2008/07/09/anton_wylie_google_science/

Google and the End of Science

Bringing it all back Hume

By Anton Wylie

Posted in Science, 9th July 2008 10:56 GMT

WiReD magazine's editor-in-chief Chris Anderson has just seen the end for scientific theories. And it is called Google.

This remarkable revelation was triggered by Google's research director Peter Norvig. Speaking at O'Reilly's Emerging Technology Conference in March, Norvig claimed: "All models are wrong, and increasingly you can succeed without them" - a reference to Google's success at linking web pages with users. Anderson has generalized that idea to science as a whole in a piece titled The End of Theory: The Data Deluge Makes the Scientific Method Obsolete:

"This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves."

Anderson contends that the same applies to all science - its models are inherently of limited value. Either they are wrong, for example because they "caricature... a more complex underlying reality" (quantum mechanics), or we don't know how to test them experimentally (string theories about the universe), or they raise more questions than they answer (epigenetics in biology).

Yet increasing computing power, in both hardware and statistical analysis algorithms, can still bring forth useful correlations and interesting new discoveries. Anderson cites Craig Venter's DNA sequencing: having finished with sequencing individuals, "in 2005 he started sequencing the air. In the process, he discovered thousands of previously unknown species of bacteria and other life-forms."

"The opportunity is great", he adds, because "correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all."

Over at Ars Technica, John Timmer evinces shock: "I can't possibly imagine how he comes to that conclusion."

He objects: "Correlations are a way of catching a scientist's attention, but the models and mechanisms that explain them are how we make the predictions that not only advance science, but generate practical applications."

The advancement of science is not itself at issue, but the actual examples Timmer counters with do not seem to convince even Timmer himself.

The royal road would be to demonstrate that models are crucial to science, which would be grounds for thinking that they are logically necessary. Timmer takes the short cut on pragmatic grounds: models have utility, regardless of their truth or falsity. Models, so to speak, make the scientific establishment go around.

"Would Anderson be willing to help test a drug that was based on a poorly understood correlation pulled out of a datamine?" Timmer challenges, apparently unembarrassed to be seen in flagrante putting an ad hominem argument. Of course not, which is why we test on guinea pigs. (And why should Anderson be first?)

But if anything, this is a reason Anderson could use. With sufficiently good correlations, it might finally be possible to spare guinea pigs, chimpanzees, and rats their trial by laboratory testing.

The irony here is that eudemonic theories of ethics - which say the good and right thing to do is to create happiness, whether hedonism (happiness for me) or utilitarianism (happiness for all of us) - are philosophically shakier than statistical inference. Anderson's contention is that technology is changing; and in comparison with the continued daily rising of the sun, the outlook for models and mechanisms on inductive grounds seems less sunny than that of tomorrow's dawn.

A closer shave with history

From an initial pass over the arguments for and against Anderson's "end of theory" claim, it seems that several theories about the justification of science might also have to be added to his hit-list. This is what makes Anderson's argument interesting - an analogue perhaps of "the end of history" claim by Francis Fukuyama in his eponymous book.

How could it happen that Occam's Razor, the (ahem) eponymous principle that explanations should not rely on unnecessary entities, has grown so big that it now threatens to sever the hand that once so securely held it - the hand of scientific practice?

Before addressing that, we should be aware of a slippery complexity - semantics. It is not only Google that "washes" meanings, as The Register's Andrew Orlowski noticed.

The term "model" at one time connoted a physical representation, in scientific context and ordinary contexts, for example, of an atom. It seems now to be used in science to cover a wider range of things: not only the virtual representation of physical models (computer modelling and simulation), but any explanatory matrix where two concepts are mediated by other concepts. Pushed this far, it can be difficult to draw the line between a model and an explanation. And between hypothesis, conjecture, theory, and mechanism. Hold the thought as you read on.

History, too, contains warnings. The Copernican Revolution saw two competing models fighting it out - the old Earth-centred view of the universe, and a sun-centred solar system. One difference was that the traditional model could not account for the phases of Venus - a fact which anyone with a couple of lenses from one of Galileo's new-fangled telescopes could check for themselves. But science currently has some "facts" where no telescope whatever can bring the entities they are about into view (black holes, dark matter).

But don't draw any conclusions just yet from this, either.

Perhaps Anderson has all this in mind in predicting a common fate for "theory" in the dustbin of history. It does though make it harder to see if Copernicus really was Google's ancestor.

What is easier to see is that Anderson's thesis has historical pedigree.

Back home with Hume

The justification of science often looks to the Scottish enlightenment philosopher David Hume. But it is also with Hume that modern scepticism about causation starts.

Hume was wondering if ethics could be like science: something accessible to ordinary folk to reason and decide about, not reliant on the diktat of Authority in the form of religion or other esoteric specialists, with their resort to inaccessible realms. These targets are in plain view in the famous "consign it then to the flames" conclusion of his 1748 Enquiry Concerning Human Understanding. But Hume noticed a problem.

Hume starts from the position that valid ideas about the world can be traced back either to an origin in perception, or to derivation from other valid ideas. He then finds this is not possible for a few ideas - causation is one, substance is another - which had been traditional stomping grounds of metaphysics. No amount of perceiving, says Hume, seems to show us causation itself; what we see are two events that regularly occur together.
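
A toy numerical illustration (mine, not Hume's, and with purely synthetic frequencies) makes the point vivid: roughly the same pattern of regular co-occurrence shows up whether one event directly brings about the other or both spring from a hidden common cause. The data hand us the conjunction; they do not hand us the mechanism.

    import numpy as np

    # Hume's "constant conjunction": observation gives us how often B accompanies A,
    # not the causal story behind it. Two simulated worlds, near-identical statistics.
    rng = np.random.default_rng(1)
    n = 100_000

    # World 1: A directly causes B.
    a1 = rng.random(n) < 0.3
    b1 = np.where(a1, rng.random(n) < 0.9, rng.random(n) < 0.04)

    # World 2: a hidden common cause C produces both A and B; A does nothing to B.
    c = rng.random(n) < 0.3
    a2 = np.where(c, rng.random(n) < 0.95, rng.random(n) < 0.02)
    b2 = np.where(c, rng.random(n) < 0.95, rng.random(n) < 0.02)

    for label, a, b in [("causal world", a1, b1), ("common-cause world", a2, b2)]:
        print(label, " P(B|A) =", round(b[a].mean(), 2),
              " P(B|not A) =", round(b[~a].mean(), 2))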

Now this does not, as commonly supposed, make Hume sceptical about causation. His argument against a God who is the first-ever cause of everything that came after depends on it. Nor does it make him sceptical about knowledge in general. Some knowledge, says Hume, we can demonstrate to be true. For the rest, if we use a valid form of inference, then as a matter of logic we are assured that the conclusion is true provided the premises are - and the premises are what we check empirically.

What Hume's argument really shows is that causation cannot be justified starting from where he starts - namely the premise (from Aristotle) that logic and experience together account for all the sources of our knowledge.

What in Hume is a philosophical problem about the justification of causation has been confused with the problem of the justification of knowledge.

For Empiricists, who support the general thesis that all knowledge is from experience, the issue devolved into determining the criteria and constraints for the scientific application of causation. Namely, how much regularity of occurrence is necessary to treat two things as causally connected, and what other conditions should be attached.

Enter a rabbit, wearing a waistcoat

Historically then, around the start of the 20th century, there is intellectual uncertainty in some minds about causation. Along with prevailing tendencies in physics, it is enough to cause scientists like Poincaré and Mach to abandon the correspondence theory of truth and float an alternative, conventionalist theory of truth. Let's not worry, it goes, precisely how our scientific conclusions relate to the world - let's agree that truth is whatever we can agree on, so that business can go forward. (The demolition of this theory is a matter of taste - one for connoisseurs is by V.I. Ulyanov, aka Lenin.)

Hark, though, and you can hear the ghost of Hume here. Anderson's "end of theory" has started growing roots.

The next significant development occurs when Wittgenstein encounters Russell busy deriving mathematics from logic. (A not inappropriate activity, you might think, for an Empiricist philosopher.) Wittgenstein realizes that Russell's symbolic calculi might be useful for describing formally not only mathematics but what we know about the natural world.

Note how this extends the notion of modelling to the domain of abstractions and generalities. A statement like "All swans are white" cannot be modelled by a white swan, even if a white swan can be modelled. Fixing this conundrum became an ongoing project for logicians, and the argument about whether the fixes are good ones is still a live one.

So now the logical germ of Anderson's thesis is sprouting leaves and branches. In Wittgenstein's hands, modelling turned into logical modelling, clouding the clear difference between things and ideas. But the rising sap is about to accelerate.

Eating up the tree of empiricism

Noticing that Wittgenstein had in effect slid the thin edge of a wedge between descriptions of the world and other sorts of statements, the logical positivist philosophers (Schlick, Carnap, Ayer, et al) banged it in some more. They arrived at their "verification principle": statements are meaningful only if they are verifiable - or if, like logic and mathematics, they are reducible to tautologies, which need no empirical validation.

It was soon asked whether this verification principle was itself true. It was clearly far from tautological. It also looked as though verifying it would take scientists all their time, for ever. So the principle was hedged to read "verifiable in principle" - an interesting twist.

Much merriment in scientific circles was then had with this new, easy-to-handle intellectual toy. Waved contre-temps, it could patently banish metaphysical entities at a stroke. Among the metaphysical entities vanquished, though, was causation, on the basis that probability theory offered a philosophically viable interpretation of it. Thus set free from its referent, causation became a loose synonym for "correlation", then for "statistical link", and finally for "link".
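
The hazard in that slide is easily demonstrated. The sketch below uses synthetic series with no real dataset behind them: two quantities that merely share an upward drift over time show a near-perfect "link", which weakens drastically once the common trend is differenced away.

    import numpy as np

    # Spurious correlation: two causally unrelated series that both trend upward.
    rng = np.random.default_rng(2)
    years = np.arange(1950, 2010)
    series_a = 100 + 3.0 * (years - years[0]) + rng.normal(scale=2.0, size=years.size)
    series_b = 40 + 1.5 * (years - years[0]) + rng.normal(scale=1.0, size=years.size)

    r = np.corrcoef(series_a, series_b)[0, 1]
    print(f"raw correlation of two unrelated series: r = {r:.2f}")       # close to 1

    # Remove the shared trend by differencing; the apparent "link" is far weaker.
    r_diff = np.corrcoef(np.diff(series_a), np.diff(series_b))[0, 1]
    print(f"correlation after differencing:          r = {r_diff:.2f}")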

In fact, the logical positivists' magic wand of verification turned out to have power on a scale that Harry Potter might only dream about and J.K. Rowling clearly never did.

Much of what people say to each other became, at a stroke, intellectually dubious. Value judgements (eg, "Tracey Emin makes great art") and statements with "ought" and "should" in them (eg, Timmer's conclusion about what science should do) seem to be unverifiable. One philosophical account interprets them as civilized ways of emoting and persuading. Bye-bye ethics?

The concept of the mind, and by extension that of a person, was also affected, with far reaching implications.

In psychology, Behaviourism was one favoured development. Its ontology does not include people with minds, only biological entities with patterns of behaviour. The rise and rise of neuroscience is correlated with this. Another affected domain is politics: the New Labour government in the UK boasts almost daily that it is in the business of "modifying behaviour".

Even when this type of thinking is felt to be repugnant, the tendency remains to treat people as parametrically determined objects. The phrase "hearts and minds" admits that people feel and think, but implies that what matters is to ascertain which feelings and thoughts affect them most strongly. Modern politics consists to a large extent of this type of appeal - and the part conducted through the media, almost exclusively so.

(It also suggests governments will stagger on until their last gasp, on the assumption that the appropriate corrections to "hearts and minds", based on focus groups, are all that is necessary to revive them for power for ever).

Another effect of logical positivism has been the way its verification principle changed the shape of explanation in regard to literature, film, and the rest during the 20th century. Modern literary criticism no longer deals directly with intentions, values, and the real concerns of people, but treats them indirectly, via an interpretative "theory" such as psychology or Marxism, or via a reconstruction (interpreted histories of race or gender). In the post-logical-positivist world, semiotics is to the humanities only what the measuring instrument is to the scientist - a tool to build theories with.

One curiosity with the historical record remains. And it's an important one, because without it we can't fully understand the implications of this discussion. Which is: who were the logical positivists waving their magic wand at?

The debate in Hume's time between Empiricists and Rationalists had run out of takers over a century earlier. By the early 20th century, the positivists' purported target, the metaphysician, hardly existed as more than a memory. Institutionalized religion was of course no threat. An answer to this is not entailed by the history of ideas I've described; it would require delving into the history of the actual people concerned, which is beyond the scope of this article.

The absence of metaphysicians, and the subsequent 20th century repurposing of philosophy so that it could not easily steer in that direction either, had a profound consequence. When scientists ran out of gas in criticizing each other's theories, there was no-one else with the interest or the expertise to row the boat. What happened next is another fascinating story, particularly in the (largely unconscious) handling by physics of the metaphysical concept of substance.

Between a metaphysical horned entity and an undulating patch of blue

The historically influential ideas in science I've described are all consonant with Empiricism. This notion has been a philosophical constant, even as different theories and justifications of science and scientific method arose, revolved around it, and fell away with new turns of history. The connection between science and empiricism has become identification - empiricism is spoken of as the methodology of science.

But from Hume's quite specific (and contingent) difficulty with causation, through to logical positivism's broad (and necessary?) demarcation of science, it's possible to separate the preoccupations of philosophers and those of scientists. The former have managed, despite the success of the latter, to put into doubt one or other part of the narrative (for want of a better word) of what constitutes science and its method. The difficulties have been negotiated rather than resolved. Which leaves the state of play where?

At a time when reinterpreting concepts like "causation" is seen as methodologically viable, then abolishing referents in favour of interpretations obtains legitimacy as a means of justification. But it also sets the stage for interpreting observational evidence, the basic raw material of science, with the focus on consistency with extant interpretations. The logic of some contemporary scientific explanations would not particularly disconcert ancient Greeks, who explained events in terms of the action of divers gods on obdurate matter.

If being seen to dabble with occult agencies is unacceptable to scientists, it may be thought that Anderson's option is viable - is it such a big deal for a scientist to hack up a bit of code? An ontology-free science would still be able to distinguish itself from myths of sundry sorts, while demonstrating superior efficacy in practice.
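
For the flavour of what such code might be, here is a toy sketch of my own, purely illustrative: a k-nearest-neighbour predictor that answers queries by averaging the stored examples most similar to them, with no model - no ontology - of the underlying mechanism at all.

    import numpy as np

    def knn_predict(train_x, train_y, query_x, k=5):
        """Predict each query as the mean target of its k nearest training points."""
        preds = []
        for q in query_x:
            dists = np.linalg.norm(train_x - q, axis=1)   # similarity = distance in data space
            nearest = np.argsort(dists)[:k]
            preds.append(train_y[nearest].mean())
        return np.array(preds)

    # The "mechanism" below (a sine wave) is unknown to the predictor; it only
    # ever sees recorded examples, yet its answers track the data well.
    rng = np.random.default_rng(3)
    x = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(x[:, 0]) + rng.normal(scale=0.1, size=500)
    queries = np.array([[-2.0], [0.0], [1.5]])
    print(knn_predict(x, y, queries))   # roughly sin(-2), sin(0), sin(1.5)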

Anderson's thesis of the "end of theory" is the logical consequence of letting philosophical Empiricism set the agenda for the "reflective practice" (to use a current buzz-word) of scientists. In effect it is not very different from the conclusion reached by Bishop Berkeley, Hume's philosophical predecessor, in regard to knowledge - things exist only as long as I am perceiving them.

Anderson has recognized that when instruments take the measurements for science, human perception is no longer relevant within the Berkeleian epistemology. Cue Occam's razor. The novelty in Anderson's thesis is the assertion that technology has advanced to the point where the empiricist theory of knowledge can be executed as a practical program. Concerns about job security are the least of it. But then Anderson is not doing science, but metaphysics.

However, Anderson's identification of Google, everyone's favourite internet research tool, as his engine of choice for the destruction of science is not the only way forward. The alternative suggested here is to revisit the premises of the historical arguments - and reject empiricism as a metaphysical basis for science. Its disjunction of logic and experience as accounting fully for the sources of knowledge remains an open invitation to all-comers. Hume effectively knew it, and there is no stigma attached to citing Hume, whose rigour and subtlety as a philosopher continue to inspire even today.

So the irony is that science, having thrown in its tribal lot with the philosophical school of empiricism over three centuries ago, and seemingly having derived sustenance from it, now has to kill it to go forward. The alternative for scientific theorising, if Anderson is correct, is to be killed by it - by it and Google.

Never has hard thinking been more required.®