Original URL: http://www.theregister.co.uk/2007/01/02/wtf_is_information_part1/

A Brief History of Information

The word that means everything - and nothing

By Ted Byfield

Posted in Bootnotes, 2nd January 2007 00:02 GMT

Part 1 Kierkegaard said that irony was "as baffling as depicting an elf wearing a hat that makes him invisible." He's lucky he never encountered information.

The word seems to stand for everything, and nothing. "Information" describes everything from a precise mathematical property of communication systems, to discrete statements of fact or opinion (for example, the time a film begins or someone's perspective on a situation), to a staple of marketing rhetoric, to a world-historical phenomenon on the order of agriculture or industrialization.

The frequency and disparity of its use, by specialists and lay people alike, to describe countless general and specific aspects of life make it difficult to analyze; no single academic discipline or method can offer an adequate explanation of the phenomenon.

A typical approach to a problem of this kind is to start with the word as such: to gather examples of its use, codify their meanings, and arrange them into a taxonomy. This has been done with varying degrees of success; for example, one prominent American-English dictionary defines the word in slightly less than 200 words.

These efforts are admirable; but if we grant any credence at all to the widely made claim that we live in an "information society" or, even more grandly, in an "information age," then surely information must be more than the sum of the word's multiple meanings. Apparently, it - the word or, more properly, the category - is sui generis, and in a particularly compelling way. What qualities would make it so?

From meaning to noise

The word itself dates in English to the late fourteenth century, and almost from the beginning showed ambiguities very similar to current usages. The Oxford English Dictionary cites early uses - in, among other sources, Chaucer's Canterbury Tales - as evidence for defining it variously as

[t]he action of informing ... communication of instructive knowledge (I.1.a)

communication of the knowledge or ... 'news' of some fact or occurrence (I.2)


[a]n item of training; an instruction (I.1.b)

That is to say, generally, an action in the first cases, and a thing in the last case.

Even the obscurity of whether it is singular or plural, which is still prevalent, seems to date to the early sixteenth century ("an item of information or intelligence," curiously "with _an_ and _pl_[_ural_]" [I.3.b]).

As the word came into wider use in the centuries leading up to 1900, it took on a variety of additional meanings. Of these, the most striking trend was its increasingly legalistic aspect. This included informal usages (for example, related to or derived from "informing" on someone) as well as narrow technical descriptions of charges lodged "in order to the institution of criminal proceedings without formal indictment" [sic].

This disparity - in one aspect referring to particular allegations of a more or less precise factual nature and, in other aspects, to a formal description of a class or type of assertion - is still central to current usage of the word; so are connotations that information relates to operations of the state.

Yet it was in the twentieth century that the word was given decisively different meanings. The first of these modern usages appears in the work of the British statistician and geneticist R. A. Fisher.

In his 1925 article Theory of Statistical Estimation, published in Proceedings of the Cambridge Philosophical Society, he described "the amount of information in a single observation" in the context of statistical analysis. In doing so, he appears to have introduced two crucial aspects to "information": first, that it is abstract yet measurable; and second, that it is an aspect or byproduct of an event or process.

"Fisher information" has had ramifications across the physical sciences, but its most famous elaboration has been in the applied context of electronic communications. These and related definitions differ from Fisher's work, but they remain much closer to his conception than to any earlier meanings.

Three years after Fisher's paper appeared, the American-born electronics researcher Ralph V. L. Hartley - who had studied at Oxford University in almost exactly the same years that Fisher studied at Cambridge (1909-1913) before returning to the United States - published a seminal article in the Bell System Technical Journal. In it, he built upon the work of the Swedish-American engineer Harry Nyquist (who was working mainly at AT&T and Bell Laboratories), specifically on Nyquist's 1924 paper Certain Factors Affecting Telegraph Speed, which sought in part to quantify what Nyquist called "intelligence" in the context of a communication system's limiting factors.

However, Hartley's 1928 article, titled Transmission of Information, seems to have fused aspects of Fisher's conception of information with Nyquist's technical context - albeit without citing either of them or any other source. Hartley specifically proposed to "set up a quantitative measure whereby the capacities of various systems to transmit information may be compared." He also added another crucial aspect by explicitly distinguishing between "physical as contrasted with psychological considerations" - by the latter he meant, more or less, "meaning." According to Hartley, information is something that can be transmitted but has no specific meaning.
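For readers who want to see what such a "quantitative measure" can look like, here is a minimal sketch of the formula usually attributed to Hartley, H = n log s, where s is the number of distinct symbols available and n is the number of selections made. The article itself does not give the formula; the function name, the choice of base-2 logarithms, and the example figures are assumptions made purely for illustration.

```python
import math


def hartley_information(num_symbols: int, num_selections: int) -> float:
    """Hartley-style measure H = n * log2(s), expressed in bits.

    num_symbols    -- s, how many distinct symbols the system can send
    num_selections -- n, how many symbols are sent in a row
    Both parameter names are illustrative, not Hartley's own notation.
    """
    if num_symbols < 1 or num_selections < 0:
        raise ValueError("need at least one symbol and a non-negative count")
    return num_selections * math.log2(num_symbols)


if __name__ == "__main__":
    # Eight selections from a two-symbol system (say, dot and dash).
    print(hartley_information(2, 8))
    # Eight selections from a 26-letter alphabet carry more by this measure.
    print(hartley_information(26, 8))
```

Note that nothing in the calculation asks what the symbols stand for: the measure depends only on how many there are and how many are sent, which is the "physical" side of Hartley's distinction.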

It was on this basis that, decades later, the American mathematician and geneticist-turned-electrical engineer Claude Shannon made the most famous of all modern contributions to the development of the idea of information.

Shannon's PhD dissertation An Algebra for Theoretical Genetics - an application of his "queer algebra," in the words of Vannevar Bush - was written at MIT in 1940 under the direction of Barbara Burks, an employee of the Eugenics Record Office at Cold Spring Harbor Laboratory. Shannon was then recruited by Bell Labs to research "fire-control systems" - automated weapon targeting and activation - "data smoothing" and cryptography during World War II.

At no point in his works did Shannon ever define "information"; instead, he offered a model of how to quantitatively measure the reduction of uncertainty in transmitting a communication, and used "information" to describe that measure.

Double negatives

Shannon's two-part article in 1948, A Mathematical Theory of Communication, and its subsequent reprinting with a popularizing explanation in his and Warren Weaver's book The Mathematical Theory of Communication (Urbana: University of Illinois Press, 1949), are widely heralded as the founding moment of what has since come to be known as "information theory," a subdiscipline of applied mathematics dealing with the theory and practice of quantifying data.

Shannon's construction, like those of Nyquist and Hartley, took as its context the problem presented by electronic communications, which by definition are "noisy", meaning that a transmission does not consist purely of intentional signals. The problem they pose is how to distinguish the intended signal from the inevitable artifacts of the systems that convey it - or, in Shannon's words, how to "reproduc[e] at one point either exactly or approximately a message selected at another point."

Shannon was especially clear that he didn't mean meaning:

Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.

In The Mathematical Theory of Communication, he and Weaver explained that "information is a measure of one's freedom of choice when one selects a message" from a universe of possible solutions. In everyday usage, "freedom" and "choice" are usually seen as desirable: the more, the better. However, in trying to decipher a message they have a different consequence: the more freedom of choice one has, the more ways one can render the message - and the less sure one can be that a particular reproduction is accurate.

Put simply, the more freedom one has, the less one "knows."
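The point can be made concrete with a minimal sketch of the entropy formula associated with Shannon's 1948 paper, H = sum of p * log2(1/p) over the possible messages. The article does not state the formula, and the function name and the example probabilities below are assumptions chosen only for illustration.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a set of message probabilities.

    Computes sum(p * log2(1/p)); messages with probability 0 add nothing.
    """
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


if __name__ == "__main__":
    # Only one possible message: no freedom of choice, no uncertainty.
    print(shannon_entropy([1.0]))
    # Two equally likely messages: one bit of uncertainty.
    print(shannon_entropy([0.5, 0.5]))
    # Eight equally likely messages: more freedom, more uncertainty.
    print(shannon_entropy([0.125] * 8))
    # Two messages, one far likelier than the other: less than one bit.
    print(shannon_entropy([0.9, 0.1]))
```

The measure rises as the number of equally plausible messages grows and falls toward zero as one message becomes near-certain, which is the sense in which more "freedom of choice" means less assurance about any particular reading.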

It's small wonder that the author of such a theory would see efforts to apply it in other fields as "suspect".

The Fog of the New Machine

Of course, if Shannon sought to limit the application of his "information" to specific technical contexts - for example, by warning in a popularizing 1949 book that "[t]he word information, in this theory, is used in a special sense that must not be confused with its ordinary usage" - he failed miserably. The applications of his work in computational and communication systems, ranging from intimate read-write operations in storage devices to the principles guiding the design of sprawling networks, have had pervasive and catastrophic effects since their publication.

As this account suggests - and as one should expect - Shannon's work was just one result of many interwoven conceptual and practical threads involving countless researchers and practitioners working across many fields and disciplines.

In the two decades that separated Hartley's 1928 article and Shannon's publications, myriad advances had already had immense practical impact - for example, on the conduct and outcome of World War II, in fields as diverse as telegraphy, radiotelegraphy, electromechanical systems automation and synchronization, and cryptography. More generally, an important aspect and a notable result of that war were the unparalleled advances in systems integration across government, industry, and academia, from basic research through procurement, logistics, and application.

In the sixty years since, those advances have spawned many more advances - quite enough reason for "nonspecialists" to take a strong interest in information, however it is defined. Their interests, and the "popular" descriptions that result, surely carry at least as much weight as Shannon's mathematical prescription.

In the next part we'll look at how more recent, popular rhetoric about "information" lines up with its ancient origins. And discover how today's philosophers make for strange bedfellows with the sloppy purveyors of post-modernist marketing.®

Ted Byfield is Associate Chair of the Communication Design and Technology Department at Parsons the New School for Design in New York City; he co-moderates the Nettime-l mailing list. This article is based on an essay in Matthew Fuller (ed.), Software Studies: A Lexicon (Cambridge, Mass: MIT Press, forthcoming 2007).