‘Scalable reading’ is my term for Digitally Assisted Text Analysis or, if you like acronyms, DATA. I owe the term partly to Franco Moretti, who some years ago coined the term ‘distant reading’ as a way of challenging the hallowed practice of ‘close reading’ and drawing attention to the distinctive affordances of texts in digital form. I also owe a debt to the engaging How to Talk About Books You Haven’t Read, in which Pierre Bayard tells you about the very ancient art of gathering just enough knowledge about a book to say something clever about it, even if you have not read all or even most of it. Because there have always been more books than you could or wanted to read, this has been a very important skill. For a while I talked about The Importance of Not-Reading, and I remembered a poem by Christian Morgenstern, an early-twentieth-century German poet famous for his nonsense poems, many of which bear witness to his philosophical and mystical leanings:

Die Brille / The Spectacles
Korf liest gerne schnell und viel;
darum widert ihn das Spiel
all des zwölfmal unerbetnen
Ausgewalzten, Breitgetretnen.
Korf reads avidly and fast.
Therefore he detests the vast
bombast of the repetitious,
twelvefold needless, injudicious.
Meistens ist in sechs bis acht
Wörtern völlig abgemacht,
und in ebensoviel Sätzen
läßt sich Bandwurmweisheit schwätzen.
Most affairs are settled straight
just in seven words or eight;
in as many tapeworm phrases
one can prattle on like blazes.
Es erfindet drum sein Geist
etwas, was ihn dem entreißt:
Brillen, deren Energieen
ihm den Text – zusammenziehen!
Hence he lets his mind invent
a corrective instrument:
Spectacles whose focal strength
shortens texts of any length.
Beispielsweise dies Gedicht
läse, so bebrillt, man – nicht!
Dreiunddreißig seinesgleichen
gäben erst – Ein – – Fragezeichen!!
Thus, a poem such as this,
so beglassed one would just — miss.
Thirty-three of them will spark
nothing but a question mark.

The poem anticipates modern technologies of  text summarization and shrewdly points to what might get lost in such an endeavour.  But in the end neither ‘distant reading’ nor ‘not-reading’ seemed to express adequately the powers that new technologies bring to the old business of reading.  And both terms implicitly set the ‘digital’ into an unwelcome opposition to some other — a trend explicitly supported by the term  “Digital Humanities” or its short form DH, which puts phenomena into the ghetto of an acronym that makes its practitioners feel good about themselves but allows the rest of the humanities to ignore them.

The charms of Google Earth led me to the term Scalable Reading as a happy synthesis of ‘close’ and ‘distant’ reading. With Google Earth you can zoom in and out of things and discover that different properties of phenomena are revealed by looking at them from different distances. If you stand at the corner of Halsted and Division you cannot see the North-South oblong of Chicago’s street grid or the fact that the city is located at the southern tip of a very large lake. Both of these important facts about Chicago become visible as you zoom out. Tara McPherson has drawn my attention to Powers of Ten, a 1968 documentary by Charles and Ray Eames. They are better known today for the Eames chair (1956), but for generations of middle school kids in the seventies and eighties Powers of Ten offered a first glimpse into the mysteries of the universe when contemplated “at scale” or rather at different scales.

Scalable reading, then, does not promise the transcendence of reading, close or otherwise, by bigger or better things. Rather it draws attention to the fact that texts in digital form enable new and powerful ways of shuttling between ‘text’ and ‘context.’ Who could complain about tools that let you rapidly expand or contract your angle of vision?

The splash page for this site quotes lines from Milton’s Paradise Lost, where Satan’s shield is compared to the moon as seen by Galileo through a telescope. These very famous lines about the thrills of discovery at the dawn of modern science are not accidentally associated with Satan. Anxiety about technological change is an old thing, and nowhere are such changes more pronounced than in language technologies (think of Plato and writing). In the early sixteenth century the Abbot of Sponheim fulminated in print against the evils of print, and Filippo di Strata observed that “the pen is a virgin, the printing press a whore.” “L’ordinateur est un instrument de déshumanisation de la recherche et de la désincarnation du vivant,” a dissertation supervisor is alleged to have written only a few years ago to a French doctoral student (“The computer is an instrument of the dehumanization of research and of the disincarnation of the living”).

He could have cited Goethe’s Mephistopheles who gave this advice to a freshman:

Wer will was Lebendigs erkennen und beschreiben,
Sucht erst den Geist heraus zu treiben,
Dann hat er die Teile in seiner Hand,
Fehlt leider! nur das geistige Band.
Whoever wants to know and describe something living
Will first seek to expel its spirit,
Then he will have the parts in his hand,
Alas! the spiritual link will be missing.


Such rhetoric invariably transforms Paul’s “the letter killeth, but the spirit giveth life” (2 Cor 3:6) into some version of “once the spirit gave life, but now the letter killeth.” How does this square with the fact that in this enterprise of replacing a living body with the dead inventory of its parts the original sin was committed by the medieval monks who invented what we still call a ‘concordance’?  Faced with the task of understanding the complexity and infinite harmony of the Word of God but keenly aware of the limitations of their memory, they hit on the divide-and-conquer strategy of turning the Bible into an alphabetically sorted list of its words and their locations. A very mechanical procedure, but a great help in going from a difficult word here to other occurrences of it there, pondering the connections that had eluded fallible memory, and constructing from them the “concordance” of God’s words with each other and with charity, the axiom of Augustinian hermeneutics. The monks “killed” the text by dividing it into its letters, but this was part of a strategy to bring back rather than drive out the “spirit.” Not all the monks succeeded all the time. But abusus non tollit usum.
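For readers who like to see the mechanics, the monks’ divide-and-conquer procedure can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `build_concordance` is hypothetical, words are reduced to lower-case runs of letters and apostrophes, and a “location” is just a line number paired with a word position. The sample lines are from Macbeth.

```python
import re
from collections import defaultdict


def build_concordance(lines):
    """Return an alphabetically sorted mapping: word -> list of (line, position)."""
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        # Crude tokenization (an assumption of this sketch): lower-case
        # runs of letters and apostrophes count as words.
        words = re.findall(r"[a-z']+", line.lower())
        for position, word in enumerate(words, start=1):
            index[word].append((line_no, position))
    # Sorting the keys gives the alphabetical order of a printed concordance.
    return dict(sorted(index.items()))


text = [
    "Canst thou not minister to a mind diseased,",
    "Pluck from the memory a rooted sorrow,",
    "Raze out the written troubles of the brain",
    "Therein the patient",
    "Must minister to himself.",
]

concordance = build_concordance(text)

# Going from a word "here" to its other occurrences "there":
for line_no, position in concordance["minister"]:
    print(f"minister: line {line_no}, word {position}: {text[line_no - 1]}")
```

The lookup is entirely mechanical. It puts the two occurrences of “minister” side by side, but deciding what connects them is left to the reader.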

It is the same with the large digital corpora that in principle support scalable reading (although the practice still lags far behind the possible). Strip a fancy text retrieval system to its basic operations, and you find a concordance on steroids, a complex machine for transforming sequential texts into inventories of their parts that can be retrieved and manipulated very fast. But when it comes to finding “das geistige Band” or, in modern parlance, “connecting the dots” modern readers are pretty much in the same situation as medieval monks, even (or especially) when the machine uses algorithms to construct statistically based patterns. No machine can tell you whether  a pattern “makes sense.”  Call this the “last mile problem” of human understanding.

Or remember the anecdote about Dr. Johnson on his deathbed.  He quoted Macbeth to his doctor:

Canst thou not minister to a mind diseased,
Pluck from the memory a rooted sorrow,
Raze out the written troubles of the brain
And with some sweet oblivious antidote
Cleanse the stuffed bosom of that perilous stuff
Which weighs upon the heart?

Dr. Johnson was much comforted when the doctor responded with the words of Macbeth’s doctor:

     Therein the patient
Must minister to himself.