Thicker lines indicate more content-sharing between 19th-century newspapers. Image: Ryan Cordell / Infectious Texts project
The story had everything — exotic locale, breathtaking engineering, Napoleon Bonaparte. No wonder the account of a lamplit flat-bottom boat journey through the Paris sewer went viral after it was published — on May 23, 1860.
At least 15 American newspapers reprinted it, exposing tens of thousands of readers to the dank wonders of the French city’s “splendid system of sewerage.”
Twitter is faster and HuffPo more sophisticated, but the parasitic dynamics of networked media were fully functional in the 19th century. For proof, look no further than the Infectious Texts project, a collaboration of humanities scholars and computer scientists.
The project expects to launch by the end of the month. When it does, researchers and the public will be able to comb through widely reprinted texts identified by mining 41,829 issues of 132 newspapers from the Library of Congress. While this first stage focuses on texts from before the Civil War, the project will eventually cover the later 19th century and expand to magazines and other publications, says Ryan Cordell, an assistant professor of English at Northeastern University and a leader of the project.
Some of the stories were printed in 50 or more newspapers, each with thousands to tens of thousands of subscribers. The most popular of them most likely were read by hundreds of thousands of people, Cordell says. Most have been completely forgotten. “Almost none of those are texts that scholars have studied, or even knew existed,” he said.
Yellow dots represent hotspots of recycled content. The shading indicates the number of publications in a region recorded by the 1840 census. Image: Ryan Cordell / Infectious Texts project
The tech may have been less sophisticated, but some barriers to virality were low in the 1800s. Before modern copyright laws there were no legal or even cultural barriers to borrowing content, Cordell says. Newspapers borrowed freely. Large papers often had an “exchange editor” whose job it was to read through other papers and clip out interesting pieces. “They were sort of like BuzzFeed employees,” Cordell said.
Clips got sorted into drawers according to length; when the paper needed, say, a 3-inch piece to fill a gap, an editor would pluck out a story of the appropriate length and publish it, often verbatim.
Fast forward a century and a half and many of these newspapers have been scanned and digitized. Northeastern computer scientist David Smith developed an algorithm that mines this vast trove of text for reprinted items by hunting for clusters of five words that appear in the same sequence in multiple publications (Google uses a similar concept for its Ngram viewer).
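To make the five-word idea concrete, here is a minimal Python sketch of that kind of matching. It is not Smith's actual algorithm, which the article does not detail; the function names and the sample sentences are invented for illustration. It indexes every run of five consecutive words and counts how many such runs each pair of papers shares.

```python
import re
from collections import defaultdict


def five_grams(text, n=5):
    """Return the set of n-word sequences in a text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(papers, n=5):
    """Map each n-gram to the papers containing it, keeping only n-grams found in 2+ papers."""
    index = defaultdict(set)
    for name, text in papers.items():
        for gram in five_grams(text, n):
            index[gram].add(name)
    return {gram: names for gram, names in index.items() if len(names) > 1}


def candidate_pairs(papers, n=5):
    """Count shared n-grams for every pair of papers that has at least one in common."""
    counts = defaultdict(int)
    for names in shared_ngrams(papers, n).values():
        ordered = sorted(names)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                counts[(ordered[i], ordered[j])] += 1
    return dict(counts)


# Hypothetical snippets, made up for this example (not real newspaper text).
papers = {
    "A": "The splendid system of sewerage in Paris is a wonder",
    "B": "Visitors praised the splendid system of sewerage in Paris yesterday",
    "C": "Nothing in common here at all today folks",
}
pairs = candidate_pairs(papers)
```

Pairs with many shared sequences are candidates for a reprint. A real pipeline would also have to cope with scanning errors in the digitized text, which exact matching like this does not handle.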
The project is sponsored by the NULab for Texts, Maps, and Networks at Northeastern and the Office of Digital Humanities at the National Endowment for the Humanities. Cordell says the main goal is to build a resource for other scholars, but he’s already capitalizing on it for his own research, using modern mapping and network analysis tools to explore how things went viral back then.
Counting page views from two centuries ago is anything but an exact science. Cordell has used Census records to estimate how many people were living within a certain distance of where a particular piece was published, then combined that with newspaper circulation data to estimate what fraction of the population would have seen it. For the most infectious texts, he says, it was a quarter to a third.
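The arithmetic behind an estimate like this can be sketched in a few lines of Python. This is only one plausible reading of the approach, not Cordell's actual method: the radius, the `readers_per_copy` multiplier, the function names and all the numbers are assumptions made up for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    radius = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def population_within(origin, places, radius_km):
    """Sum the population of all (lat, lon, population) places within radius_km of origin."""
    return sum(
        pop
        for lat, lon, pop in places
        if haversine_km(origin[0], origin[1], lat, lon) <= radius_km
    )


def estimated_reach(circulations, nearby_population, readers_per_copy=1.0):
    """Estimate the fraction of nearby people who saw a text, capped at 1.0.

    readers_per_copy is a hypothetical knob for copies that were shared or read aloud.
    """
    if nearby_population <= 0:
        raise ValueError("nearby_population must be positive")
    return min(1.0, sum(circulations) * readers_per_copy / nearby_population)


# Invented numbers: three census places and two papers that ran the same piece.
places = [(0.0, 0.0, 1000), (0.0, 0.5, 2000), (0.0, 10.0, 5000)]
nearby = population_within((0.0, 0.0), places, radius_km=100)
reach = estimated_reach([500, 250], nearby)
```

With these made-up figures, 750 copies reach a nearby population of 3,000, or a quarter of it. The estimate is only as good as the circulation data and the choice of radius.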
He’s also interested in mapping how the growth of the transcontinental railroad — and later the telegraph and wire services — changed the way information moved across the country. The animation below shows the spread of a single viral text, a poem by the Scottish poet Charles Mackay, overlaid on the developing railroad system. The one at the very bottom depicts how newspapers grew with the country from the colonial era to modern times, often expanding into a territory before the political boundaries had been drawn.
Another approach takes advantage of the same type of network analysis tools social scientists use to map the flow of information on Twitter and other social media. Cordell has found that certain cities that don’t get much attention from scholars were actually important hubs in the 19th-century information economy. Nashville is one of them. “It makes sense if you think about it,” Cordell said. “It was the center of the country at the time.”
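One simple way to spot a hub in a network like this is to add up how much content each city shares with all the others. The Python sketch below does that with a weighted degree count. The article does not say which measure Cordell used, so treat this as one illustrative option; the place names and weights are placeholders, not project data.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Total shared-text weight for each node, given (node_a, node_b, weight) edges."""
    totals = defaultdict(int)
    for a, b, weight in edges:
        totals[a] += weight
        totals[b] += weight
    return dict(totals)


def top_hubs(edges, k=3):
    """Return the k nodes with the highest weighted degree (ties broken alphabetically)."""
    totals = weighted_degree(edges)
    return sorted(totals, key=lambda node: (-totals[node], node))[:k]


# Placeholder network: each weight is a hypothetical count of texts two places shared.
edges = [
    ("hub", "a", 5),
    ("hub", "b", 4),
    ("a", "b", 1),
    ("c", "hub", 2),
]
totals = weighted_degree(edges)
hubs = top_hubs(edges, k=2)
```

Here the node labeled "hub" comes out on top because it shares text with every other place. Other measures, such as how often a city sits on the shortest path between two others, can rank cities differently.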
Some of the texts that went viral in the 1800s aren’t all that different from the things people post on Facebook today, Cordell says. Political rants were popular, for example, as were recipes and travel stories.
Poems also turn up frequently, as well as another type of writing Cordell calls vignettes. These are sentimental stories that are presented as if they’re real, but aren’t attributable to an author and lack details that would make it possible to verify them. One example is a letter, supposedly tucked into a book by a dying woman and found by her husband after her death. She urges him to remember her fondly and live a good life after she’s gone. “These are fascinating to me because they blur the line between fact and fiction, which sort of exemplifies the 19th century newspaper,” Cordell said.
The vignettes often had a moral to them. One popular variety, temperance stories, was aimed at getting drunks to sober up. Cordell likens these cautionary tales to the email you’ve probably gotten from a concerned aunt or uncle that turns out to be based on a bogus urban legend when you look it up on Snopes.
The media may have changed, but a century and a half later many of the messages are surprisingly familiar.