Experiments in Ngram Art


Google Ngram Viewer is a tool that allows users to search for a word or phrase in Google’s vast collection of digitized books and graph the results.

This graph shows the use of words for various technologies over time. Telegraph had its modest moment in the early 20th century. Telephone started its rise right after it had its first major public demonstration in 1876. Television had a steep increase mid-century, but was quite outdone by the sharp leap taken by computer in the second half of the century.

A lot of what you find exploring Ngram is pretty obvious. Here, a search on the word war shows major peaks in its usage during both world wars.

But you can also discover patterns that aren’t so obvious. Milk, sugar, meat, butter, flour, and cigarettes also had peaks during the world wars. In retrospect, this might seem obvious (it probably has to do with the rationing and shortages associated with wartime), but it is interesting to see it so clearly outlined on the basis of word use alone.

It’s important to be aware of scaling when comparing words on Ngram. The y-axis shows each word’s share, as a percentage, of all the words in the Google Books corpus. Sex went from .004 percent of words in 1960 to .007 percent in the year 2000. Its rise is paralleled over the same period by that of drugs. But it appears there isn’t much to say about rock and roll.

Not so! Rock and roll had its own meteoric rise, just on a much smaller scale, percentage-wise. By 2000, it had gone from 0 to .00008 percent of all three-word chunks. (The “n” in “Ngram” refers to the number of sequential units considered as a chunk. Single words are 1-grams, two-word phrases are 2-grams, three-word phrases are 3-grams, and so on.) It’s hard to see that when it has to share the stage with relative giants like sex and drugs.
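For readers who like to see the arithmetic, here is a minimal Python sketch of what "percent of all three-word chunks" means. The function names `ngrams` and `percent_of_ngrams` and the sample sentence are made up for illustration; this is a toy calculation over one string, not a description of how Google builds its graphs.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples, in order."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def percent_of_ngrams(text, phrase):
    """Percentage of the n-word chunks in text that match phrase,
    where n is the number of words in phrase."""
    target = tuple(phrase.lower().split())
    chunks = ngrams(text, len(target))
    if not chunks:
        return 0.0
    return 100.0 * Counter(chunks)[target] / len(chunks)


# Hypothetical sample text, chosen only to make the counts easy to check by hand.
sample = "rock and roll is here to stay and rock and roll will never die"

print(ngrams("high and low", 2))                   # the 2-grams in a short phrase
print(percent_of_ngrams(sample, "rock and roll"))  # share of all 3-grams
print(percent_of_ngrams(sample, "rock"))           # share of all 1-grams
```

Note that a three-word phrase is measured against all three-word chunks and a single word against all single words, which is one reason the numbers for rock and roll sit on such a different scale from those for sex and drugs.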


Ngram art is something I discovered through hours of playing with Ngram. I use Ngram frequently to answer questions or satisfy my curiosity about linguistic change. Sometimes I get sucked into a zone where I’m just throwing random words and phrases at it to see what happens. One late night, I noticed that in the graph for high and low, the image matched the meaning of the words. On the graph, high was high and low was low.

Soon I was chasing those moments of serendipity down, trying to choose the right words to make line images that would relate back to their meanings. It wasn’t as easy as it looks. Words are strange creatures that do not necessarily behave as you would expect when graphed over time. Still, I discovered it was possible, in a crude way, to draw with data. Here is the resulting small gallery of Ngram art.

If you’d like to try your hand at Ngram art, visit the Ngram Viewer. If you’d like to know more about how Ngram works and what its results mean, see this TED talk by the creators, Jean-Baptiste Michel and Erez Lieberman Aiden. For information on some of the more complex searches you can do, see this Atlantic article by Ben Zimmer.

By Ben Wittick (1845–1903) - Brian Lebel's Old West Show and Auction, Public Domain, Wikimedia Commons
Photo of Billy the Kid and Pat Garrett, Purchased for $10, Could Be Worth Millions

Several years ago, Randy Guijarro paid $2 for a few old photographs he found in an antiques shop in Fresno, California. In 2015, it was determined that one of those photos, said to be the second verified picture ever found of Billy the Kid, could fetch the lucky thrifter as much as $5 million. That story now sounds familiar to Frank Abrams, a lawyer from North Carolina who purchased his own photo of the legendary outlaw at a flea market in 2011. It turns out that the tintype, which he paid $10 for, is thought to be an image of Billy and Pat Garrett (the sheriff who would eventually kill him) taken in 1880. As with Guijarro’s find, experts say Abrams’s photo could be worth millions.

The discovery is as much a surprise to Abrams as to anyone. As The New York Times reports, what drew Abrams to the photo was the fact that it was a tintype, a metal photographic image that was popular in the Wild West. Abrams didn’t recognize any of the men in the image, but he liked it and hung it on a wall in his home, which is where it was when an Airbnb guest joked that it might be a photo of Jesse James. He wasn’t too far off.

Using Google as his main research tool, Abrams attempted to find out if there was any famous face in that photo, and quickly realized that it was Pat Garrett. According to The New York Times:

Then, Mr. Abrams began to wonder about the man in the back with the prominent Adam’s apple. He eventually showed the tintype to Robert Stahl, a retired professor at Arizona State University and an expert on Billy the Kid.

Mr. Stahl encouraged Mr. Abrams to show the image to experts.

William Dunniway, a tintype expert, said the photograph was almost certainly taken between 1875 and 1880. “Everything matches: the plate, the clothing, the firearm,” he said in a phone interview. Mr. Dunniway worked with a forensics expert, Kent Gibson, to conclude that Billy the Kid and Mr. Garrett were indeed pictured.

Abrams, who is a criminal defense lawyer, described the process of investigating the history of the photo as akin to “taking on the biggest case you could ever imagine.” And while he’s thrilled that his epic flea market find could produce a major monetary windfall, don’t expect to see the image hitting the auction block any time soon. 

“Other people, they want to speculate from here to kingdom come,” Abrams told The New York Times of how much the photo, which he has not yet had valuated, might be worth. “I don’t know what it’s worth. I love history. It’s a privilege to have something like this.”

[h/t: The New York Times]
