How is language evolving on the internet? In this series on internet linguistics, Gretchen McCulloch breaks down the latest innovations in online communication.

Do you pronounce gif with a hard g as in get or with a soft g as in gem? It's a question that people won't stop arguing about on the internet.

But why are we so confused? Why is each camp so passionate about being right? And is there in fact a right way to pronounce gif?

Sure, the creator of the gif, Steve Wilhite, prefers a soft g, and sure, gif originated as an acronym for graphics interchange format, but inventors aren't always good at naming (the zipper was originally called the "clasp locker"), and acronyms aren't always pronounced like their roots (the "a" in NATO isn't the same as the "a" in Atlantic). In truth, language is far more democratic.

So Michael Dow, a linguistics professor at Université de Montréal, decided to investigate the question a different way, and I talked with him about his findings. The idea is, people decide how to pronounce a new word based on its resemblance to words they're already familiar with. So we can all agree on how to pronounce snapchat because it's made up of the familiar words snap and chat, and we don't have any problems with blog because it rhymes with frog, log, slog, and so on, but we have no idea how to pronounce doge because there aren't any other common English words that end in -oge.

The problem with gif isn't the back half—we already know how to pronounce if. The problem is the front half: Does the i make the g soft or not? It's clearly not an absolute yes or no—there are English words in both categories: gift has a hard g before i, whereas gin has a soft g before i. What matters is the frequency. So Dow looked at a large corpus of 40,000 unique words with their frequency and pronunciation taken from The English Lexicon Project. Of these words, how many were like gift (hard g) and how many were like gin (soft g)?

Dow found 105 words in the corpus that had "gi" somewhere in their spelling, not counting variations on the same word, like gift/gifts or geography/geographical. At first glance, it looks like the gin group wins: there were 68 "gi" words that were pronounced with a soft g as in gin, but only about half as many (37) that were pronounced with a hard g as in gift.

Case closed? Not so fast. Although there are more soft g words, they don't get used as often. The least common word in the entire list was "tergiversate," which I had to look up—it apparently means "make conflicting or evasive statements; equivocate," and it's pronounced with a soft g. Rounding out the bottom eight are some more soft "gi" words you probably don't use every day: "gimcrack," "excogitate," "elegiac," "flibbertigibbet," "corrigible," "gibbet," and "giblet." Hard "gi" words don't show up until the ninth and tenth least common: "muggins" and "girt."

By contrast, the most-used words tend to be pronounced with hard g: Dow found that hard "gi" words were used overall around 10 thousand times in the corpus, whereas soft "gi" words were only used 4 thousand times. And three of the five most frequent words have a hard g: "give" (#1), "begin" (#2), and "girl" (#4), compared to "magic" (#3) and "engine" (#5). And "give" in particular is extraordinarily common: it's used almost four times as much as the next most common word, "begin."

So to know what expectations we bring to an unfamiliar "gi" word, we need to balance two facts: there are about twice as many soft g words, but we use the hard g words roughly twice as often. Dow did this with a calculation based on log frequency, which dampens the influence of a few extremely common words, and it turns out the hard g words and the soft g words end up almost exactly the same.
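To make the three ways of counting concrete, here is a minimal Python sketch. The article doesn't give Dow's exact formula, so the weighting below (summing log10 of each word's frequency plus one) is only an assumption about what a log-frequency comparison might look like. The word list and every count in it are invented for illustration; they are not figures from the English Lexicon Project, and the function name `summarize` is hypothetical.

```python
import math

# Hypothetical toy lexicon: (word, how the g is pronounced, made-up frequency).
# These counts are invented for illustration and are NOT Dow's data.
TOY_LEXICON = [
    ("give", "hard", 5000),
    ("begin", "hard", 1300),
    ("girl", "hard", 900),
    ("magic", "soft", 1000),
    ("engine", "soft", 800),
    ("giant", "soft", 400),
    ("ginger", "soft", 100),
    ("giblet", "soft", 2),
    ("tergiversate", "soft", 1),
]


def summarize(lexicon):
    """Tally each pronunciation group three ways.

    types      -- how many distinct words are in the group
    tokens     -- how often those words are used in total
    log_weight -- sum of log10(frequency + 1), an assumed stand-in
                  for a log-frequency weighting
    """
    summary = {}
    for _word, g_sound, freq in lexicon:
        group = summary.setdefault(
            g_sound, {"types": 0, "tokens": 0, "log_weight": 0.0}
        )
        group["types"] += 1
        group["tokens"] += freq
        group["log_weight"] += math.log10(freq + 1)
    return summary


result = summarize(TOY_LEXICON)
for g_sound in sorted(result):
    print(g_sound, result[g_sound])
```

In this toy list the soft g group has twice as many distinct words, while the hard g group is used about three times as often. The log-weighted totals land much closer together than either raw count, because taking the logarithm keeps one very common word like "give" from swamping everything else. The toy numbers don't reproduce Dow's near-tie; they only show the direction of the effect.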

And it doesn't matter what else we take into consideration. Want to compare only words that begin with "gi," to avoid the potential confounds of "magic" or "begin"? Again, when we take all factors into consideration, Dow found that they were the same.

Want to compare only monosyllables, and avoid "giant" or "forgive"? Yep, still the same.

In other words, when you see a new word starting with "gi," your previous exposure to "gi" words is basically telling you to flip a coin—it's just as likely that you'll decide to pronounce it with a hard g as with a soft g. And you'll never find an overwhelming enough piece of counter-evidence to get you to change your mind. Which probably means we'll be fighting the gif pronunciation war for generations to come.