4 Ideas From Linguistics to Help You Appreciate Arrival

Spoiler Warning: If you haven't seen Arrival and plan to soon, you might want to save this article for after.

The most exciting thing about Denis Villeneuve’s new sci-fi space-encounter movie isn’t the aliens or the spaceships or the worldwide panic they bring on. It’s the fact that the hero is a linguistics professor!

It’s nice to feel that your seemingly esoteric field is actually the key to saving humankind. Even better if a film about it can get more people interested in the science of language structure. The film’s linguist, Louise Banks, played by Amy Adams, is charged with figuring out the language of the aliens who have landed on Earth. She needs to do this in order to find out what they want.

How would one go about decoding a language that nobody knows? Field linguists—those who go out into the world to analyze little-known languages—have developed techniques for doing this kind of thing. The filmmakers consulted with McGill University linguist Jessica Coon, who herself has worked in the field on native languages of Mexico and Canada.

The problem of interpreting an unfamiliar language becomes a lot harder when dealing with creatures that don’t share our human bodies or articulators, much less a common frame of reality or physical environment, but that’s no reason not to start with the basics of linguistic communication that we do have a handle on. Here are four important concepts from linguistics that help Dr. Banks do the job she needs to do in Arrival.


1. The Swadesh list

At one point Colonel Weber (played by Forest Whitaker) asks Dr. Banks why she’s wasting time with a list of simple words like eat and walk when their priority is to find out the purpose of the aliens’ visit. A good field linguist knows you can’t just jump to abstract concepts like purpose without establishing the basics first. But what are the basics?

For decades, linguists have used variations on the Swadesh list, a list of basic concepts first put together in the 1950s by linguist Morris Swadesh. They include concepts like I and you, one and many, as well as objects and actions in the observable world like person, blood, fire, eat, sleep, and walk. They were chosen to be as universal as possible, and they can be indicated by pointing or pantomime or pictures, which makes it possible to ask for their words before proper linguistic question-asking has been figured out. Though the movie’s heptapods likely don’t share most of our universal, earth-bound concepts, it’s as good a place to start as any.


2. Discreteness

It might seem that the most important question to focus on when trying to analyze an unknown language is "what does this mean?" For a linguist, however, the most important question is "what are the units?" This is not because meaning is not useful, but because, while you can have meaning without language, you cannot have language without units. A sigh is meaningful, but not linguistic. It is not composed of discrete units; it conveys an overall feel.

The concept of discreteness is one of the basic design features of human language. Linguistic utterances are patterns of combinations of smaller, meaningless units (sounds, or in the movie’s case, parts of ink blots) that reoccur in other utterances in different combinations with different meanings. When Dr. Banks sits down to analyze the circular ink blots the heptapods have thrown out, she marks up specific parts of them. She is not viewing them as analog, holistic pictures of meaning, but as compositions of parts, and she expects those parts to occur in other ink blots.


3. Minimal pairs

The concept of the minimal pair is crucial for figuring out what the units of a specific language are. An English speaker will say that car, whether it's pronounced with a regular r or a rolled r, means the same thing (even if the rolled r sounds a bit strange). A Spanish speaker will say that caro means something different with a rolled r (caro "expensive" vs. carro "car"). The rolled r in English is just a different pronunciation of the same unit. In Spanish, it’s a different unit.

A minimal pair is a pair of words that differ in meaning because one sound has changed. The existence of a minimal pair shows that the differing sound is a crucial element of the language’s structure. In one scene in the movie, Dr. Banks notes that two ink blots are exactly the same except for a little hook on the end. That’s how she knows the hook does something important. With that knowledge, she can put it in the known inventory of units for heptapod, and look for it in other utterances.
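The minimal-pair test can be sketched in a few lines of Python. This is only an illustration: the function names, the transcription of each word as a tuple of units ("r" for the tapped r, "rr" for the rolled one), and the extra word casa are assumptions made for the example, not anything taken from the film or from a field linguist's actual toolkit.

```python
from itertools import combinations


def differs_by_one_unit(form_a, form_b):
    """True if two equal-length sequences of units differ in exactly one slot."""
    if len(form_a) != len(form_b):
        return False
    return sum(a != b for a, b in zip(form_a, form_b)) == 1


def find_minimal_pairs(lexicon):
    """Return pairs of forms that differ in one unit AND in meaning.

    `lexicon` maps a tuple of units to a meaning (hypothetical format).
    """
    pairs = []
    for (form_a, meaning_a), (form_b, meaning_b) in combinations(lexicon.items(), 2):
        if meaning_a != meaning_b and differs_by_one_unit(form_a, form_b):
            pairs.append((form_a, form_b))
    return pairs


# Toy Spanish lexicon; the transcription scheme is an assumption of this sketch.
spanish = {
    ("k", "a", "r", "o"): "expensive",   # caro
    ("k", "a", "rr", "o"): "car",        # carro
    ("k", "a", "s", "a"): "house",       # casa (added for contrast)
}

print(find_minimal_pairs(spanish))
```

Only caro and carro come back as a pair, which is the evidence that the tapped and rolled r are separate units in Spanish. The same comparison on two heptapod ink blots would flag the little hook.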


4. The Sapir-Whorf hypothesis

The linguistic current running through the heart of the movie is a version of what’s come to be known as the Sapir-Whorf hypothesis, most simply explained as the idea that the language you speak influences the way you think. This idea is controversial, since it has been demonstrated that languages do not restrict or constrain what people are able to perceive. However, a milder version of the theory holds that language can lay down default ways of categorizing experience that are easily shaken off if required.

We see the extreme version of Sapir-Whorf played out in the way that the perceptive abilities of Dr. Banks are completely transformed by the act of her learning the heptapod language. Her conception of time is altered by language.

The origins of the Sapir-Whorf hypothesis trace back to an analysis by Benjamin Whorf of the concept of time in the Native American language Hopi. He argued that where the linguistic devices of European languages express time as a continuum from past to present to future, with time units like days, weeks, and years conceived of as objects, the Hopi language distinguishes only between the experienced and the not experienced, and does not conceive of stretches of time as objects. There are no days in Hopi, only the return of the sun.

Whorf’s analysis has been challenged by later Hopi scholars, but it is clear that the language does handle the idea of linguistic tense in a way that is difficult to grasp for speakers of European languages. Assuming that that means we live in a different reality with respect to time is taking things way too far. But who ever said the world of fiction wasn’t allowed to take things too far?

If you find the real ideas behind the movie intriguing, or just want to get more familiar with the exciting world of linguist-heroes, check out this collection of real-world resources listed by Gretchen McCulloch.

Rebecca O'Connell
What's the Longest Word in the World? Here are 12 of Them, By Category

Antidisestablishmentarianism, everyone’s favorite agglutinative, entered the pop culture lexicon on August 17, 1955, when Gloria Lockerman, a 12-year-old girl from Baltimore, correctly spelled it on The $64,000 Question as millions of people watched from their living rooms. At 28 letters, the word (which names a 19th-century British political movement that opposed proposals for the disestablishment of the Church of England) is still regarded as the longest non-medical, non-coined, non-technical word in the English language, yet it keeps some robust company. Here are some examples of the longest words by category.


Note the ellipses. All told, the full chemical name for the human protein titin is 189,819 letters, and takes about three-and-a-half hours to pronounce. The problem with including chemical names is that there’s essentially no limit to how long they can be. For example, naming a single strand of DNA, with its millions and millions of repeating base pairs, could eventually top out at well over 1 billion letters.


The longest word ever to appear in literature comes from Aristophanes’ play Assemblywomen, written in 391 BC. The Greek word tallies 171 letters, but its transliteration into the Latin alphabet runs to 183. This mouthful refers to a fictional fricassee composed of rotted dogfish head, wrasse, wood pigeon, and the roasted head of a dabchick, among other culinary morsels.


3. PNEUMONOULTRAMICROSCOPICSILICOVOLCANOCONIOSIS

At 45 letters, this is the longest word you’ll find in a major dictionary. An inflated version of silicosis, this is the full scientific name for a disease that causes inflammation in the lungs owing to the inhalation of very fine silica dust. Despite its inclusion in the dictionary, it’s generally considered superfluous, having been coined simply to claim the title of the longest English word.


4. PARASTRATIOSPHECOMYIA STRATIOSPHECOMYIOIDES

The longest accepted binomial name, at 42 letters, belongs to a species of soldier fly native to Thailand. Given the fly's lifespan of five to eight days, it’s unlikely one has ever survived long enough to hear its name pronounced correctly.


5. PSEUDOPSEUDOHYPOPARATHYROIDISM

This 30-letter thyroid disorder is the longest non-coined word to appear in a major dictionary.


6. FLOCCINAUCINIHILIPILIFICATION

By virtue of having one more letter than antidisestablishmentarianism, this is the longest non-technical English word. A mash-up of five Latin roots, it refers to the act of describing something as having little or no value. While it made the cut in the Oxford English Dictionary, Merriam-Webster volumes refuse to recognize it, chalking up its existence to little more than linguistic ephemera.


7. SUBDERMATOGLYPHIC

At 17 letters, this is the longest accepted isogram, a word in which no letter appears more than once. It refers to the underlying dermal matrix that determines the pattern formed by the whorls, arches, and ridges of our fingerprints.


8. SQUIRRELLED

Though the more commonly accepted American English version carries only one L, both Oxford and Merriam-Webster dictionaries recognize this alternate spelling and condone its one-syllable pronunciation (think “world”), making it the longest non-coined monosyllabic English word at 11 letters.


9. ABSTENTIOUS

Describing someone who doesn’t indulge in excesses, especially of food and drink, this 11-letter word is the longest to use all five vowels in order exactly once.


10. ROTAVATOR

A type of soil tiller, the longest non-coined palindromic word included in an English dictionary tallies nine letters. Detartrated, 11 letters, appears in some chemical glossaries, but is generally considered too arcane to qualify.

11. and 12. CWTCH, EUOUAE

These are the longest words in a major dictionary made up entirely of consonants and entirely of vowels, respectively. Cwtch comes from the Welsh word for a hiding place (crwth, another all-consonant Welsh borrowing, names a stringed instrument). Euouae, a medieval musical term, is technically a mnemonic, but has been accepted as a word in itself.
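Several of the categories above (isogram, palindrome, all five vowels in order, all vowels, all consonants) are simple enough to check mechanically. The Python sketch below is one way to do that. The function names are made up for this example, "vowel" is assumed to mean only the letters a, e, i, o, and u (so y and w count as consonants), and facetious is an illustrative test word that does not come from the list.

```python
VOWELS = "aeiou"  # assumption: y and w are treated as consonants


def letters_of(word):
    """Lowercase letters of a word, ignoring spaces, hyphens, and the like."""
    return [c for c in word.lower() if c.isalpha()]


def is_isogram(word):
    """No letter appears more than once."""
    letters = letters_of(word)
    return len(letters) == len(set(letters))


def is_palindrome(word):
    """Reads the same forward and backward."""
    letters = letters_of(word)
    return letters == letters[::-1]


def has_vowels_in_order_once(word):
    """Contains a, e, i, o, u exactly once each, in that order."""
    return [c for c in letters_of(word) if c in VOWELS] == list(VOWELS)


def is_all_vowels(word):
    letters = letters_of(word)
    return bool(letters) and all(c in VOWELS for c in letters)


def is_all_consonants(word):
    letters = letters_of(word)
    return bool(letters) and all(c not in VOWELS for c in letters)


print(is_palindrome("detartrated"))           # True
print(is_all_consonants("cwtch"))             # True
print(is_all_vowels("euouae"))                # True
print(has_vowels_in_order_once("facetious"))  # True
print(is_isogram("antidisestablishmentarianism"))  # False
```

Checks like these confirm a property but not a record: showing that a word is the longest of its kind still requires running them over an entire dictionary.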

The Grammar Rules of 3 Commonly Disparaged Dialects 
Linguists are always taken aback by the overwhelmingly negative and sometimes virulently expressed reaction they get when stating something that every linguist believes (and linguists do not agree on everything!) in a rather uncomplicated way: Every dialect has a grammar.

"Every dialect has a grammar" does not mean "everything is relative, and let's throw away all the dictionaries, and no one should go to school anymore, and I should be able to wear a bath towel to a job interview if I damn well please." What it means is that all dialects, from the very fanciest to the ones held in lowest esteem, are rule-governed systems. Here are three examples from three different commonly disparaged dialects that illustrate how dialects have grammar.

1. Appalachian a-prefixing

One of the most noticeable features of Appalachian English, which has been studied extensively by the linguists Walt Wolfram and Donna Christian, is the a- prefix that attaches to verbs. When people want to mock "hick" speech, they often scatter a-prefixed words around like "a-goin'" and "a-huntin'" and "a-fishin'," but if they don't actually speak the dialect, they usually make mistakes. That is because they don't know the rules of where a-prefixing can apply, and where it can't.

Rules? Yes, rules. To someone who speaks an a-prefixing dialect this sounds right: "He was a-huntin'."

But these sound wrong:

He likes a-huntin'.
Those a-screamin' children didn't bother me.
He makes money by a-buildin' houses.

It is not the case that a-prefixes can attach to any old word ending in -ing. They can attach to verbs, as in the first example. But not to gerunds (a verb serving as a noun for a general action), adjectives, or objects of prepositions, as in the other examples. The fact that those examples sound wrong to dialect speakers shows that there are conditions on where a-prefixes can go. The fact that those conditions can be described in terms of verbs, gerunds, adjectives, and prepositions shows that the conditions have to do with the linguistic structure of sentences. A condition that depends on linguistic structure is a rule. A system of these rules is a grammar. This is what linguists mean when they talk about the grammar of a dialect.
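To make "a condition that depends on linguistic structure" concrete, here is a toy Python sketch of the a-prefixing rule as described here. It assumes the grammatical role of each -ing word has already been identified by hand. The names IngRole, a_prefix_allowed, and a_prefix are invented for this illustration, and the sketch covers only the four roles mentioned in this article, not the full set of conditions linguists have documented.

```python
from enum import Enum


class IngRole(Enum):
    """Grammatical role of an -ing word in its sentence (hand-labeled)."""
    VERB = "verb"                                  # He was huntin'.
    GERUND = "gerund"                              # He likes huntin'.
    ADJECTIVE = "adjective"                        # Those screamin' children...
    OBJECT_OF_PREPOSITION = "object of preposition"  # ...by buildin' houses.


def a_prefix_allowed(role):
    """Per the description above, only verbs take the a- prefix."""
    return role is IngRole.VERB


def a_prefix(word, role):
    """Attach a- if the rule allows it; otherwise report the violation."""
    if not a_prefix_allowed(role):
        raise ValueError(f"a-prefixing is ungrammatical on a {role.value}")
    return "a-" + word


print(a_prefix("huntin'", IngRole.VERB))   # a-huntin'
print(a_prefix_allowed(IngRole.GERUND))    # False
```

The function never looks at the spelling of the word, only at its role in the sentence, which is the sense in which the condition is structural.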

People who speak this dialect don't learn these rules from a book. They know them implicitly, even if they can't describe them, the same way you know "I gave him a dollar" sounds good but "I donated him a dollar" sounds bad (even if you've never heard of linguistic argument structure). Their use of the dialect is not whimsical and random, but governed by those rules. Someone who doesn't follow those rules, e.g., in a hamfisted attempt to mock the dialect, can be said to be speaking ungrammatical Appalachian English.

2. Southern American English "liketa"

Often features that are seen as sloppy pronunciations of Standard English show themselves on closer inspection to be used in a non-sloppy, highly consistent way—but according to a different set of rules. In the Alabama dialect studied by linguist Crawford Feagin, speakers say things like, "She liketa killed me!", meaning that she just about started to kill me, but didn't. This "liketa" is not just a shortening of "would have liked to"; it's also possible to say "I liketa had a heart attack."

"Liketa" is close to being a substitute for "almost," but it doesn't behave exactly like that word either; you can ask "did you almost die?" but not "did you liketa died?"

"Liketa" is not just a lazy version of Standard English. You can describe the conditions for its use—the rules of "liketa." As Feagin says, it "occurs in both positive and negative sentences, but not in questions and commands. It may co-occur with the intensifier 'just'; it always occurs in the past." Because rules govern "liketa," it is possible to break those rules, and if you do you can be said to be using it ungrammatically.

3. African-American English stressed "BIN"

African-American English has a number of distinguishing features, one of them being the use of "stressed BIN," described by linguist John Rickford. It carries the main stress of the sentence and is distinct from unstressed "been." It occurs in sentences like "she BIN married," which does not mean "she has been married." It means "she is married, and has been for a long time."

Stressed BIN is like a remote past tense, something that Standard English lacks a simple marker for. It can also be used in places where Standard "been" would not occur, such as "I BIN ate it" (I ate it a long time ago).

There are structural conditions on where stressed BIN can and cannot occur. Its use is governed by rules. As linguist Lisa Green points out, it can't be moved to the front of the sentence for questions (BIN John and Lisa dating?) or used in a tagged question at the end (She BIN married, binn't she?), and it can't be used with phrases indicating a specific time (I BIN asked him bout that three weeks ago). Because there are grammatical conditions for the use of stressed BIN, it is possible to use it the wrong way, as nearly everyone who tries to mock it does.
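The three conditions listed above can be written down as a small checker. The Python sketch below assumes someone has already annotated how stressed BIN is being used in a sentence. The BinUse record and its field names are hypothetical, and the three conditions are only the ones mentioned in this article, not a complete grammar.

```python
from dataclasses import dataclass


@dataclass
class BinUse:
    """Hand-annotated description of one use of stressed BIN (hypothetical format)."""
    fronted_for_question: bool = False      # "BIN John and Lisa dating?"
    in_tag_question: bool = False           # "She BIN married, binn't she?"
    has_specific_time_phrase: bool = False  # "...three weeks ago"


def violations(use):
    """List which of the three conditions a given use breaks."""
    problems = []
    if use.fronted_for_question:
        problems.append("fronted for a question")
    if use.in_tag_question:
        problems.append("used in a tag question")
    if use.has_specific_time_phrase:
        problems.append("combined with a specific time phrase")
    return problems


def is_grammatical(use):
    return not violations(use)


print(is_grammatical(BinUse()))                            # True: "She BIN married."
print(violations(BinUse(has_specific_time_phrase=True)))   # one violation reported
```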

More explanations of these phenomena and others can be found at the Yale Grammatical Diversity project, the mission of which is to serve as "a crucial source of data for the development of theories of human linguistic knowledge." However you feel about dialects and whether they are worthy of respect, the fact that human ways of speaking always settle into rule-governed systems, all describable in terms of the same set of basic linguistic concepts—that, at the very least, is pretty darn interesting. And frankly, the more you pursue what's interesting about it, the less emotional your judgments about dialects become.

This post originally appeared in 2013.
