The World’s Top 20 Languages—And The Words English Has Borrowed From Them

English is known as a magpie language that picks up words from almost every other language and culture it comes in contact with, from Abenaki to Zulu. And although some languages have understandably widened the English vocabulary more than others, modern English dictionaries contain more of a geographical melting pot than ever before. 

Listed here—in order by number of native speakers—are the world’s top 20 languages (according to Ethnologue, a global catalog of the 7000 languages currently in use worldwide). Alongside each entry on the list are just some of the words which English has borrowed from it. 

1. CHINESE: 1197 million native speakers (MANDARIN: 848 million)

Linguistically speaking, Chinese is a “macrolanguage” that encompasses dozens of different forms and dialects that together have just short of 1.2 billion native speakers. By far the most widely spoken variety of Chinese, however, is Mandarin, with 848 million speakers alone—or roughly 70 percent of China’s entire population. According to the Oxford English Dictionary, Chinese words have been recorded in English since the mid-16th century, with the earliest examples including the likes of tai chi (1736), ginseng (1634), yin and yang (1671), kumquat (1699) and feng shui (1797). One of the earliest of all is lychee (1588). 

2. SPANISH: 399 million

One quarter of the world’s 399 million Spanish speakers live in Mexico, although other important Hispanophone countries include Colombia (41 million), Argentina (38.8 million), and Venezuela (26.3 million); there are almost as many native Spanish speakers in the United States (34.2 million) as there are in Spain (38.4 million). In English, Spanish loanwords are characterized by terms from weaponry and the military (guerrilla, flotilla, armada, machete), animal names (chinchilla, alligator, cockroach, iguana), and terms from food and drink (potato, banana, anchovy, vanilla).

3. ENGLISH: 335 million

According to Ethnologue, the English language’s 335 million native speakers include 225 million in the United States, 55 million in the United Kingdom, 19 million in Canada, 15 million in Australia, and just short of 4 million in New Zealand. But English is one of the world’s most widespread languages: mother-tongue speakers are recorded in 101 different countries and territories worldwide, 94 of which class it as an official language. Moreover, if the number of people who use English as a second language or lingua franca were included, the global total of English speakers would easily rise to over one billion. 

4. HINDI: 260 million

The world’s 260 million native Hindi speakers are mainly found in India and Nepal, while an estimated 120 million more people in India use Hindi as a second language. As with all Indian languages, a great many Hindi loanwords found in English were adopted during the British Raj in the 19th and early 20th centuries, but long before then the likes of rupee (1612), guru (1613), pilau (1609), pukka (1619), myna (1620) and juggernaut (1638) had already begun to appear in English texts. 

5. ARABIC: 242 million

Like Chinese, Arabic is technically another macrolanguage whose 242 million native speakers—spread across 60 different countries worldwide—use a range of different forms and varieties. The first Arabic loanwords in English date from the 14th century, although many of the earliest examples are fairly rare and obsolete words like alkanet (a type of dye, 1343) and hardun (an Egyptian agama lizard, 1398). Among the more familiar Arabic contributions to English are hashish (1598), sheikh (1577), and kebab (1698).

6. PORTUGUESE: 203 million

The population of Portugal is just under 11 million, but the global Lusophone population is boosted enormously by Brazil’s 187 million native speakers. Etymologically, Portuguese and Spanish loanwords are often tricky to differentiate because of the similarities between the two languages, but according to the OED, Portuguese is responsible for the likes of marmalade (1480), pagoda (1582), commando (1791), cuspidor (1779), and piranha (1710). 

7. BENGALI: 189 million

After Hindi, Bengali is the second most widely spoken language of India with just over 82 million native speakers. But the largest native Bengali population in the world is found in Bangladesh, where 106 million people use it as their first language. The number of Bengali words adopted into English, however, is relatively small, with only 47 instances—including jute (1746), almirah (a free-standing cupboard, 1788), and jampan (a type of sedan chair, 1828)—recorded in the OED. 

8. RUSSIAN: 166 million

One hundred and thirty-seven million of Russian’s 166 million native speakers live in the Russian Federation, with smaller populations in Ukraine (8.3 million), Belarus (6.6 million), Uzbekistan (4 million) and Kazakhstan (3.8 million). The earliest Russian loanwords began to appear in English in the 16th century, among them czar or tsar (1555), rouble (1557), and beluga (1591).

9. JAPANESE: 128 million

Japan’s 128 million people comprise the language’s entire native speaker population, enough to make it the ninth most widely spoken language in the world. Japanese words have been appearing in English texts since the 16th century, with some of the earliest loanwords including katana and wacadash (both types of samurai sword, 1613), miso (1615), shogun (1615), and sake (1687). 

10. LAHNDA: 88.7 million

Lahnda is the collective name given to a group of related Punjabi languages and dialects spoken predominantly in Pakistan. Punjabi words adopted into English are rare, but nevertheless include bhangra (a local traditional dance form and music style, 1965), and gurdwara (a Sikh temple, 1909). 

11. JAVANESE: 84.3 million

Java is the most populous island on Earth, home to almost two-thirds of the entire population of Indonesia. More than half of its 139 million inhabitants speak the local Javanese language, enough to earn it a spot just outside of the global top 10 here. The words batik (1880), gamelan (1816) and lahar (a volcanic mudflow, 1929) are all of Javanese origin. 

12. GERMAN: 78.1 million

Seventy million of the world’s 78 million native German speakers live in Germany, with the remaining 8 million found in the likes of Austria, Switzerland, Belgium and Luxembourg. As English itself is classed as a Germanic language, historically the two languages share a close relationship and ultimately many of the oldest English words could be argued to have German roots. More recent direct German loanwords, however, include sauerkraut (1633), pumpernickel (1738), doppelgänger (1851), and frankfurter (1894). 

13. KOREAN: 77.2 million

Korean loanwords in English are relatively rare, with none at all recorded by the OED before the 19th century. Among the most familiar are kimchi (1898) and taekwondo (1967), while rarer examples include kono (a traditional Korean board game, 1895), and kisaeng (the Korean equivalent of a Japanese geisha girl, 1895). 

14. FRENCH: 75.9 million

The world’s 75 million native French speakers are divided among 51 countries and territories, including 7.3 million in Canada, 4 million in Belgium, and 6 million in the Democratic Republic of the Congo (home to the second largest French-speaking population in the world). Thanks largely to the Norman Conquest, roughly three out of every 10 English words are thought to have French roots, and the trend has continued ever since: English has adopted more loanwords directly from French—absinthe, blancmange, concierge, dauphin, envoi, fête, gourmand, hollandaise, impasse—than from any other living language. 

15. AND 16. TELUGU: 74 million AND MARATHI: 71.8 million

Telugu and Marathi are India’s third and fourth most used languages, with just over 74 and just short of 72 million native speakers, respectively. Neither is responsible for a great many English loanwords, however, and the vast majority of those that have found their way into the language tend to be fairly rare and unfamiliar, like desai (a revenue official or a petty chief, from Marathi, 1698), chawl (an Indian lodging house, from Marathi, 1891), and podu (an area of jungle cleared for farming, from Telugu, 1938). By far the best known is bandicoot, which is thought to literally mean “pig-rat” in Telugu.

17. TURKISH: 70.9 million

Sixty-six million of the world’s 70 million Turkish speakers are in Turkey, with smaller populations found in Greece, Bulgaria, Romania, Cyprus, and Kazakhstan. Turkish words in English date back to the 16th century, with vizier (1562), tulip (1578) and caftan (1591) being among the earliest to arrive.

18. TAMIL: 68.8 million

Tamil is India’s fifth most spoken language, as well as being one of the official languages of Sri Lanka and Singapore. Catamaran (1697), pariah (1613), poppadum (1820) and patchouli (1843) are all Tamil words, as is curry (1598). 

19. VIETNAMESE: 67.8 million

The OED records just 14 Vietnamese loanwords in English, the earliest of which is the name of the Vietnamese currency, dông (1824). Among the handful of others are pho (a traditional Vietnamese soup, 1935), ao dai (a woman’s high-necked tunic, 1961), and both hao and xu (1968), the names for one-tenth and one-hundredth of a dông, respectively.

20. URDU: 64 million

Urdu is the sixth Indian language to make the global top 20, with its worldwide total comprised of 51 million native Indian speakers, a further 10 million in Pakistan, and smaller populations in Nepal and Mauritius. Urdu words have been adopted into English since the fifteenth century, with surprisingly early examples including mogul (1577), cummerbund (1613), and bungalow (1676). Earliest of all, however, is shrab—an old Anglo-Indian nickname for an alcoholic beverage, the first record of which in English dates from 1477. 

Beyond “Buffalo buffalo”: 9 Other Repetitive Sentences From Around The World

Famously, in English, it’s possible to form a perfectly grammatical sentence by repeating the word buffalo (and every so often the place name Buffalo) a total of eight times: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo essentially means “buffalo from Buffalo, New York, whom other buffalo from Buffalo, New York, intimidate, themselves intimidate buffalo from Buffalo, New York.” But repetitive or so-called antanaclastic sentences and tongue twisters like these are by no means unique to English. Here are a few in other languages that you might want to try.

1. “LE VER VERT VA VERS LE VERRE VERT” // FRENCH

This sentence works less well in print than Buffalo buffalo, of course, but it’s all but impenetrable when read aloud. In French, le ver vert va vers le verre vert means “the green worm goes towards the green glass,” but the words ver (worm), vert (green), vers (towards), and verre (glass) are all homophones pronounced “vair,” with a vowel similar to the E in “bet” or “pet.” In fact, work the French heraldic word for squirrel fur, vair, in there somewhere and you’d have five completely different interpretations of the same sound to deal with.

2. “CUM EO EO EO EO QUOD EUM AMO” // LATIN

Eo can be interpreted as a verb (“I go”), an adverb ("there," "for that reason"), and an ablative pronoun (“with him” or “by him”) in Latin, each with an array of different shades of meaning. Put four of them in a row in the context cum eo eo eo eo quod eum amo, and you’ll have a sentence meaning “I am going there with him because I love him.”

3. “MALO MALO MALO MALO” // LATIN

An even more confusing Latin sentence is malo malo malo malo. On its own, malo can be a verb (meaning “I prefer,” or “I would rather”); an ablative form of the Latin word for an apple tree, malus (meaning “in an apple tree”); and two entirely different forms (essentially meaning “a bad man,” and “in trouble” or “in adversity”) of the adjective malus, meaning evil or wicked. Although the lengths of the vowels differ slightly when read aloud, put all that together and malo malo malo malo could be interpreted as “I would rather be in an apple tree than a wicked man in adversity.” (Given that the noun malus can also be used to mean “the mast of a ship,” however, this sentence could just as easily be interpreted as, “I would rather be a wicked man in an apple tree than a ship’s mast.”)

4. “FAR, FÅR FÅR FÅR?” // DANISH

Far (pronounced “fah”) is the Danish word for father, while får (pronounced like “for”) can be used both as a noun meaning “sheep” and as a form of the Danish verb få, meaning “to have.” Far får får får? ultimately means “father, do sheep have sheep?” The reply could be får får ikke får, får får lam, meaning “sheep do not have sheep, sheep have lambs.”

5. “EEEE EE EE” // MANX

Manx is the Celtic-origin language of the Isle of Man, which has close ties to Irish. In Manx, ee is both a pronoun (“she” or “it”) and a verb (“to eat”), a future tense form of which is eeee (“will eat”). Eight letter Es in a row ultimately can be divided up to mean “she will eat it.”

6. “COMO COMO? COMO COMO COMO COMO!” // SPANISH

Como can be a preposition (“like,” “such as”), an adverb (“as,” “how”), a conjunction (“as”), and a verb (a form of comer, “to eat”) in Spanish, which makes it possible to string together dialogues like this: Como como? Como como como como! Which means “How do I eat? I eat like I eat!”

7. “Á Á Á Á Á Á Á.” // ICELANDIC

Á is the Icelandic word for river; a form of the Icelandic word for ewe, ær; a preposition essentially meaning “on” or “in;” and a derivative of the Icelandic verb eiga, meaning “to have,” or “to possess.” Should a person named River be standing beside a river and simultaneously own a sheep standing in or at the same river, then that situation could theoretically be described using the sentence Á á á á á á á in Icelandic.

8. “MAI MAI MAI MAI MAI” // THAI

Thai is a tonal language that uses five different tones or patterns of pronunciation (rising, falling, high, low, and mid or flat) to differentiate between the meanings of otherwise seemingly identical syllables and words: glai, for instance, can mean both “near” and “far” in Thai, just depending on what tone pattern it’s given. Likewise, the Thai equivalent of the sentence “new wood doesn’t burn, does it?” is mai mai mai mai mai—which might seem identical written down, but each syllable would be given a different tone when read aloud.

9. “THE LION-EATING POET IN THE STONE DEN” // MANDARIN CHINESE

Mandarin Chinese is another tonal language, the nuances of which were taken to an extreme level by Yuen Ren Chao, a Chinese-born American linguist and writer renowned for composing a bizarre poem entitled "The Lion-Eating Poet in the Stone Den." When written in its original Classical Chinese script, the poem appears as a string of different characters. But when transliterated into the Roman alphabet, every one of those characters is nothing more than the syllable shi:

Shíshì shīshì Shī Shì, shì shī, shì shí shí shī.
Shì shíshí shì shì shì shī.
Shí shí, shì shí shī shì shì.
Shì shí, shì Shī Shì shì shì.
Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shìshì.
Shì shí shì shí shī shī, shì shíshì.
Shíshì shī, Shì shǐ shì shì shíshì.
Shíshì shì, Shì shǐ shì shí shì shí shī.
Shí shí, shǐ shí shì shí shī shī, shí shí shí shī shī.
Shì shì shì shì.

The only difference between each syllable is its intonation, which can be either flat (shī), rising (shí), falling (shì) or falling and rising (shǐ); you can hear the entire poem being read aloud here, along with its English translation.

'Froyo,' 'Troll,' and 'Sriracha' Added to the Merriam-Webster Dictionary

Looking for the right word to describe the time you spend drinking before heading out to a party, or a faster way to say “frozen yogurt”? Merriam-Webster is here to help. The 189-year-old English vocabulary giant has just added 250 new words and definitions to their online dictionary, including pregame and froyo.

New words come and go quickly, and it’s Merriam-Webster’s job to keep tabs on the terms that have staying power. “As always, the expansion of the dictionary mirrors the expansion of the language, and reaches into all the various cubbies and corners of the lexicon,” they wrote in their announcement.

Froyo is just one of the recent additions to come from the culinary world. Bibimbap, a Korean rice dish; choux pastry, a type of dough; and sriracha, a Thai chili sauce that’s been around for decades but has just recently exploded in the U.S., are now all listed on Merriam-Webster's website.

Of course, the internet was once again a major contributor to this most recent batch of words. Some new terms, like ransomware (“malware that requires the victim to pay a ransom to access encrypted files”) come from the tech world, while words like troll ("to harass, criticize, or antagonize [someone] especially by provocatively disparaging or mocking public statements, postings, or acts”) were born on social media. Then there’s the Internet of Things, a concept that shifts the web off our phones and computers and into our appliances.

Hive mind, dog whistle, and working memory are just a few of the new entries to receive the Merriam-Webster stamp of approval. To learn more about how some words make it into the dictionary while others get left out, check these behind-the-scenes secrets of dictionary editors.
