How Do Computers Understand Speech?

More and more, we can get computers to do things for us by talking to them. A computer can call your mother when you tell it to, find you a pizza place when you ask for one, or write out an email that you dictate. Sometimes the computer gets it wrong, but a lot of the time it gets it right, which is amazing when you think about what a computer has to do to turn human speech into written words: turn tiny changes in air pressure into language. Computer speech recognition is very complicated and has a long history of development, but here, condensed for you, are the 7 basic things a computer has to do to understand speech.

1. Turn the movement of air molecules into numbers.


Sound comes into your ear or a microphone as changes in air pressure, a continuous sound wave. The computer records a measurement of that wave at one point in time, stores it, and then measures it again. If it waits too long between measurements, it will miss important changes in the wave. To get a good approximation of a speech wave, it has to take a measurement at least 8,000 times a second, but it works better if it takes one 44,100 times a second. This process is known as digitization, here at a sampling rate of 8 kHz or 44.1 kHz.
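To make the measuring step concrete, here is a minimal Python sketch. It assumes a pure 440 Hz tone as a stand-in for real air pressure, and the names `sample_wave` and `tone_440` are made up for illustration. A real system would read from a microphone rather than a formula.

```python
import math

def sample_wave(wave, sample_rate_hz, duration_s):
    """Measure a continuous wave at evenly spaced instants."""
    n_samples = round(sample_rate_hz * duration_s)
    return [wave(i / sample_rate_hz) for i in range(n_samples)]

def tone_440(t):
    # Hypothetical stand-in for air pressure: a pure 440 Hz tone.
    return math.sin(2 * math.pi * 440 * t)

# Measure the same hundredth of a second at the two rates from the text.
low = sample_wave(tone_440, 8000, 0.01)
high = sample_wave(tone_440, 44100, 0.01)

print(len(low), len(high))
```

The higher rate simply yields more numbers for the same stretch of sound, which is why it captures faster changes in the wave.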

2. Figure out which parts of the sound wave are speech.

When the computer takes measurements of air pressure changes, it doesn't know which ones are caused by speech, and which are caused by passing cars, rustling fabric, or the hum of hard drives. A variety of mathematical operations are performed on the digitized sound wave to filter out the stuff that doesn't look like what we expect from speech. We kind of know what to expect from speech, but not enough to make separating the noise out an easy task.
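One of the simplest such operations is to check how loud each short stretch of sound is. The Python sketch below is only an illustration of that idea, not how any particular product works: the frame length, the threshold, and the synthetic signal are all assumptions, and real systems use far more careful tests than loudness alone.

```python
import math
import random

def frame_energy(samples, frame_len):
    """Return the RMS energy of each consecutive, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies

def detect_speech(samples, frame_len=80, threshold=0.1):
    """Flag frames whose energy exceeds an assumed loudness threshold."""
    return [e > threshold for e in frame_energy(samples, frame_len)]

# Synthetic test signal: faint hiss, then a loud 200 Hz tone, then hiss again.
rng = random.Random(0)
hiss_before = [rng.uniform(-0.01, 0.01) for _ in range(400)]
loud = [0.5 * math.sin(2 * math.pi * 200 * i / 8000) for i in range(400)]
hiss_after = [rng.uniform(-0.01, 0.01) for _ in range(400)]

flags = detect_speech(hiss_before + loud + hiss_after)
print(flags)
```

Only the middle frames are flagged. A passing car would fool this check easily, which is part of why separating out the noise is not an easy task.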

3. Pick out the parts of the sound wave that help tell speech sounds apart.


A sound wave from speech is actually a very complex mix of multiple waves coming at different frequencies. The particular frequencies—how they change, and how strongly those frequencies are coming through—matter a lot in telling the difference between, say, an "ah" sound and an "ee" sound. More mathematical operations transform the complex wave into a numerical representation of the important features.
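A classic transformation of this kind is the discrete Fourier transform, which reports how strongly each frequency is present in a chunk of sound. The sketch below, in plain Python, assumes a made-up chunk that mixes a 300 Hz wave with a weaker 2,500 Hz wave. Real recognizers go further and boil the result down to a compact set of features.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the strength of each frequency bin up to half the sample rate."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(total) / n)
    return mags

SAMPLE_RATE = 8000  # measurements per second
N = 160             # 20 milliseconds of sound, so each bin is 50 Hz wide

# Assumed input: a strong 300 Hz wave mixed with a weaker 2,500 Hz wave.
chunk = [
    math.sin(2 * math.pi * 300 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2500 * t / SAMPLE_RATE)
    for t in range(N)
]

mags = dft_magnitudes(chunk)
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_hz = sorted(k * SAMPLE_RATE / N for k in strongest)
print(peak_hz)
```

The two strongest bins land on the two frequencies that were mixed in, even though they cannot be read off the raw wave by eye.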

4. Look at small chunks of the digitized sound one after the other and guess what speech sound each chunk shows.

There are about 40 speech sounds, or phonemes, in English. The computer has a general idea of what each of them should look like because it has been trained on a bunch of examples. But not only do the characteristics of these phonemes vary with different speaker accents, they change depending on the phonemes next to them—the 't' in "star" looks different than the 't' in "city." The computer must have a model of each phoneme in a bunch of different contexts for it to make a good guess.
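As a toy illustration of guessing with context, the Python sketch below keeps one stored example per phoneme and neighboring sound, then picks whichever stored example is closest to the new chunk. Every number in it is invented for the example, and the two-number features are a drastic simplification. Real systems learn statistical models from many recordings.

```python
import math

# Invented feature values, keyed by (phoneme, sound that comes before it).
TEMPLATES = {
    ("t", "s"): (0.9, 0.2),   # for instance, the 't' in "star"
    ("t", "ih"): (0.4, 0.6),  # for instance, the 't' in "city"
    ("ah", "t"): (0.1, 0.9),
}

def guess_phoneme(features):
    """Return the (phoneme, context) key whose template is nearest."""
    return min(TEMPLATES, key=lambda key: math.dist(features, TEMPLATES[key]))

print(guess_phoneme((0.85, 0.25)))
print(guess_phoneme((0.45, 0.55)))
```

Both chunks come out as a 't', but matched against different contexts, which is the point of keeping more than one model per phoneme.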

5. Guess possible words that could be made up of those phonemes.

The computer has a big list of words that includes the different ways they can be pronounced. It makes guesses about what words are being spoken by splitting up the string of phonemes into strings of permissible words. If it sees the sequence "hang ten," it shouldn't split it into "hey, ngten!" because "ngten" won't find a good match in the dictionary.
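The splitting step can be sketched in a few lines of Python. The tiny `LEXICON` below and its phoneme spellings are assumptions made for the example, as is the function name `segment`. A real pronunciation dictionary holds many thousands of entries.

```python
# Hypothetical pronunciation dictionary: phoneme sequence -> word.
LEXICON = {
    ("HH", "AE", "NG"): "hang",
    ("T", "EH", "N"): "ten",
    ("HH", "AE"): "ha",
}

def segment(phonemes):
    """Return every way to split the phonemes into dictionary words."""
    if not phonemes:
        return [[]]
    results = []
    for end in range(1, len(phonemes) + 1):
        word = LEXICON.get(tuple(phonemes[:end]))
        if word is not None:
            # Keep this word only if the remainder can also be split.
            for rest in segment(phonemes[end:]):
                results.append([word] + rest)
    return results

print(segment(("HH", "AE", "NG", "T", "EH", "N")))
```

Starting with "ha" leaves the leftover sounds "NG T EH N", which match nothing in the dictionary, so that split is thrown away and only "hang ten" survives.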

6. Determine the most likely sequence of words based on how people actually talk.

There are no word breaks in the speech stream. The computer has to figure out where to put them by finding strings of phonemes that match valid words. There can be multiple guesses about what English words make up the speech stream, but not all of them will make good sequences of words. If words are the only consideration, "water gaslight four brick vast?" could be just as good a guess as "What do cats like for breakfast?" The computer applies models of how likely one word is to follow the next in order to determine which word string is the best guess. Some systems also take into account other information, like dependencies between words that are not next to each other. But the more information you want to use, the more processing power you need.
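A model of how likely one word is to follow another can be sketched as a table of word pairs. In the Python example below, all of the probabilities are invented for illustration, and `UNSEEN_PROB` is an assumed fallback for pairs the table has never seen. Real models are estimated from very large collections of text.

```python
import math

# Invented probabilities that the second word follows the first.
# "<s>" marks the start of the sentence.
BIGRAM_PROB = {
    ("<s>", "what"): 0.05,
    ("what", "do"): 0.1,
    ("do", "cats"): 0.01,
    ("cats", "like"): 0.05,
    ("like", "for"): 0.02,
    ("for", "breakfast"): 0.03,
    ("<s>", "water"): 0.005,
}
UNSEEN_PROB = 1e-6  # assumed small probability for unlisted word pairs

def log_score(words):
    """Add up the log probability of each word given the one before it."""
    score = 0.0
    prev = "<s>"
    for word in words:
        score += math.log(BIGRAM_PROB.get((prev, word), UNSEEN_PROB))
        prev = word
    return score

candidates = [
    "what do cats like for breakfast".split(),
    "water gaslight four brick vast".split(),
]
best = max(candidates, key=log_score)
print(" ".join(best))
```

Because pairs like "water gaslight" fall back to the tiny unseen probability, the cat question wins comfortably.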

7. Take action.

Once the computer has decided which guesses to go with, it can take action. In the case of dictation software, it will print the guess to the screen. In the case of a customer service phone line, it will try to match the guess to one of its pre-set menu items. In the case of Siri, it will make a call, look up something on the Internet, or try to come up with an answer to match the guess. As anyone who has used speech recognition software knows, mistakes happen. All the complicated statistics and mathematical transformations might not prevent "recognize speech" from coming out as "wreck a nice beach," but for a computer to pluck either one of those phrases out of the air is still pretty incredible.
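For the phone-line case, matching a guess to a pre-set menu can be as simple as finding the closest piece of text. This Python sketch uses the standard library's `difflib.get_close_matches`. The menu items and the function name `match_menu_item` are hypothetical, and a real phone system may work quite differently.

```python
import difflib

# Hypothetical menu for a customer service line.
MENU_ITEMS = ["check balance", "pay bill", "speak to agent"]

def match_menu_item(guess):
    """Return the menu item closest to the recognized text, or None."""
    matches = difflib.get_close_matches(guess, MENU_ITEMS, n=1, cutoff=0.6)
    return matches[0] if matches else None

print(match_menu_item("check my balance"))
print(match_menu_item("zzz"))
```

Returning `None` when nothing is close enough gives the system a chance to ask the caller to repeat, rather than acting on a bad guess.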

There May Be an Ancient Reason Why Your Dog Eats Poop

Dogs aren't known for their picky taste in food, but some pups go beyond the normal trash hunting and start rooting around in poop, whether it be their own or a friend's. Just why dogs exhibit this behavior is a scientific mystery. Only some dogs do it, and researchers aren't quite sure where the impulse comes from. But if your dog is a poop eater, it's nearly impossible to steer them away from their favorite feces.

A new study in the journal Veterinary Medicine and Science, spotted by The Washington Post, presents a new theory for what scientists call "canine conspecific coprophagy," or dogs eating dog poop.

In online surveys about domestic dogs' poop-eating habits completed by thousands of pet owners, the researchers found no link between eating poop and a dog's sex, house training, compulsive behavior, or the style of mothering they received as puppies. However, they did find one common link between the poop eaters. Most tended to eat only poop that was less than two days old. According to their data, 85 percent of poop-eaters only go for the fresh stuff.

That timeline is important because it tracks with the lifespan of parasites. And this led the researchers to the following hypothesis: that eating poop is a holdover behavior from domestic dogs' ancestors, who may have had a decent reason to tuck into their friends' poop.

Because their poop has a high chance of containing intestinal parasites, wolves poop far from their dens. But a sick wolf that doesn't quite make it out of the den in time might do its business too close to home. A healthier wolf might eat this poop, and the parasite eggs wouldn't have hatched within the first day or two of the feces being dropped. Thus, the healthy wolf would carry the risk of infection away from the den, depositing the eggs it had consumed in its own later bowel movements at an appropriate distance, before the eggs had the chance to hatch into larvae and transmit the parasite to the pack.

Domestic dogs may just be enacting this behavior instinctively—only for them, there isn't as much danger of them picking up a parasite at home. However, the theory isn't foolproof. The surveys also found that so-called "greedy eaters" were more likely to eat feces than dogs who aren't quite so intense about food. So yes, it could still be about a poop-loving palate.

But really, it's much more pleasant to think about the behavior as a parasite-protection measure than our best pals foraging for a delicious fecal snack. 

[h/t The Washington Post]

The Prehistoric Bacteria That Helped Create Our Cells Billions of Years Ago

We owe the existence of our cells, the very building blocks of life, to a chance relationship between bacteria that occurred more than 2 billion years ago. Flash back to Bio 101, and you might remember that humans, other animals, and plants have complex eukaryotic cells, with nucleus-bound DNA, rather than the simpler prokaryotic cells found in bacteria. Eukaryotic cells contain specialized organelles such as the mitochondria, the cell’s powerhouse, and the chloroplast, which converts sunlight into sugar in plants.

Mitochondria and chloroplasts both look and behave a lot like bacteria, and they also share similar genes. This isn’t a coincidence: Scientists believe these specialized cell subunits are descendants of free-living prehistoric bacteria that somehow merged together to form one. Over time, they became part of our basic biological units—and you can learn how by watching PBS Eons’s latest video below.
