7 Fun Facts About American Names

Most Americans are given a first and last name when they're born, but aggregate data on full names is not widely distributed by any federal government agency. Instead, data on first and last names is compiled and released separately by two different agencies. The Social Security Administration (SSA) releases an annual list of first names given to babies born in the United States, while the Census Bureau provides a list of last names of individuals living in the U.S. once every decade or so.

But there are some sources of information on full names. One is the Social Security Death Master File (DMF). The DMF is widely used as a death verification tool, though a fraction of a percent of the individuals are added erroneously while still alive (and not all deaths are recorded). The most recent publicly available full version is from 2013 and contains over 87 million entries. Eighty percent of the people listed were born in 1930 or earlier, so the group skews older. While the DMF doesn’t provide an exhaustive list, there are still a lot of very unusual full names in it. Here are seven fun facts about American names from the DMF.


There were 1560 different names in which the first and last names were identical. Thomas Thomas is by far the most frequently occurring, followed by James James. Alexander Alexander and Santiago Santiago make a good showing. The most frequently occurring female name is Rose Rose at number three. The rest of the top names are predominantly male. Of the top 25, only four are names that are overwhelmingly female: Rose Rose, Ruth Ruth, Grace Grace, and Rosa Rosa.


Excluding people with identical first and last names, there are 4344 different names where the last name starts with the first name. More than a quarter of the total occurrences are for John Johnson, followed by William Williams. As with Johnson and Williams, almost all the last names are patronymic: They originally denoted that someone was “son of [insert father’s name].” The top 25 include patronymic last names that are English (ending in son, like Robert Robertson), Welsh (often ending in s, like Edward Edwards), Danish (ending in sen, like Jens Jensen), and Spanish (ending in ez, like Martin Martinez). Given that a patronym by definition includes the name of the male parent, it’s unsurprising that boys’ first names dominate the top of the list. The top female name is Eva Evans at number 19, with only two more in the top 25, neither of which is patronymic (Rose Rosen and Rose Rosenberg).


Patronymic last names are not always signified by their endings. In some cases, it’s the beginning of the last name that gives it away. Such is the case with Gaelic last names (starting with Mc or Mac, or with O’ in Ireland, meaning "grandson of") and Norman ones (starting with Fitz). Among the 2201 different names where the last name ends with the first name, the top four are all patronymic. They are, in order: Donald MacDonald, Donald McDonald, Gerald Fitzgerald, and Patrick Fitzpatrick. Beyond those four, however, the top names are not dominated by patronymic last names. The top female name, Anna Hanna, is not patronymic. There are many examples of this type of accidental overlap, including Avis Davis, Edith Meredith, and Milton Hamilton. It should be noted that it is possible for a last name to both end and start with a first name. And so, Rosa Rosa-Rosa is included on both lists.
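For readers who want to try this kind of scan themselves, here is a minimal sketch of how the three overlap lists (identical, last name starts with the first, last name ends with the first) could be tallied. It is not the method used for this article. It assumes the records have already been reduced to (first, last) pairs, and the names `classify` and `tally` are made up for illustration.

```python
from collections import Counter

def classify(first, last):
    """Return the overlap categories that a (first, last) pair falls into."""
    f, l = first.strip().lower(), last.strip().lower()
    if not f or not l:
        return set()
    if f == l:
        # Identical names are counted on their own list only.
        return {"identical"}
    kinds = set()
    if l.startswith(f):
        kinds.add("starts_with_first")
    if l.endswith(f):
        kinds.add("ends_with_first")
    return kinds

def tally(records):
    """Count occurrences of each full name within each category."""
    counts = {
        "identical": Counter(),
        "starts_with_first": Counter(),
        "ends_with_first": Counter(),
    }
    for first, last in records:
        for kind in classify(first, last):
            counts[kind][(first, last)] += 1
    return counts

# A made-up handful of records, not real DMF data.
sample = [
    ("Thomas", "Thomas"),
    ("John", "Johnson"),
    ("John", "Johnson"),
    ("Donald", "MacDonald"),
    ("Rosa", "Rosa-Rosa"),
    ("Mary", "Smith"),
]
results = tally(sample)
print(results["starts_with_first"].most_common(1))
```

Because a name can match both the prefix and the suffix test, a pair like Rosa Rosa-Rosa lands on two lists at once, just as described above.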


Using a pronouncing dictionary, I scanned the DMF for cases where the last name rhymed with the first name. The dictionary file didn’t contain every possible name, so there may be others among the 87 million; however, the more common names do appear to be included. I uncovered 16,308 different rhyming first and last names, including Florence Lawrence, Doris Morris, and Nellie Kelley. Names like this, which might be considered more melodic, seem to be more prevalent among females. Four of the top five names are female (all with first name Mary), including the most common: Mary Perry. The most common male name is John Hogan at number 2. If you’re not sold that this is a bona fide rhyme, Paul Hall and John Hahn follow at 6 and 7, respectively. There were also 158 Ronald McDonalds on the list, though in 2014 Taco Bell managed to find a couple dozen more who are still alive.
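A pronouncing dictionary maps each word to its sequence of sounds, which makes a rhyme check fairly mechanical. The sketch below shows one common, strict definition: two names rhyme if their sounds match from the last stressed vowel to the end. The tiny dictionary is hand-written for illustration and is not the file used for this article, and the function names are hypothetical. The article's own matching was evidently looser, since it counts pairs such as John Hogan.

```python
# Hand-written, illustrative pronunciations in an ARPAbet-like notation.
# Vowels carry a stress digit (1 = primary stress, 0 = unstressed).
PRONUNCIATIONS = {
    "mary":   ["M", "EH1", "R", "IY0"],
    "perry":  ["P", "EH1", "R", "IY0"],
    "doris":  ["D", "AO1", "R", "IH0", "S"],
    "morris": ["M", "AO1", "R", "IH0", "S"],
    "nellie": ["N", "EH1", "L", "IY0"],
    "kelley": ["K", "EH1", "L", "IY0"],
    "john":   ["JH", "AA1", "N"],
    "smith":  ["S", "M", "IH1", "TH"],
}

def rhyme_part(phones):
    """Return the sounds from the last primary-stressed vowel to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return tuple(phones[i:])
    # No stressed vowel marked: fall back to comparing the whole word.
    return tuple(phones)

def names_rhyme(first, last, pronunciations=PRONUNCIATIONS):
    """True if both names are in the dictionary, differ, and share a rhyme part."""
    f, l = first.lower(), last.lower()
    if f == l or f not in pronunciations or l not in pronunciations:
        return False
    return rhyme_part(pronunciations[f]) == rhyme_part(pronunciations[l])

for pair in [("Mary", "Perry"), ("Doris", "Morris"), ("John", "Smith")]:
    print(pair, names_rhyme(*pair))
```

Names missing from the dictionary are simply skipped, which is why a scan like this can undercount.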


The DMF has some very rare last names that, due to minimum threshold requirements, don’t make it into the aggregate U.S. Census data. These include 43 different last names that are 16 characters or longer (last names in the most recent U.S. Census data max out at 15 characters). As a native of Greece, a country notorious for long last names, I had a hunch it would be a contest between Greek and Armenian last names. I was partially right in that Aghubgharehptiannej is most likely Armenian. Everybodytalksabout is Native American and Fernandezdelaportil is Spanish in origin. I excluded names with hyphens or spaces from my search; however, it does appear that all three of these may have been altered to merge previously distinct segments.

The next three longest are Persian (Amirsahansouzshani), Georgian (Dzhindzhikhashvili), and Laotian (Nanthovongdouangsy). The longest Greek name in the DMF was 17 characters (Papadimitropoulos).


Most of the people on the DMF were born before 1930, so names like Donald Duck (six occurrences), Homer Simpson (69 occurrences) or Joseph Stalin (one occurrence) may not have the same cultural significance for the parents who thought of these names. However, I located 20 names that would have raised eyebrows even a century ago. Finding peculiar last names is not something that can be accomplished via a simple algorithm, so I scanned the database for instances of remarkable names mentioned by Russell Ash, as well as a few of my own. The most popular is Mary Land (139 occurrences), but there's also Hazel Nutt, Robin Banks, Scott Free, and Pearly Gates.


Also from Russell Ash’s list, I scanned the DMF for occurrences of 16 different unfortunate first initials and last names. At the top of the rankings are 721 B. Wares and 375 B. Quicks. O. Heck, C. Below, and T. Hee all had more than 10 occurrences.

Damian Mac Con Uladh contributed research for this article. Further information and more extensive lists of results can be found in this post at Social Security Death Master File courtesy of

iStock // Ekaterina Minaeva
Man Buys Two Metric Tons of LEGO Bricks; Sorts Them Via Machine Learning

Jacques Mattheij made a small, but awesome, mistake. He went on eBay one evening and bid on a bunch of bulk LEGO brick auctions, then went to sleep. Upon waking, he discovered that he was the high bidder on many, and was now the proud owner of two tons of LEGO bricks. (This is about 4400 pounds.) He wrote, "[L]esson 1: if you win almost all bids you are bidding too high."

Mattheij had noticed that bulk, unsorted bricks sell for something like €10/kilogram, whereas sets are roughly €40/kg and rare parts go for up to €100/kg. Much of the value of the bricks is in their sorting. If he could reduce the entropy of these bins of unsorted bricks, he could make a tidy profit. While many people do this work by hand, the problem is enormous—just the kind of challenge for a computer. Mattheij writes:

There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth).

In the following months, Mattheij built a proof-of-concept sorting system using, of course, LEGO. He broke the problem down into a series of sub-problems (including "feeding LEGO reliably from a hopper is surprisingly hard," one of those facts of nature that will stymie even the best system design). After tinkering with the prototype at length, he expanded it into a surprisingly complex arrangement of conveyer belts (powered by a home treadmill), various pieces of cabinetry, and "copious quantities of crazy glue."

Here's a video showing the current system running at low speed:

The key part of the system was running the bricks past a camera paired with a computer running a neural net-based image classifier. That allows the computer (when sufficiently trained on brick images) to recognize bricks and thus categorize them by color, shape, or other parameters. Remember that as bricks pass by, they can be in any orientation, can be dirty, can even be stuck to other pieces. So having a flexible software system is key to recognizing—in a fraction of a second—what a given brick is, in order to sort it out. When a match is found, a jet of compressed air pops the piece off the conveyer belt and into a waiting bin.

After much experimentation, Mattheij rewrote the software (several times in fact) to accomplish a variety of basic tasks. At its core, the system takes images from a webcam and feeds them to a neural network to do the classification. Of course, the neural net needs to be "trained" by showing it lots of images, and telling it what those images represent. Mattheij's breakthrough was allowing the machine to effectively train itself, with guidance: Running pieces through allows the system to take its own photos, make a guess, and build on that guess. As long as Mattheij corrects the incorrect guesses, he ends up with a decent (and self-reinforcing) corpus of training data. As the machine continues running, it can rack up more training, allowing it to recognize a broad variety of pieces on the fly.
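The guess-then-correct loop described above can be sketched in a few lines. This is not Mattheij's software: a tiny nearest-centroid classifier stands in for his neural network, two made-up measurements stand in for a webcam image, and the `human` function plays the part of the person correcting wrong guesses. All names here are hypothetical.

```python
from collections import defaultdict

class CentroidClassifier:
    """Tiny nearest-centroid stand-in for the neural network."""

    def __init__(self):
        self.sums = {}                   # label -> per-feature running sums
        self.counts = defaultdict(int)   # label -> number of examples seen

    def train(self, features, label):
        sums = self.sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            sums[i] += x
        self.counts[label] += 1

    def predict(self, features):
        if not self.sums:
            return None  # nothing learned yet, so no guess
        def distance(label):
            n = self.counts[label]
            return sum((x - s / n) ** 2
                       for x, s in zip(features, self.sums[label]))
        return min(self.sums, key=distance)

def run_batch(model, pieces, correct):
    """Guess each piece, let a human fix the guess, then train on the result."""
    corrections = 0
    for features in pieces:
        guess = model.predict(features)
        label = correct(features, guess)  # human confirms or overrides
        if label != guess:
            corrections += 1
        model.train(features, label)
    return corrections

def human(features, guess):
    # Stand-in for manual review: (width, height) with a low height is a plate.
    return "plate" if features[1] < 0.5 else "brick"

model = CentroidClassifier()
first = run_batch(model, [(2.0, 1.0), (2.0, 0.3), (4.0, 1.1), (4.0, 0.35)], human)
second = run_batch(model, [(3.0, 1.0), (3.0, 0.3)], human)
print(first, second)
```

The point of the sketch is the shape of the loop: every confirmed or corrected guess becomes a new training example, so the number of corrections needed tends to drop as more pieces go through.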

Here's another video, focusing on how the pieces move on conveyer belts (running at slow speed so puny humans can follow). You can also see the air jets in action:

In an email interview, Mattheij told Mental Floss that the system currently sorts LEGO bricks into more than 50 categories. It can also be run in a color-sorting mode to bin the parts across 12 color groups. (Thus at present you'd likely do a two-pass sort on the bricks: once for shape, then a separate pass for color.) He continues to refine the system, with a focus on making its recognition abilities faster. At some point down the line, he plans to make the software portion open source. You're on your own as far as building conveyer belts, bins, and so forth.
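As a rough illustration of the two-pass idea, the sketch below bins parts by shape first and then bins each shape group by color. It assumes each part has already been given a shape label and a color label by the classifier; the dictionary keys and function names are invented for the example and are not taken from Mattheij's code.

```python
from collections import defaultdict

def sort_pass(parts, key):
    """One trip down the belt: drop each part into the bin chosen by `key`."""
    bins = defaultdict(list)
    for part in parts:
        bins[key(part)].append(part)
    return dict(bins)

def two_pass_sort(parts):
    """First pass bins by shape; second pass bins each shape group by color."""
    by_shape = sort_pass(parts, key=lambda p: p["shape"])
    return {
        shape: sort_pass(group, key=lambda p: p["color"])
        for shape, group in by_shape.items()
    }

# Made-up parts with hypothetical labels.
parts = [
    {"shape": "brick", "color": "red"},
    {"shape": "plate", "color": "blue"},
    {"shape": "brick", "color": "blue"},
    {"shape": "brick", "color": "red"},
]
sorted_bins = two_pass_sort(parts)
print({shape: {c: len(g) for c, g in colors.items()}
       for shape, colors in sorted_bins.items()})
```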

Check out Mattheij's writeup in two parts for more information. It starts with an overview of the story, followed up with a deep dive on the software. He's also tweeting about the project (among other things). And if you look around a bit, you'll find bulk LEGO brick auctions online—it's definitely a thing!

Cs California, Wikimedia Commons // CC BY-SA 3.0
How Experts Say We Should Stop a 'Zombie' Infection: Kill It With Fire

Scientists are known for being pretty cautious people. But sometimes, even the most careful of us need to burn some things to the ground. Immunologists have proposed a plan to burn large swaths of parkland in an attempt to wipe out disease, as The New York Times reports. They described the problem in the journal Microbiology and Molecular Biology Reviews.

Chronic wasting disease (CWD) is a gruesome infection that’s been destroying deer and elk herds across North America. Like bovine spongiform encephalopathy (BSE, better known as mad cow disease) and Creutzfeldt-Jakob disease, CWD is caused by damaged, contagious little proteins called prions. Although it's been half a century since CWD was first discovered, scientists are still scratching their heads about how it works, how it spreads, and whether, like BSE, it could someday infect humans.

Paper co-author Mark Zabel, of the Prion Research Center at Colorado State University, says animals with CWD fade away slowly at first, losing weight and starting to act kind of spacey. But "they’re not hard to pick out at the end stage," he told The New York Times. "They have a vacant stare, they have a stumbling gait, their heads are drooping, their ears are down, you can see thick saliva dripping from their mouths. It’s like a true zombie disease."

CWD has already been spotted in 24 U.S. states. Some herds are already 50 percent infected, and that number is only growing.

Prion illnesses often travel from one infected individual to another, but CWD’s expansion was so rapid that scientists began to suspect it had more than one way of finding new animals to attack.

Sure enough, it did. As it turns out, the CWD prion doesn’t go down with its host-animal ship. Infected animals shed the prion in their urine, feces, and drool. Long after the sick deer has died, others can still contract CWD from the leaves they eat and the grass in which they stand.

As if that’s not bad enough, CWD has another trick up its sleeve: spontaneous generation. That is, it doesn’t take much damage to twist a healthy prion into a zombifying pathogen. The illness just pops up.

There are some treatments, including immersing infected tissue in an ozone bath. But that won't help when the problem is literally smeared across the landscape. "You cannot treat half of the continental United States with ozone," Zabel said.

And so, to combat this many-pronged assault on our wildlife, Zabel and his colleagues are getting aggressive. They recommend a controlled burn of infected areas of national parks in Colorado and Arkansas—a pilot study to determine if fire will be enough.

"If you eliminate the plants that have prions on the surface, that would be a huge step forward," he said. "I really don’t think it’s that crazy."

[h/t The New York Times]