In the Beginning: It's almost here!

Reference Books

Get ready for some serious volume control as we introduce you to the legends behind the Encyclopedia, the Dictionary, and the Thesaurus.

The Encyclopedia

"Encyclopedia" is a strange word. It goes back to Greek for "the things of boys/children in a circle," which makes about as much sense to us as it does to you. The first encyclopedia actually predates the word itself (it was written in 1270 B.C.E. in Syria), and the concept seems to have also occurred to the Romans; Pliny the Elder was renowned for attempting to compress all the scientific knowledge of his time into 37 volumes. Much later, in 1408 during the Ming Dynasty in China, the Emperor Yongle oversaw the writing of one of the largest encyclopedias in history, at 11,000 volumes and 370 million characters, all handwritten. (Only about 400 of the volumes still exist.) But the idea of an encyclopedia doesn't seem to have translated to English until 1768, when three Scots started setting down the first edition of the Encyclopaedia Britannica, one pamphlet at a time. It's not even remotely recognizable as the encyclopedia we have today. Horse diseases got 39 pages, and the editors also made a point of calculating the number of species on Noah's ark (177). The word "woman," on the other hand, got just four words: "the female of man."

Samuel Johnson and the Dictionary of the English Language

When Johnson published his seminal work in 1755, it wasn't just a dictionary; it was the dictionary, and it pretty much held on to the title until well over 100 years later, when the Oxford English Dictionary finally overtook it (see below). A few English dictionaries had been published before, but none was nearly as comprehensive. Nor did any use quotations to illustrate how the words should be used. Of course, during the nine long years he spent writing it, Johnson didn't necessarily have any way of knowing how important his dictionary would turn out to be. Certainly, no one else seemed to know either. The only patronage Johnson could get for the book was the measly sum of ten pounds from one Lord Chesterfield, who realized his mistake only when he saw early drafts of the finished work. Trying to make up for slighting Johnson (and perhaps also trying to get future editions of the book dedicated to himself, as they would have been had he supplied more money in the first place), Chesterfield wrote several glowing reviews of the dictionary in popular magazines of the time. Johnson was not amused and wrote the Lord a nasty note, containing several chestnuts including the famous line, "Is not a Patron, my Lord, one who looks with unconcern on a Man struggling for Life in the water and when he has reached ground encumbers him with help?" Zing!

James Murray, W. C. Minor, and the Oxford English Dictionary

Over the 100 years after the publication of Johnson's dictionary, the English language changed quite a bit. So in 1857 the London Philological Society decided it was high time for a new dictionary and set out on a grand quest to put one together, using mailed-in contributions from thousands of learned men and perhaps a few pseudonymous women. (We wonder, does this make the Oxford English Dictionary the world's first wiki?) The project had a few false starts: one editor died a year into it, and another spent well over a decade preparing for what was supposed to be a ten-year project in the first place. Finally, the lexicographer James Murray took over in 1879, a commitment that would occupy the rest of his time on Earth, and then some. By 1884, Murray and his team of volunteers had gotten as far down their list as "ant." The final result wouldn't be published until 1928, 13 years after Murray's death. The dictionary's other main contributor, W. C. Minor, didn't live to see it completed either. Most of Minor's contributions were sent in from an asylum in England; a veteran with evidence of battle trauma, he had been confined there after shooting and killing a man in 1872. Minor was later diagnosed with schizophrenia. In the years to come his condition deteriorated so badly that he cut off his own penis. He died, impoverished and hospital-bound, in 1920.

Noah Webster and an American Dictionary of the English Language

If you're impressed by Sam Johnson's nine years of slaving on his dictionary, you'll be blown away by Webster, who started his at age 43 and finished in 1828 at 70. As a young Yale grad and member of the bar, Webster grew uninterested in practicing law, so he moved to teaching. While in the classroom, though, he noticed a dearth of quality textbooks, so he wrote the iconic "blue-backed speller," a basic textbook used in classrooms for decades. In fact, the book has never been out of print since, and estimated sales are as high as 100,000,000 copies! He's also responsible for founding New York City's first daily newspaper, American Minerva, in 1793. Webster's editorials in the Minerva got quite a reaction: he was called "a pusillanimous, half-begotten, self-dubbed patriot," "an incurable lunatic," "a toad in the service of sans-cullottism," "a prostitute wretch," "a great fool, and a barefaced liar," "a spiteful viper," and "a maniacal pedant." And yet when he died he was considered an American hero, partly because his dictionary wasn't just supremely useful; it was an expression of national pride. Webster's the guy you have to thank for a number of linguistic differences between Americans and the British: "color" instead of "colour," for instance. Basically, Webster was a man on a mission. Noticing that Americans were developing lots of regional tics and dialects, he wanted to make sure everyone sounded at the very least like they were speaking the same language. More important, though, he didn't want people sounding like the Brits.

Roget's Thesaurus

Peter Mark Roget had a good bit of experience with reference books by the time he decided to write the world's best-known thesaurus: a physician, he was one of the Encyclopedia Britannica's major contributors on medical topics. And he'd been compiling a list of words for half a century. So in 1852, he released Roget's Thesaurus of English Words and Phrases, Classified and Arranged so as to Facilitate the Expression of Ideas and Assist in Literary Composition. Organized by categories instead of alphabetically, and lacking many of the features that had appeared in the 40-odd thesauruses (thesauri?) published before then, the work mystified most people at first. That is until they realized it could instantly make them sound smarter. By the time Roget died, he had personally overseen 25 reprintings. The thesaurus would continue to be updated many, many times after that, often serving as a lens for the culture of its times. Time magazine, in 1930, reported that a Jewish advocacy group had "flayed the Crowell company for perpetrating Roget's shameful connotations of the word Jew: cunning, usurer, rich, extortioner, heretic, deceiver, impostor, harpy, schemer, lick-penny, pinchfist, Shylock, chicanery, duplicity, crafty." The word "Jew" was soon deleted, expunged, erased, edited out, and removed from the book.

Man Buys Two Metric Tons of LEGO Bricks; Sorts Them Via Machine Learning
May 21, 2017
Jacques Mattheij made a small, but awesome, mistake. He went on eBay one evening and bid on a bunch of bulk LEGO brick auctions, then went to sleep. Upon waking, he discovered that he was the high bidder on many, and was now the proud owner of two tons of LEGO bricks. (This is about 4400 pounds.) He wrote, "[L]esson 1: if you win almost all bids you are bidding too high."

Mattheij had noticed that bulk, unsorted bricks sell for something like €10/kilogram, whereas sets are roughly €40/kg and rare parts go for up to €100/kg. Much of the value of the bricks is in their sorting. If he could reduce the entropy of these bins of unsorted bricks, he could make a tidy profit. While many people do this work by hand, the problem is enormous—just the kind of challenge for a computer. Mattheij writes:

There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth).

In the following months, Mattheij built a proof-of-concept sorting system using, of course, LEGO. He broke the problem down into a series of sub-problems (including "feeding LEGO reliably from a hopper is surprisingly hard," one of those facts of nature that will stymie even the best system design). After tinkering with the prototype at length, he expanded it into a surprisingly complex arrangement of conveyer belts (powered by a home treadmill), various pieces of cabinetry, and "copious quantities of crazy glue."

Here's a video showing the current system running at low speed:

The key part of the system was running the bricks past a camera paired with a computer running a neural net-based image classifier. That allows the computer (when sufficiently trained on brick images) to recognize bricks and thus categorize them by color, shape, or other parameters. Remember that as bricks pass by, they can be in any orientation, can be dirty, can even be stuck to other pieces. So having a flexible software system is key to recognizing—in a fraction of a second—what a given brick is, in order to sort it out. When a match is found, a jet of compressed air pops the piece off the conveyer belt and into a waiting bin.

After much experimentation, Mattheij rewrote the software (several times in fact) to accomplish a variety of basic tasks. At its core, the system takes images from a webcam and feeds them to a neural network to do the classification. Of course, the neural net needs to be "trained" by showing it lots of images, and telling it what those images represent. Mattheij's breakthrough was allowing the machine to effectively train itself, with guidance: Running pieces through allows the system to take its own photos, make a guess, and build on that guess. As long as Mattheij corrects the incorrect guesses, he ends up with a decent (and self-reinforcing) corpus of training data. As the machine continues running, it can rack up more training, allowing it to recognize a broad variety of pieces on the fly.
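
The guess-and-correct loop described above can be sketched in a few lines of Python. This is only an illustration, not Mattheij's software: a toy nearest-centroid classifier stands in for the neural network, short lists of numbers stand in for webcam images, and the names CentroidClassifier, run_batch, and review are all hypothetical.

```python
from collections import defaultdict


class CentroidClassifier:
    """Toy stand-in for the neural net: predicts the label whose
    average feature vector is closest to the input."""

    def __init__(self):
        self.sums = {}                  # label -> running sum of feature vectors
        self.counts = defaultdict(int)  # label -> number of examples seen

    def train(self, features, label):
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        if not self.sums:
            return None  # nothing learned yet, so no guess

        def squared_distance(label):
            centroid = [s / self.counts[label] for s in self.sums[label]]
            return sum((c - f) ** 2 for c, f in zip(centroid, features))

        return min(self.sums, key=squared_distance)


def run_batch(classifier, batch, review):
    """Guess a label for each item, let a human reviewer confirm or
    correct the guess, and train on the reviewed label.
    Returns how many guesses had to be corrected."""
    corrections = 0
    for features in batch:
        guess = classifier.predict(features)
        label = review(features, guess)  # the human-in-the-loop step
        if label != guess:
            corrections += 1
        classifier.train(features, label)
    return corrections
```

In this sketch, review would be a person looking at each photo and either accepting or fixing the machine's guess. As more batches run, the count returned by run_batch should tend to drop, which is the self-reinforcing effect the paragraph describes.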

Here's another video, focusing on how the pieces move on conveyer belts (running at slow speed so puny humans can follow). You can also see the air jets in action:

In an email interview, Mattheij told Mental Floss that the system currently sorts LEGO bricks into more than 50 categories. It can also be run in a color-sorting mode to bin the parts across 12 color groups. (Thus at present you'd likely do a two-pass sort on the bricks: once for shape, then a separate pass for color.) He continues to refine the system, with a focus on making its recognition abilities faster. At some point down the line, he plans to make the software portion open source. You're on your own as far as building conveyer belts, bins, and so forth.
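
The two-pass idea (shape first, then color) can be pictured as one binning step applied twice. The sketch below is an assumption-laden illustration rather than anything from Mattheij's code: parts are plain dictionaries, and the field names "shape" and "color_group" and the functions sort_pass and two_pass_sort are made up for the example.

```python
from collections import defaultdict


def sort_pass(parts, key):
    """One trip down the belt: drop each part into the bin chosen by key(part)."""
    bins = defaultdict(list)
    for part in parts:
        bins[key(part)].append(part)
    return dict(bins)


def two_pass_sort(parts):
    """First pass bins by shape; second pass re-runs each shape bin by color group."""
    result = {}
    by_shape = sort_pass(parts, lambda p: p["shape"])
    for shape, shape_bin in by_shape.items():
        result[shape] = sort_pass(shape_bin, lambda p: p["color_group"])
    return result
```

Calling two_pass_sort on a list of parts returns a nested dictionary keyed by shape and then by color group, mirroring bins that have each been run through the machine a second time.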

Check out Mattheij's writeup in two parts for more information. It starts with an overview of the story, followed up with a deep dive on the software. He's also tweeting about the project (among other things). And if you look around a bit, you'll find bulk LEGO brick auctions online—it's definitely a thing!

What Happened to Jamie and Aurelia From Love Actually?
May 26, 2017
Fans of the romantic comedy Love Actually recently got a bonus reunion in the form of Red Nose Day Actually, a short charity special that gave audiences a peek at where their favorite characters ended up almost 15 years later.

One of the most improbable pairings from the original film was between Jamie (Colin Firth) and Aurelia (Lúcia Moniz), who fell in love despite almost no shared vocabulary. Jamie is English, and Aurelia is Portuguese, and they know just enough of each other’s native tongues for Jamie to propose and Aurelia to accept.

A decade and a half on, they have both improved their knowledge of each other’s languages—if not perfectly, in Jamie’s case. But apparently, their love is much stronger than his grasp of Portuguese grammar, because they’ve got three bilingual kids and another on the way. (And they still enjoy having important romantic moments in the car.)

In 2015, Love Actually script editor Emma Freud revealed via Twitter what happened between Karen and Harry (Emma Thompson and Alan Rickman, who passed away last year). Most of the other couples get happy endings in the short—even if Hugh Grant's character hasn't gotten any better at dancing.

