Inside the Netflix Recommendation Engine

Netflix makes a business out of getting subscribers to add tons of DVDs to a list of discs that will later be mailed out. Theoretically, the more discs in that list, the longer that subscriber will remain with the service, since new movies will just keep coming. So a big part of Netflix's business is recommending titles to subscribers based on what they've previously enjoyed. Netflix calls its recommendation system "Cinematch™."

In October 2006, Netflix announced The Netflix Prize, a $1 million cash award to anyone who could improve Cinematch™'s recommendation accuracy by 10%. In practice, "recommendation accuracy" means this: the system needs to get 10% better at predicting what a given user will think of a given movie, based on that user's prior movie preferences. Netflix asks users on its site to rate the movies it recommends (on a scale of 1 to 5 stars), and so is able to mine this kind of data from daily usage.
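To make "10% better" concrete: the contest scored entries by root mean squared error (RMSE) between predicted and actual star ratings, so a 10% improvement means an RMSE 10% lower than Cinematch's. Here is a minimal sketch in Python. The ratings and function names are made up for illustration and are not Netflix data or code.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty rating lists")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def improvement_pct(baseline_rmse, candidate_rmse):
    """Percent reduction in error relative to the baseline system."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical ratings, chosen only to keep the arithmetic easy to follow
actual = [4, 2, 5, 3, 1]                # what five users really thought
baseline = [3.0, 3.0, 4.0, 4.0, 2.0]    # stand-in for the existing system
candidate = [3.5, 2.5, 4.5, 3.5, 1.5]   # stand-in for a contest entry

gain = improvement_pct(rmse(baseline, actual), rmse(candidate, actual))
print(f"{gain:.1f}%")  # prints 50.0%
```

The toy numbers produce a far bigger gain than any real entry managed. On the real data, contestants fought for fractions of a percent.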

Two weeks ago, The New York Times ran a fantastic article on Cinematch™ and The Netflix Prize. The Times profiled various programmers who are trying to improve the recommendation system's accuracy. Here's a snippet:

Each time he or his kids think of a new approach, [Len] Bertoni writes a computer program to test it. Each new algorithm takes on average three or four hours to churn through the data on the family's "quad core" Gateway computer. Bertoni's results have gradually improved. When I last spoke to him, he was at No. 8 on the leader board; his program was 8.8 percent better than Cinematch. The top team was at 9.44 percent. Bertoni said he thought he was within striking distance of victory.

But his progress had slowed to a crawl. The more Bertoni improved upon Netflix, the harder it became to move his number forward. This wasn't just his problem, though; the other competitors say that their progress is stalling, too, as they edge toward 10 percent. Why?

Bertoni says it's partly because of "Napoleon Dynamite," an indie comedy from 2004 that achieved cult status and went on to become extremely popular on Netflix. It is, Bertoni and others have discovered, maddeningly hard to determine how much people will like it. ...

Read the rest (and be sure to watch the accompanying video) for a surprisingly technical, but very readable, look into the technology behind recommendations.

Man Buys Two Metric Tons of LEGO Bricks; Sorts Them Via Machine Learning
iStock // Ekaterina Minaeva

Jacques Mattheij made a small, but awesome, mistake. He went on eBay one evening and bid on a bunch of bulk LEGO brick auctions, then went to sleep. Upon waking, he discovered that he was the high bidder on many, and was now the proud owner of two metric tons of LEGO bricks. (That is about 4,400 pounds.) He wrote, "[L]esson 1: if you win almost all bids you are bidding too high."

Mattheij had noticed that bulk, unsorted bricks sell for something like €10/kilogram, whereas sets are roughly €40/kg and rare parts go for up to €100/kg. Much of the value of the bricks is in their sorting. If he could reduce the entropy of these bins of unsorted bricks, he could make a tidy profit. While many people do this work by hand, the problem is enormous—just the kind of challenge for a computer. Mattheij writes:

There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth).
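The figures above can be turned into rough numbers. This is a back-of-the-envelope sketch, not Mattheij's accounting: it assumes the whole two-ton haul could be sold at a single flat price, ignores labor, shipping, and fees, and treats every shape as available in every color, which overstates the real catalog.

```python
HAUL_KG = 2_000  # two metric tons, per the article

# Approximate prices quoted in the article, in euros per kilogram
PRICE_EUR_PER_KG = {"unsorted bulk": 10, "sets": 40, "rare parts": 100}

def haul_value_eur(kg, price_per_kg):
    """Value of `kg` kilograms of bricks at a flat per-kilogram price."""
    return kg * price_per_kg

values = {label: haul_value_eur(HAUL_KG, price)
          for label, price in PRICE_EUR_PER_KG.items()}
for label, value in values.items():
    print(f"{label}: about €{value:,}")

# Rough size of the sorting problem, assuming any shape can come in any color
SHAPES = 38_000
COLORS = 100
combinations = SHAPES * COLORS
print(f"shape/color combinations: on the order of {combinations:,}")
```

At these prices the same pile of plastic is worth roughly €20,000 unsorted and €80,000 if it could all be sold as sets, which is the gap that makes automated sorting tempting.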

In the following months, Mattheij built a proof-of-concept sorting system using, of course, LEGO. He broke the problem down into a series of sub-problems (including "feeding LEGO reliably from a hopper is surprisingly hard," one of those facts of nature that will stymie even the best system design). After tinkering with the prototype at length, he expanded it into a surprisingly complex arrangement of conveyer belts (powered by a home treadmill), various pieces of cabinetry, and "copious quantities of crazy glue."

Here's a video showing the current system running at low speed:

The key part of the system was running the bricks past a camera paired with a computer running a neural net-based image classifier. That allows the computer (when sufficiently trained on brick images) to recognize bricks and thus categorize them by color, shape, or other parameters. Remember that as bricks pass by, they can be in any orientation, can be dirty, can even be stuck to other pieces. So having a flexible software system is key to recognizing—in a fraction of a second—what a given brick is, in order to sort it out. When a match is found, a jet of compressed air pops the piece off the conveyer belt and into a waiting bin.
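The decision step described above (look at a brick, name it, fire the right air jet) might be sketched like this. It is an illustration only, not Mattheij's software: the `Prediction` type, the `choose_bin` function, the confidence threshold, and the canned classifier standing in for the neural net are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class Prediction:
    label: str         # e.g. "2x4 brick"
    confidence: float  # 0.0 to 1.0

def choose_bin(image: str,
               classify: Callable[[str], Prediction],
               bin_for_label: Dict[str, int],
               min_confidence: float = 0.8) -> Optional[int]:
    """Return the bin whose air jet should fire, or None to let the piece ride on."""
    prediction = classify(image)
    if prediction.confidence < min_confidence:
        return None  # too uncertain (dirty, stuck together): fire no jet
    return bin_for_label.get(prediction.label)  # None if no bin is assigned

# Stand-in for the neural net: returns canned answers keyed by image name
def fake_classify(image: str) -> Prediction:
    canned = {
        "clean_2x4.png": Prediction("2x4 brick", 0.97),
        "dirty_plate.png": Prediction("1x2 plate", 0.55),
        "wheel.png": Prediction("wheel", 0.91),
    }
    return canned[image]

BINS = {"2x4 brick": 3, "1x2 plate": 7}

print(choose_bin("clean_2x4.png", fake_classify, BINS))    # 3
print(choose_bin("dirty_plate.png", fake_classify, BINS))  # None (low confidence)
print(choose_bin("wheel.png", fake_classify, BINS))        # None (no bin for wheels)
```

Letting uncertain pieces pass untouched is one plausible design choice: a brick that stays on the belt can be run through again, while a brick blown into the wrong bin has to be fished out by hand.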

After much experimentation, Mattheij rewrote the software (several times in fact) to accomplish a variety of basic tasks. At its core, the system takes images from a webcam and feeds them to a neural network to do the classification. Of course, the neural net needs to be "trained" by showing it lots of images, and telling it what those images represent. Mattheij's breakthrough was allowing the machine to effectively train itself, with guidance: Running pieces through allows the system to take its own photos, make a guess, and build on that guess. As long as Mattheij corrects the incorrect guesses, he ends up with a decent (and self-reinforcing) corpus of training data. As the machine continues running, it can rack up more training, allowing it to recognize a broad variety of pieces on the fly.
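The guess-then-correct loop described above can be sketched in a few lines. This assumes a simplified setup in which images are just file names and the "human" is a lookup table. The names `build_corpus`, `model_guess`, and `human_review` are hypothetical and not taken from Mattheij's code.

```python
from typing import Callable, Dict, List, Tuple

def build_corpus(images: List[str],
                 guess: Callable[[str], str],
                 review: Callable[[str, str], str]
                 ) -> Tuple[List[Tuple[str, str]], int]:
    """Label each image with the machine's guess unless a reviewer overrides it.

    Returns the labeled corpus and the number of guesses that needed correcting.
    """
    corpus: List[Tuple[str, str]] = []
    corrections = 0
    for image in images:
        guessed = guess(image)
        confirmed = review(image, guessed)  # reviewer returns the final label
        if confirmed != guessed:
            corrections += 1
        corpus.append((image, confirmed))
    return corpus, corrections

# A deliberately naive "model" that calls everything a 2x4 brick
def model_guess(image: str) -> str:
    return "2x4 brick"

# Stand-in for the human: a table of correct answers
TRUTH: Dict[str, str] = {
    "a.png": "2x4 brick",
    "b.png": "1x2 plate",
    "c.png": "2x4 brick",
}

def human_review(image: str, guessed: str) -> str:
    return TRUTH[image]

corpus, corrections = build_corpus(list(TRUTH), model_guess, human_review)
print(corrections)  # 1
```

The payoff is that the human only touches the wrong guesses. As the model is retrained on the growing corpus, the number of corrections per batch should fall.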

Here's another video, focusing on how the pieces move on conveyer belts (running at slow speed so puny humans can follow). You can also see the air jets in action:

In an email interview, Mattheij told Mental Floss that the system currently sorts LEGO bricks into more than 50 categories. It can also be run in a color-sorting mode to bin the parts across 12 color groups. (Thus at present you'd likely do a two-pass sort on the bricks: once for shape, then a separate pass for color.) He continues to refine the system, with a focus on making its recognition abilities faster. At some point down the line, he plans to make the software portion open source. You're on your own as far as building conveyer belts, bins, and so forth.
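The two-pass sort mentioned above amounts to grouping twice with different keys. Here is a minimal sketch, assuming each brick is already described by a shape and a color. The dictionary layout and function names are invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Brick = Dict[str, str]

def sort_pass(bricks: Iterable[Brick],
              key: Callable[[Brick], str]) -> Dict[str, List[Brick]]:
    """One run through the machine: drop each brick into the bin named by `key`."""
    bins: Dict[str, List[Brick]] = defaultdict(list)
    for brick in bricks:
        bins[key(brick)].append(brick)
    return dict(bins)

def two_pass_sort(bricks: Iterable[Brick]) -> Dict[str, Dict[str, List[Brick]]]:
    """First pass bins by shape; second pass re-runs each shape bin by color."""
    by_shape = sort_pass(bricks, key=lambda b: b["shape"])
    return {shape: sort_pass(group, key=lambda b: b["color"])
            for shape, group in by_shape.items()}

bricks = [
    {"shape": "brick", "color": "red"},
    {"shape": "plate", "color": "red"},
    {"shape": "brick", "color": "blue"},
    {"shape": "brick", "color": "red"},
]

result = two_pass_sort(bricks)
print(len(result["brick"]["red"]))  # 2
```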

Check out Mattheij's writeup in two parts for more information. It starts with an overview of the story, followed up with a deep dive on the software. He's also tweeting about the project (among other things). And if you look around a bit, you'll find bulk LEGO brick auctions online—it's definitely a thing!

200 Health Experts Call for Ban on Two Antibacterial Chemicals
iStock

In September 2016, the U.S. Food and Drug Administration (FDA) issued a ban on certain antibacterial ingredients in soap and body wash. But a large collective of scientists and medical professionals says the agency should have done more to stop the spread of harmful chemicals into our bodies and environment, most notably the antimicrobials triclosan and triclocarban. They published their recommendations in the journal Environmental Health Perspectives.

The 2016 report from the FDA concluded that 19 of the most commonly used antimicrobial ingredients are no more effective than ordinary soap and water, and forbade their use in soap and body wash.

"Customers may think added antimicrobials are a way to reduce infections, but in most products there is no evidence that they do," Ted Schettler, science director of the Science and Environmental Health Network, said in a statement.

Studies have shown that these chemicals may actually do more harm than good. They don't keep us from getting sick, but they can contribute to the development of antibiotic-resistant bacteria, also known as superbugs. Triclosan and triclocarban can also disrupt our hormones and immune systems.

And while they may no longer be appearing on our bathroom sinks or shower shelves, they're still all around us. They've leached into the environment from years of use. They're also still being added to a staggering array of consumer products, as companies create "antibacterial" clothing, toys, yoga mats, paint, food storage containers, electronics, doorknobs, and countertops.

The authors of the new consensus statement say it's time for that to stop.

"We must develop better alternatives and prevent unneeded exposures to antimicrobial chemicals," Rolf Halden of Arizona State University said in the statement. Halden researches where mass-produced chemicals wind up in the environment.

The statement notes that many manufacturers have simply replaced the banned chemicals with others. "I was happy that the FDA finally acted to remove these chemicals from soaps," said Arlene Blum, executive director of the Green Science Policy Institute. "But I was dismayed to discover at my local drugstore that most products now contain substitutes that may be worse."

Blum, Halden, Schettler, and their colleagues "urge scientists, governments, chemical and product manufacturers, purchasing organizations, retailers, and consumers" to avoid antimicrobial chemicals outside of medical settings. "Where antimicrobials are necessary," they write, we should "use safer alternatives that are not persistent and pose no risk to humans or ecosystems."

They recommend that manufacturers label any products containing antimicrobial chemicals so that consumers can avoid them, and they call for further research into the impacts of these compounds on us and our planet.
