My Trip to MIT's Sports Nerd Conference

[Image credit: John Marcus.]

On Saturday, I had the pleasure of attending the MIT Sloan Sports Analytics Conference—or the Sports Nerd Conference, as my girlfriend referred to it—at the Boston Convention and Exhibition Center. In its fourth year, the conference brought some of the sports industry's most innovative thinkers together for a forum on the expanding role of analytics in projecting player performance and informing in-game decision making.

The conference wasn't only for stat heads, however; it also featured panel discussions on such topics as international expansion, social media marketing, and the future of sports journalism. As a former psychology major who has only recently delved into the world of advanced sports analytics (and only then in an attempt to gain an advantage in my fantasy drafts), I found this refreshing. Here's a brief, stats-light summary of three of the analytics-related panel discussions I observed.

Baseball Analytics

ESPN.com baseball writer Rob Neyer moderated a group that included three current front office executives (St. Louis Cardinals assistant general manager John Abbamondi, Arizona Diamondbacks director of baseball operations Shiraz Rehman, and Boston Red Sox advisor Tom Tippett), as well as former Red Sox general manager Dan Duquette and John Dewan, who founded Baseball Info Solutions in 2002 after a career as an insurance actuary.

[Image credit: John Marcus.]

Neyer opened the discussion by referencing a phenomenon described by Wellesley political scientist Craig Murphy in a recent New Yorker profile of Paul Krugman. Murphy noted that sixteenth-century maps of Africa were misleading, but they included pertinent information about the continent's interior, including the location of major rivers. As mapmaking became more accurate and the standards rose for what information could be included on a map, secondhand travelers' reports were discarded and lost. As a result, the maps included less information than before. By the nineteenth century, the maps were filled in again, but for a period the sharpening of technique caused loss as well as gain. Neyer used the example to illustrate the challenge facing today's baseball executives, who have more statistical information at their fingertips than ever before but continue to struggle to make sense of it and use it effectively.

"There are so many teams that we meet with that don't understand how to use the data that's out there," said Dewan, who consults with several MLB clubs. Abbamondi indicated that knowing which stats not to look at, in terms of predictive value, is just as important as knowing which stats are useful. That goes both for information used by the front office to make personnel decisions and for information that is passed on to players with the intent of giving them an edge. With a little research, anyone can discover what Albert Pujols' batting average is on Tuesdays with a 3-1 count on natural grass against a pitcher whose last name starts with the letter B. That may be interesting information to know (or not), but it probably won't affect how Pujols or Joe Blanton approaches their next encounter on a Tuesday at Busch Stadium. "The last thing you want is the hitter's mind cluttered," Abbamondi said.

The panelists discussed defensive analytics at length, including the concept of catcher defense, which attempts to quantify a catcher's ability to block pitches and manage a game. Catcher defense helps explain why Jason Varitek, who is a poor fantasy option, is an underappreciated contributor to the success of the Red Sox. "Defensive evaluation is taking its proper place in overall player analysis," said Tippett, who provides analytical support for Red Sox general manager Theo Epstein.

Neyer asked the panelists what they would like to know about baseball that they don't already know. "How to make sure the Yankees never win another World Series," Tippett said, eliciting cheers from the Red Sox fans in the room. Duquette wanted to know how to produce 20-game winners. Rehman echoed something Abbamondi mentioned earlier in the discussion about finding an accurate way to measure a player's makeup or personality. To a scout, Abbamondi said, good makeup is often synonymous with politeness. If a player says "yes, sir" and "no, sir," the scout is more likely to report that the player has good makeup, even if this tells the front office nothing about that player's work ethic, desire, and motivation. As they continue to look for ways to identify the next superstar, teams are focused on finding predictive psychological measures for young players.

Emerging Analytics

New England Patriots head coach Bill Belichick's controversial decision to go for it on 4th and 2 from his own 28 with 2:08 remaining and his team nursing a six-point lead against the undefeated Colts last November was a hot topic on at least two panels, including this one, which was moderated by Philadelphia Inquirer reporter Kate Fagan.

Kevin Faulk was stopped short of the first-down marker after catching a pass in the flat from Tom Brady, allowing the Colts to take over on downs. Peyton Manning led his team to a game-winning touchdown and Belichick was criticized afterward. Aaron Schatz, a Brown graduate who wrote the Internet column "The Lycos 50" before working as a disc jockey and founding FootballOutsiders.com, a site that uses innovative statistics to analyze football, defended Belichick's call.
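
The statistical case for going for it comes down to comparing two win probabilities. Here is a minimal sketch of that comparison. Every probability below is a made-up placeholder, not a figure from Schatz or from any published model, and the function and variable names are hypothetical.

```python
def win_prob_go_for_it(p_convert, p_stop_after_fail):
    """Win probability when going for it.

    Simplifying assumption: converting the first down ends the game,
    while failing hands the opponent a short field.
    """
    return p_convert * 1.0 + (1.0 - p_convert) * p_stop_after_fail


def win_prob_punt(p_stop_after_punt):
    """Win probability when punting: the defense must stop a longer drive."""
    return p_stop_after_punt


# Placeholder inputs chosen only for illustration.
P_CONVERT = 0.60           # chance of gaining the 2 yards
P_STOP_SHORT_FIELD = 0.50  # chance the defense holds after a failed try
P_STOP_LONG_FIELD = 0.70   # chance the defense holds after a punt

go = win_prob_go_for_it(P_CONVERT, P_STOP_SHORT_FIELD)
punt = win_prob_punt(P_STOP_LONG_FIELD)
print(f"go for it: {go:.2f}, punt: {punt:.2f}")
```

With these placeholder inputs, going for it comes out ahead, but the ranking flips if the defense is much more likely to hold after a punt. The inputs matter far more than the arithmetic.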

[Image credit: John Marcus.]

"Statistically, it was the right decision," said Schatz (pictured), who admitted he is a Patriots fan. While the statistical models used to reach that conclusion are not perfect, Schatz made the case that the decision was not stupid. But stupid is exactly how many media members saw it. As Schatz pointed out, for the remainder of the season broadcasters referred to subsequent decisions by NFL head coaches that they perceived as boneheaded as "Belichickian."

Schatz and San Francisco 49ers Executive VP of Football and Business Operations Paraag Marathe had some interesting things to say about the NFL scouting combine. Marathe compared evaluating rookie football players by having them "play track and field" at the scouting combine to evaluating rookie baseball players by having them play ping-pong. Marathe and Schatz both emphasized the importance of evaluating players in the context of the scheme that they play in and the abilities of the players around them. Football analytics has lagged behind baseball analytics, they said, in part because it is inherently more difficult to evaluate one player's ability without accounting for what the 10 other players on his team did on a given play. If a running back breaks a 25-yard run, for instance, was it because he made a great cut, his fullback made a great block, or his offensive line cleared a huge hole? Perhaps it was for all three reasons.

Like the baseball executives who spoke before him, Marathe discussed the growing emphasis being placed on measuring players' personality traits. Marathe and Schatz said a poor score on the infamous Wonderlic test administered to prospects at the NFL's scouting combine might raise a red flag for teams (if only because it could indicate that the player doesn't take his draft prospects seriously enough to find someone to help him prepare for the test), but that psychological traits related to dedication, motivation, and self-efficacy are more predictive of future success.

What Geeks Don't Get: The Limits of Moneyball

Michael Lewis, who wrote Moneyball and The Blind Side, moderated the feature panel, which included ESPN.com columnist Bill Simmons, Dallas Mavericks owner Mark Cuban, Houston Rockets general manager Daryl Morey, Indianapolis Colts general manager Bill Polian, and New England Patriots president Jonathan Kraft. After Lewis introduced the panel, Simmons congratulated the audience of more than 1,000 on breaking the "Most Dudes in a Conference Room" record. He was only partly kidding.

[Image credit: John Marcus.]

The goal of the panel was to unmask some of the inefficiencies of sports analytics and identify how numbers don't always tell the whole story in sports. Simmons argued that the onus should fall on the people disseminating all of this new statistical information to explain it in a way that the casual fan can understand. Polian, who has been to four Super Bowls as an executive with the Bills and Colts, said that geekdom provides wonderful tools for teams to find the next undervalued player, but requested that stat heads "speak English, please."

Belichick's decision came up again, with Polian, Kraft, and Simmons, who wrote a column criticizing the move, engaging in a fascinating back-and-forth. Kraft said he was convinced it was two-down territory for the Patriots on third down, and Polian indicated that he was, too, "without question." The Patriots were beaten up defensively and the Colts had moved the ball at will in the second half, their thinking went. If Indianapolis got the ball back, they were going to score. Simmons said he thought the decision to go for it made sense, but that the events preceding the decision (calling a timeout after throwing an incompletion on third down) and the fourth-down play call didn't come from a position of strength. "It seemed panicky to me," Simmons said. "That's my opinion." "I disagree," Kraft said bluntly (pictured below, on left, with Polian and Simmons).

[Image credit: John Marcus.]

The discussion turned to basketball and it was no surprise that when asked to name the biggest inefficiencies in basketball, Cuban mentioned referees. Cuban and Morey stressed the importance of finding players with the right psychological makeup to complement their skills on the court, but they had differing opinions on the value of a player who the stats indicate performs well in the clutch. Cuban said that part of the reason the Mavericks traded for Jason Kidd was that, statistically, he performs better in clutch situations than at other points in the game. Morey expressed concern about the sample size for measuring a player's "clutchness," and said he didn't factor clutch statistics into his personnel decisions.
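
Morey's sample-size worry can be illustrated with a quick calculation. The sketch below uses the standard normal-approximation confidence interval for a proportion. The shooting percentage and shot counts are invented for illustration and do not describe Kidd or any real player, and the function name is hypothetical.

```python
import math


def proportion_ci(successes, attempts, z=1.96):
    """Approximate 95% confidence interval for a success rate.

    Uses the normal (Wald) approximation, which is rough for small samples.
    """
    p = successes / attempts
    half_width = z * math.sqrt(p * (1.0 - p) / attempts)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical player who makes 50% of shots in both samples.
clutch_low, clutch_high = proportion_ci(successes=20, attempts=40)
season_low, season_high = proportion_ci(successes=500, attempts=1000)

print(f"clutch (40 shots):   {clutch_low:.3f} to {clutch_high:.3f}")
print(f"season (1000 shots): {season_low:.3f} to {season_high:.3f}")
```

The interval around the 40-shot sample is about five times wider than the one around the 1,000-shot sample, so an apparent clutch edge of a few percentage points is hard to separate from noise.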

Toward the end of the session, Polian raised an important question: Once you identify a tendency using analytics, can you make it better? If you have the answer to that, or you have developed a personality test that can predict athletic success, there's a job in professional sports waiting for you.

Man Buys Two Metric Tons of LEGO Bricks; Sorts Them Via Machine Learning

May 21, 2017

[Image credit: iStock // Ekaterina Minaeva.]

Jacques Mattheij made a small, but awesome, mistake. He went on eBay one evening and bid on a bunch of bulk LEGO brick auctions, then went to sleep. Upon waking, he discovered that he was the high bidder on many, and was now the proud owner of two tons of LEGO bricks. (This is about 4400 pounds.) He wrote, "[L]esson 1: if you win almost all bids you are bidding too high."

Mattheij had noticed that bulk, unsorted bricks sell for something like €10/kilogram, whereas sets are roughly €40/kg and rare parts go for up to €100/kg. Much of the value of the bricks is in their sorting. If he could reduce the entropy of these bins of unsorted bricks, he could make a tidy profit. While many people do this work by hand, the problem is enormous—just the kind of challenge for a computer. Mattheij writes:

There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth).
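
The price gap Mattheij noticed is easy to put numbers on. The sketch below uses only the per-kilogram prices and the two-metric-ton figure quoted in this article. Treating the whole lot as sellable at a single price is a simplifying assumption, and the variable names are illustrative.

```python
LOT_KG = 2000  # two metric tons

# Approximate prices quoted in the article, in euros per kilogram.
PRICE_PER_KG = {"unsorted bulk": 10, "sets": 40, "rare parts": 100}

for category, price in PRICE_PER_KG.items():
    print(f"{category}: EUR {LOT_KG * price:,}")

# Upper bound on the gain if every brick moved from bulk to set pricing.
max_uplift = LOT_KG * (PRICE_PER_KG["sets"] - PRICE_PER_KG["unsorted bulk"])
print(f"maximum uplift from sorting into sets: EUR {max_uplift:,}")
```

In practice only part of a bulk lot can be reassembled into sets or sold as rare parts, so the real gain would be smaller than this ceiling.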

In the following months, Mattheij built a proof-of-concept sorting system using, of course, LEGO. He broke the problem down into a series of sub-problems (including "feeding LEGO reliably from a hopper is surprisingly hard," one of those facts of nature that will stymie even the best system design). After tinkering with the prototype at length, he expanded it into a surprisingly complex arrangement of conveyer belts (powered by a home treadmill), various pieces of cabinetry, and "copious quantities of crazy glue."

Here's a video showing the current system running at low speed:

The key part of the system was running the bricks past a camera paired with a computer running a neural net-based image classifier. That allows the computer (when sufficiently trained on brick images) to recognize bricks and thus categorize them by color, shape, or other parameters. Remember that as bricks pass by, they can be in any orientation, can be dirty, can even be stuck to other pieces. So having a flexible software system is key to recognizing—in a fraction of a second—what a given brick is, in order to sort it out. When a match is found, a jet of compressed air pops the piece off the conveyer belt and into a waiting bin.

After much experimentation, Mattheij rewrote the software (several times in fact) to accomplish a variety of basic tasks. At its core, the system takes images from a webcam and feeds them to a neural network to do the classification. Of course, the neural net needs to be "trained" by showing it lots of images, and telling it what those images represent. Mattheij's breakthrough was allowing the machine to effectively train itself, with guidance: Running pieces through allows the system to take its own photos, make a guess, and build on that guess. As long as Mattheij corrects the incorrect guesses, he ends up with a decent (and self-reinforcing) corpus of training data. As the machine continues running, it can rack up more training, allowing it to recognize a broad variety of pieces on the fly.
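
The guess-and-correct loop described above can be sketched in a few lines. This is not Mattheij's code: his system uses a neural network on webcam images, while this toy version swaps in a nearest-centroid classifier on made-up two-number feature vectors so that it runs with the standard library alone. The function names, the labels, and the `oracle` function standing in for the human reviewer are all hypothetical.

```python
import random


def nearest_centroid(features, centroids):
    """Return the label whose centroid is closest, or None if untrained."""
    if not centroids:
        return None

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))

    return min(centroids, key=sq_dist)


def fit_centroids(corpus):
    """Average the feature vectors stored for each label."""
    sums, counts = {}, {}
    for features, label in corpus:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def run_stream(stream, human_label):
    """Guess each item, let a human confirm or correct, and retrain."""
    corpus, centroids, corrections = [], {}, []
    for features in stream:
        guess = nearest_centroid(features, centroids)
        truth = human_label(features)      # human reviews the guess
        corrections.append(guess != truth)
        corpus.append((features, truth))   # confirmed or corrected label
        centroids = fit_centroids(corpus)  # retrain on the growing corpus
    return corpus, corrections


def oracle(features):
    """Stand-in for the human reviewer in this toy setup."""
    return "brick" if sum(features) < 5.0 else "plate"


# Toy data: two well-separated clusters of invented feature vectors.
random.seed(0)
centers = {"brick": (0.0, 0.0), "plate": (5.0, 5.0)}
stream = []
for _ in range(200):
    cx, cy = random.choice(list(centers.values()))
    stream.append((random.gauss(cx, 0.5), random.gauss(cy, 0.5)))

corpus, corrections = run_stream(stream, oracle)
print("corrections needed in first 10 items:", sum(corrections[:10]))
print("corrections needed in last 50 items:", sum(corrections[-50:]))
```

The point of the design is that the human only has to fix wrong guesses, and every pass through the machine adds labeled examples, so the correction workload shrinks as the corpus grows.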

Here's another video, focusing on how the pieces move on conveyer belts (running at slow speed so puny humans can follow). You can also see the air jets in action:

In an email interview, Mattheij told Mental Floss that the system currently sorts LEGO bricks into more than 50 categories. It can also be run in a color-sorting mode to bin the parts across 12 color groups. (Thus at present you'd likely do a two-pass sort on the bricks: once for shape, then a separate pass for color.) He continues to refine the system, with a focus on making its recognition abilities faster. At some point down the line, he plans to make the software portion open source. You're on your own as far as building conveyer belts, bins, and so forth.
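
The two-pass idea mentioned above (shape first, then color) amounts to binning twice. A minimal sketch follows, with invented part records standing in for the classifier's output; the `sort_pass` helper and the field names are hypothetical.

```python
from collections import defaultdict


def sort_pass(parts, key):
    """Drop each part into the bin named by the given attribute."""
    bins = defaultdict(list)
    for part in parts:
        bins[part[key]].append(part)
    return dict(bins)


# Hypothetical parts; real input would come from the classifier.
parts = [
    {"shape": "2x4 brick", "color": "red"},
    {"shape": "2x4 brick", "color": "blue"},
    {"shape": "1x2 plate", "color": "red"},
]

# First pass: bin by shape.
by_shape = sort_pass(parts, "shape")

# Second pass: re-run each shape bin, this time binning by color.
by_shape_then_color = {
    shape: sort_pass(group, "color") for shape, group in by_shape.items()
}
print(by_shape_then_color)
```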

Check out Mattheij's writeup in two parts for more information. It starts with an overview of the story, followed up with a deep dive on the software. He's also tweeting about the project (among other things). And if you look around a bit, you'll find bulk LEGO brick auctions online—it's definitely a thing!

Scientists Think They Know How Whales Got So Big

May 24, 2017

[Image credit: iStock.]

It can be difficult to understand how enormous the blue whale—the largest animal to ever exist—really is. The mammal can measure up to 105 feet long, have a tongue that can weigh as much as an elephant, and have a massive, golf cart–sized heart powering a 200-ton frame. But while the blue whale might currently be the Andre the Giant of the sea, it wasn’t always so imposing.

For the majority of the 30 million years that baleen whales (the blue whale is one) have occupied the Earth, the mammals usually topped out at roughly 30 feet in length. It wasn’t until about 3 million years ago that the clade experienced an evolutionary growth spurt, tripling in size. And scientists haven’t had any concrete idea why, Wired reports.

A study published in the journal Proceedings of the Royal Society B might help change that. Researchers examined fossil records and studied phylogenetic models (evolutionary relationships) among baleen whales, and found some evidence that climate change may have been the catalyst for turning the large animals into behemoths.

As the ice ages wore on and oceans were receiving nutrient-rich runoff, the whales encountered an increasing number of krill—the small, shrimp-like creatures that provided a food source—resulting from upwelling waters. The more they ate, the more they grew, and their bodies adapted over time. Their mouths grew larger and their fat stores increased, helping them to fuel longer migrations to additional food-enriched areas. Today blue whales eat up to four tons of krill every day.

If climate change set the ancestors of the blue whale on the path to its enormous size today, the study invites the question of what it might do to them in the future. Changes in ocean currents or temperature could alter the amount of nutrients available to whales, cutting off their food supply. With demand for whale oil in the 1900s having already dented their numbers, scientists are hoping that further shifts in their oceanic ecosystem won’t relegate them to history.

[h/t Wired]
