Why is Wikipedia Down?

If you're in the US, you speak English, and you visit Wikipedia today, you'll see a glimpse of the page you were trying to access, followed by the blackout message above. You'll see similar messages from Google, Reddit, and others. Why? Although each of these sites does a good job of explaining its position, here's a rundown of the issues involved in today's shutdown.

An Extremely Brief History Lesson

In 1998, the US Congress passed the Digital Millennium Copyright Act (DMCA), a bill that, among other things, created a simple system by which copyright owners could request that infringing content be removed from websites hosted in the US. In short, the copyright owner tells the website, "I own this video/photo/song and you have to remove it, or I'll sue you." Then it gets removed, or they go to court. Have you ever seen a YouTube video that's replaced with the red sad-face? That's a DMCA takedown. This system has worked reasonably well for the past decade and a half, although there have been abuses -- sometimes copyright owners request removal of content they don't actually own. But in general, the DMCA has become accepted in the tech community as a reasonable part of doing business online.

The Bills

The DMCA is not enough for some copyright holders, partly because it only applies to US-hosted websites. Citing questionable statistics about the impact of piracy, the Motion Picture Association of America, the US Chamber of Commerce, and others demanded more. That's how we get to the anti-piracy bills currently under consideration.

Two bills are currently being debated in the US Congress: the Stop Online Piracy Act (SOPA) in the House and the Protect IP Act (PIPA) in the Senate. Both were written to curtail online piracy of copyrighted content like movies and music; major backers of the bills include a laundry list of media companies -- in broad strokes, this is record labels and movie studios, though a few tech companies (most infamously GoDaddy) also support the bills. The bills are also heavily supported by the bipartisan committees considering them, although President Obama is against them.

Why today? There were supposed to be hearings on Capitol Hill today about SOPA. The various tech companies involved in today's protest timed their blackout to coincide with the hearings, to draw attention to them. But then the hearings were postponed...and the blackout continued anyway. Here's a snippet from a recent Ars Technica article:

Meanwhile, Rep. Darrell Issa (R-CA), a SOPA opponent, announced Saturday that he is postponing hearings on SOPA's DNS provisions that had been slated for Wednesday, January 18 before his House Oversight and Government Reform Committee.

"While I remain concerned about Senate action on the Protect IP Act, I am confident that flawed legislation will not be taken up by this House," Issa said. "Majority Leader Cantor has assured me that we will continue to work to address outstanding concerns and work to build consensus prior to any anti-piracy legislation coming before the House for a vote."

The net effect of this (likely as Issa intended) is even more attention to the issue, plus time for citizens to become aware of the issues and pressure their Congressional representatives. Indeed, Google and others are hosting petitions and encouraging US citizens to make a fuss.

What's the Problem? Piracy is Bad, Right?

The bills have unintended consequences far beyond stopping piracy -- at least, that's what the opposition says. The main objection is that the bills use a very blunt approach to "stopping piracy," which can be summarized as "breaking the internet" by blocking DNS access to domain names with any infringing content. What Wikipedia is saying (and for the record, I agree with them) is that the entire Wikipedia domain would be blocked for everyone, every time a piece of infringing content was found, under the provisions of SOPA -- and because anyone can post anything to Wikipedia, this could happen a lot. Wikipedia could also be prevented from receiving credit card payments -- which is actually important, because Wikipedia is funded by donations. This is a far cry from today's DMCA regime, where the actual infringing content is removed or a court case occurs, rather than blocking the entire website.

There are many excellent metaphors out there to describe what's wrong with the legislation's proposed techniques, but one written up by commenter TechBear on The Stranger's Blog is particularly easy to follow: "A friend of mine described the draconian measures to shut down internet providers as 'cracking down on mail fraud by arresting postal carriers.'" Indeed, DNS servers are like the mail carriers of the internet.

The other existential objection to SOPA/PIPA is the notion that the bills would stifle innovation online. In a SOPA/PIPA world, every new web service that allows users to post anything would have to live under constant threat of a shutdown. The logical outcome is that you wouldn't build new services that let people post anything. If entrepreneurs are afraid to build new online services, our economy and our culture are under threat.

Opponents of SOPA/PIPA tend to agree that piracy is bad. But they're saying these bills are the wrong way to stop piracy. In other words, stop the people committing mail fraud -- not the mail carriers. This video is a pretty good explanation of the situation:

What Do Supporters Say?

I know this is utterly reductionist, but the gist of it is "Nuh-uh." To be a little more fair, proponents of the bill (there are many in Congress and in industry -- and it's a very bipartisan group) say that they're trying to protect jobs and the economy. They suggest that piracy costs jobs, hurting the economy, and we must do something to stop it -- and specifically SOPA/PIPA is necessary to target not just domestic pirates, but international sites as well (this is code for "The Pirate Bay").

The New York Times ran this quote today:

“The bill will not harm Wikipedia, domestic blogs or social network sites,” said Representative Lamar Smith, Republican of Texas and a primary sponsor of the House bill.

Oh. I suppose Wikipedia, Google, Reddit, et al. are all wrong about the effects of this legislation on their businesses.

What Happens Next?

We wait for a vote. Today, there are at least two Wikipedia pages that are still up -- those on SOPA and PIPA. Many opponents of SOPA/PIPA are promoting the OPEN Act instead. A list of striking sites is available from SOPA Strike. Read up on some more media coverage of the issue from The Week, or read my previous article on this stuff, What’s Wrong With PROTECT IP and SOPA?

Update, 20 January 2012: Just two days after the blackout protest, SOPA has been yanked and the PIPA vote has been postponed.

iStock // Ekaterina Minaeva
Man Buys Two Metric Tons of LEGO Bricks; Sorts Them Via Machine Learning

Jacques Mattheij made a small, but awesome, mistake. He went on eBay one evening and bid on a bunch of bulk LEGO brick auctions, then went to sleep. Upon waking, he discovered that he was the high bidder on many of them, and was now the proud owner of two metric tons of LEGO bricks. (That's about 4,400 pounds.) He wrote, "[L]esson 1: if you win almost all bids you are bidding too high."

Mattheij had noticed that bulk, unsorted bricks sell for something like €10/kilogram, whereas sets are roughly €40/kg and rare parts go for up to €100/kg. Much of the value of the bricks is in their sorting. If he could reduce the entropy of these bins of unsorted bricks, he could make a tidy profit. While many people do this work by hand, the problem is enormous—just the kind of challenge for a computer. Mattheij writes:

There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth).
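To put rough numbers on the price gap described above, here is a back-of-the-envelope sketch. It assumes the haul weighs exactly 2,000 kg and that the quoted per-kilogram prices apply uniformly to the whole pile, which they almost certainly would not in practice. The names are made up for illustration and are not from Mattheij's writeup.

```python
# Rough per-kilogram prices quoted in the article (euros).
PRICE_PER_KG_EUR = {"bulk_unsorted": 10, "sets": 40, "rare_parts": 100}

# Assumption: "two metric tons" taken as exactly 2,000 kg.
HAUL_KG = 2000
LB_PER_KG = 2.20462


def value_at(price_per_kg, mass_kg=HAUL_KG):
    """Value of the haul if every kilogram sold at the given price."""
    return price_per_kg * mass_kg


# Upper-bound values; real lots mix common and rare parts.
values = {name: value_at(price) for name, price in PRICE_PER_KG_EUR.items()}

if __name__ == "__main__":
    print(f"Haul: {HAUL_KG} kg, about {round(HAUL_KG * LB_PER_KG)} lb")
    for name, euros in values.items():
        print(f"{name}: EUR {euros}")
```

Even under these generous assumptions, the point is only the ratio: sorted sets fetch roughly four times the bulk price per kilogram, which is where the incentive to sort comes from.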

In the following months, Mattheij built a proof-of-concept sorting system using, of course, LEGO. He broke the problem down into a series of sub-problems (including "feeding LEGO reliably from a hopper is surprisingly hard," one of those facts of nature that will stymie even the best system design). After tinkering with the prototype at length, he expanded it into a surprisingly complex arrangement of conveyer belts (powered by a home treadmill), various pieces of cabinetry, and "copious quantities of crazy glue."

Here's a video showing the current system running at low speed:

The key part of the system was running the bricks past a camera paired with a computer running a neural net-based image classifier. That allows the computer (when sufficiently trained on brick images) to recognize bricks and thus categorize them by color, shape, or other parameters. Remember that as bricks pass by, they can be in any orientation, can be dirty, can even be stuck to other pieces. So having a flexible software system is key to recognizing—in a fraction of a second—what a given brick is, in order to sort it out. When a match is found, a jet of compressed air pops the piece off the conveyer belt and into a waiting bin.

After much experimentation, Mattheij rewrote the software (several times in fact) to accomplish a variety of basic tasks. At its core, the system takes images from a webcam and feeds them to a neural network to do the classification. Of course, the neural net needs to be "trained" by showing it lots of images, and telling it what those images represent. Mattheij's breakthrough was allowing the machine to effectively train itself, with guidance: Running pieces through allows the system to take its own photos, make a guess, and build on that guess. As long as Mattheij corrects the incorrect guesses, he ends up with a decent (and self-reinforcing) corpus of training data. As the machine continues running, it can rack up more training, allowing it to recognize a broad variety of pieces on the fly.
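The guess-then-correct loop described above can be sketched in a few lines. This is a toy illustration, not Mattheij's software: a nearest-neighbor lookup over made-up two-number "features" stands in for the neural network, and the `review` callback stands in for the human who fixes wrong guesses. All function and label names here are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Features = Tuple[float, ...]
Example = Tuple[Features, str]


def nearest_label(corpus: List[Example], features: Features,
                  default: str = "unknown") -> str:
    """Stand-in for the classifier: label of the closest known example."""
    if not corpus:
        return default

    def squared_distance(example: Example) -> float:
        return sum((a - b) ** 2 for a, b in zip(example[0], features))

    return min(corpus, key=squared_distance)[1]


def bootstrap(corpus: List[Example], stream: Iterable[Features],
              review: Callable[[Features, str], str]) -> int:
    """Guess each piece, let the reviewer correct it, and grow the corpus.

    Returns the number of guesses the reviewer had to correct.
    """
    corrections = 0
    for features in stream:
        guess = nearest_label(corpus, features)
        label = review(features, guess)  # human confirms or overrides
        if label != guess:
            corrections += 1
        corpus.append((features, label))  # every piece becomes training data
    return corrections


def demo_review(features: Features, guess: str) -> str:
    """Hypothetical reviewer: knows the true label from the piece's width."""
    return "2x4" if features[0] > 1.5 else "1x1"


corpus: List[Example] = []
stream = [(2.0, 4.0), (2.1, 3.9), (1.0, 1.0), (1.1, 0.9), (2.0, 4.1)]
corrections = bootstrap(corpus, stream, demo_review)
```

In this toy run the reviewer only has to step in the first time each kind of piece shows up. After that the growing corpus gets the guess right, which is the self-reinforcing effect the article describes.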

Here's another video, focusing on how the pieces move on conveyer belts (running at slow speed so puny humans can follow). You can also see the air jets in action:

In an email interview, Mattheij told Mental Floss that the system currently sorts LEGO bricks into more than 50 categories. It can also be run in a color-sorting mode to bin the parts across 12 color groups. (Thus at present you'd likely do a two-pass sort on the bricks: once for shape, then a separate pass for color.) He continues to refine the system, with a focus on making its recognition abilities faster. At some point down the line, he plans to make the software portion open source. You're on your own as far as building conveyer belts, bins, and so forth.
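The two-pass idea mentioned above (shape first, then color) amounts to grouping twice. Here is a minimal sketch, assuming each piece has already been recognized and is represented as a small dictionary. The field names and sample categories are invented for the example and are not the sorter's actual categories.

```python
from collections import defaultdict


def sort_pass(pieces, key):
    """One trip down the belt: drop each piece into the bin chosen by key."""
    bins = defaultdict(list)
    for piece in pieces:
        bins[key(piece)].append(piece)
    return dict(bins)


def two_pass_sort(pieces):
    """First pass bins by shape; second pass re-runs each bin by color."""
    by_shape = sort_pass(pieces, key=lambda p: p["shape"])
    return {
        shape: sort_pass(group, key=lambda p: p["color"])
        for shape, group in by_shape.items()
    }


# Hypothetical pieces, as if already classified by the camera.
pieces = [
    {"shape": "brick", "color": "red"},
    {"shape": "plate", "color": "red"},
    {"shape": "brick", "color": "blue"},
    {"shape": "brick", "color": "red"},
]
sorted_bins = two_pass_sort(pieces)
```

With 50-odd shape categories and 12 color groups, two passes could in principle yield several hundred distinct bins without the machine ever needing that many physical bins at once.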

Check out Mattheij's writeup in two parts for more information. It starts with an overview of the story, followed up with a deep dive on the software. He's also tweeting about the project (among other things). And if you look around a bit, you'll find bulk LEGO brick auctions online—it's definitely a thing!

Cs California, Wikimedia Commons // CC BY-SA 3.0
How Experts Say We Should Stop a 'Zombie' Infection: Kill It With Fire

Scientists are known for being pretty cautious people. But sometimes, even the most careful of us need to burn some things to the ground. Immunologists have proposed a plan to burn large swaths of parkland in an attempt to wipe out disease, as The New York Times reports. They described the problem in the journal Microbiology and Molecular Biology Reviews.

Chronic wasting disease (CWD) is a gruesome infection that’s been destroying deer and elk herds across North America. Like bovine spongiform encephalopathy (BSE, better known as mad cow disease) and Creutzfeldt-Jakob disease, CWD is caused by damaged, contagious little proteins called prions. Although it's been half a century since CWD was first discovered, scientists are still scratching their heads about how it works, how it spreads, and whether, like BSE, it could someday infect humans.

Paper co-author Mark Zabel, of the Prion Research Center at Colorado State University, says animals with CWD fade away slowly at first, losing weight and starting to act kind of spacey. But "they’re not hard to pick out at the end stage," he told The New York Times. "They have a vacant stare, they have a stumbling gait, their heads are drooping, their ears are down, you can see thick saliva dripping from their mouths. It’s like a true zombie disease."

CWD has already been spotted in 24 U.S. states. Some herds are already 50 percent infected, and that number is only growing.

Prion illnesses often travel from one infected individual to another, but CWD’s expansion was so rapid that scientists began to suspect it had more than one way of finding new animals to attack.

Sure enough, it did. As it turns out, the CWD prion doesn’t go down with its host-animal ship. Infected animals shed the prion in their urine, feces, and drool. Long after the sick deer has died, others can still contract CWD from the leaves they eat and the grass in which they stand.

As if that’s not bad enough, CWD has another trick up its sleeve: spontaneous generation. That is, it doesn’t take much damage to twist a healthy prion into a zombifying pathogen. The illness just pops up.

There are some treatments, including immersing infected tissue in an ozone bath. But that won't help when the problem is literally smeared across the landscape. "You cannot treat half of the continental United States with ozone," Zabel said.

And so, to combat this many-pronged assault on our wildlife, Zabel and his colleagues are getting aggressive. They recommend a controlled burn of infected areas of national parks in Colorado and Arkansas—a pilot study to determine if fire will be enough.

"If you eliminate the plants that have prions on the surface, that would be a huge step forward," he said. "I really don’t think it’s that crazy."

[h/t The New York Times]